Home > mailing lists

[HACKERS] Performance degradation in TPC-H Q18 - Mailing list pgsql-hackers

From	Kuntal Ghosh
Subject	[HACKERS] Performance degradation in TPC-H Q18
Date	February 22, 2017 08:53:07
Msg-id	CAGz5QC+=fNTYgzMLTBUNeKt6uaWZFXJbkB5+7oWm-n9DwVxcLA@mail.gmail.com Whole thread Raw
Responses	Re: [HACKERS] Performance degradation in TPC-H Q18
List	pgsql-hackers

Tree view

Hello everyone,

While conducting the experiments for parallelism, Rafia came across a
hang in Q18 when plan uses partial and finalize hash aggregate. This
could be seen on both scale factors - 20 and 300, on setting work_mem
high enough so that the query uses hash aggregate. It seems that
commit b81b5a96f424531b97cdd1dba97d9d1b9c9d372e does not solve the
issue completely.

I've tested on scale factor 10 and work_mem 100mb and the issue is
reproducible. Below is the simplified part of the query which uses the
hash aggregate.

tpch10GB=# explain select l_orderkey from lineitem group by l_orderkey;
                                          QUERY PLAN
------------------------------------------------------------------------------------------------Finalize HashAggregate
(cost=1612872.73..1616801.32rows=392859 width=8)  Group Key: l_orderkey  ->  Gather  (cost=1528408.04..1610908.43
rows=785718width=8)        Workers Planned: 2        ->  Partial HashAggregate  (cost=1527408.04..1531336.63
 
rows=392859 width=8)              Group Key: l_orderkey              ->  Parallel Seq Scan on lineitem
(cost=0.00..1464922.43 rows=24994243 width=8)
(7 rows)

I've also tried to execute the same query with a different column name.

tpch10GB=# explain analyze select l_suppkey from lineitem group by l_suppkey;
             QUERY PLAN
 

-----------------------------------------------------------------------------------------------------------------------------------------------------Finalize
HashAggregate (cost=1549511.80..1550493.37 rows=98157 width=8)               (actual time=22363.901..22380.361
rows=100000loops=1)  Group Key: l_suppkey  ->  Gather  (cost=1528408.04..1549021.01 rows=196314 width=8)
(actualtime=22107.387..22244.848 rows=300000 loops=1)        Workers Planned: 2        Workers Launched: 2        ->
PartialHashAggregate  (cost=1527408.04..1528389.61
 
rows=98157 width=8)                   (actual time=22100.776..22135.063 rows=100000 loops=3)              Group Key:
l_suppkey             ->  Parallel Seq Scan on lineitem
 
(cost=0.00..1464922.43 rows=24994243 width=8)                           (actual time=0.846..10664.258 rows=19995351
loops=3)Planningtime: 0.171 msExecution time: 22393.030 ms
 
(10 rows)

And, it finished in time although the stats in both the plans are
exactly same. Hence, the problem is with the query itself. As Robert
suggested offline, I've produced the perf report for the query.

99.44%     0.01%  postgres  postgres           [.] LookupTupleHashEntry
74.31%    53.55%  postgres  postgres           [.] tuplehash_insert
17.05%    17.04%  postgres  libc-2.17.so       [.] __memcpy_ssse3_back
11.06%    11.05%  postgres  postgres           [.] tuplehash_prev
10.81%    10.80%  postgres  postgres           [.] tuplehash_next

It's clear from the perf report that something terrible is going on in
the simple hash insert. I've enabled the logging for hash
ops(SH_STAT()) during hash growing phase.

2017-02-21 16:42:39.634 IST [9372] LOG:  size: 1048576, members:
838860, filled: 0.799999,
total chain: 2008236860, max chain: 66743, avg chain: 2394.007176,
total_collisions: 278348, max_collisions: 8, avg_collisions: 0.331817
2017-02-21 16:42:39.634 IST [9372] STATEMENT:  explain analyze select
l_orderkey from lineitem group by l_orderkey;

2017-02-21 16:42:39.741 IST [9372] LOG:  size: 2097152, members:
838860, filled: 0.400000,
total chain: 788175, max chain: 124, avg chain: 0.939579,
total_collisions: 216052, max_collisions: 6, avg_collisions: 0.257554
2017-02-21 16:42:39.741 IST [9372] STATEMENT:  explain analyze select
l_orderkey from lineitem group by l_orderkey;

The value of max chain length is really high which hampers the query's
performance. To further understand the scenario, I've taken the
following results:

tpch10GB=# select count(distinct l_orderkey) from lineitem; count
----------15000000

tpch10GB=# select correlation from pg_stats where tablename='lineitem'
and attname='l_orderkey';correlation
-------------          1

tpch10GB=# select count(distinct l_suppkey) from lineitem;count
--------100000

tpch10GB=# select correlation from pg_stats where tablename='lineitem'
and attname='l_suppkey';correlation
------------- 0.00900259

It seems that l_orderkey produces a pretty bad case for hashing with
linear probing and a bad choice of hash function(as linear probing is
sensitive to elements with nearby hash codes). In [1], Andres's
proposed a hack to resize the hashtable when the chain is growing
quickly. Below is a comment from the patch explaining the objective:

+            /*
+             * To avoid negative consequences from overly imbalanced
+             * hashtables, grow the hashtable if collisions lead to large
+             * runs. The most likely cause of such imbalance is filling a
+             * (currently) small table, from a currently big one, in
+             * hash-table order.
+             *
+             * FIXME: compute boundaries in a more principled manner.
+             */

After applying the patch, TPC-H Q18 and the simplified query, both
completes in time. Although the patch works in this scenario, we need
a smarter way to trigger the hash expansion. The problem with the
patch is that it triggers a hash expansion even when the filled
percentage is pretty low, causing unnecessary memory consumption.

Thoughts?

[1] https://www.postgresql.org/message-id/20161123083351.5vramz52nmdokhzz%40alap3.anarazel.de
0001-Resize-simplehash.h-table-in-case-of-long-runs.patch


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Jim Nasby
Date: 22 February 2017, 08:43:16
Subject: Re: [HACKERS] mat views stats

From: Jim Nasby
Date: 22 February 2017, 08:55:53
Subject: Re: [HACKERS] Change in "policy" on dump ordering?

[HACKERS] Performance degradation in TPC-H Q18 - Mailing list pgsql-hackers

Previous

Next