Macro customizable hashtable / bitmapscan & aggregation perf - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Macro customizable hashtable / bitmapscan & aggregation perf |
Date | |
Msg-id | 20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de Whole thread Raw |
Responses |
Re: Macro customizable hashtable / bitmapscan & aggregation perf
Re: Macro customizable hashtable / bitmapscan & aggregation perf |
List | pgsql-hackers |
Hi, As previously mentioned, dynahash isn't particularly fast. In many cases that doesn't particularly matter. But e.g. hash aggregation and bitmap index scans are quite sensitive to hash performance. The biggest problems with dynahash (also discussed at [1]) are a) two level structure doubling the amount of indirect lookups b) indirect function calls c) using separate chaining based conflict resolution d) being too general. In the post referenced above I'd implemented an open-coded hashtable addressing these issues for the tidbitmap.c case, with quite some success. It's not particularly desirable to have various slightly differing hash-table implementations across the backend though. The only solution I could think of, that preserves the efficiency, is to have a hash-table implementation which is customizable to the appropriate types et, via macros. In the attached patch I've attached simplehash.h, which can be customized by a bunch of macros, before being inlined. There's also a patch using this for tidbitmap.c and nodeAgg/nodeSubplan/... via execGrouping.c. To show the motivation: Bitmapscan: before: tpch[22005][1]=# EXPLAIN ANALYZE SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate<= '1996-12-31'::date); ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Aggregate (cost=942330.27..942330.28 rows=1 width=8) (actual time=5283.026..5283.026 rows=1 loops=1) │ │ -> Bitmap Heap Scan on lineitem (cost=193670.02..919511.38 rows=9127557 width=8) (actual time=3041.072..4436.569 rows=9113219loops=1) │ │ Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date)) │ │ Heap Blocks: exact=585542 │ │ -> Bitmap Index Scan on i_l_shipdate (cost=0.00..191388.13 rows=9127557 width=0) (actual time=2807.246..2807.246rows=9113219 loops=1) │ │ Index Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date)) │ │ Planning time: 0.077 ms │ │ Execution time: 5284.038 ms │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (8 rows) after: tpch[21953][1]=# EXPLAIN ANALYZE SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate<= '1996-12-31'::date); ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Aggregate (cost=942330.27..942330.28 rows=1 width=8) (actual time=3431.602..3431.602 rows=1 loops=1) │ │ -> Bitmap Heap Scan on lineitem (cost=193670.02..919511.38 rows=9127557 width=8) (actual time=1158.783..2588.357 rows=9113219loops=1) │ │ Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date)) │ │ Heap Blocks: exact=585542 │ │ -> Bitmap Index Scan on i_l_shipdate (cost=0.00..191388.13 rows=9127557 width=0) (actual time=958.341..958.341rows=9113219 loops=1) │ │ Index Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date)) │ │ Planning time: 0.070 ms │ │ Execution time: 3435.259 ms │ └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (8 rows) Hash-Agg: before: tpch[22005][1]=# EXPLAIN (ANALYZE, TIMING OFF) SELECT l_partkey, SUM(l_extendedprice) FROM lineitem GROUP BY 1 ORDER BY 2DESC LIMIT 10; ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Limit (cost=1069653.05..1069653.08 rows=10 width=12) (actual rows=10 loops=1) │ │ -> Sort (cost=1069653.05..1072083.33 rows=972112 width=12) (actual rows=10 loops=1) │ │ Sort Key: (sum(l_extendedprice)) DESC │ │ Sort Method: top-N heapsort Memory: 25kB │ │ -> HashAggregate (cost=1038924.94..1048646.06 rows=972112 width=12) (actual rows=1000000 loops=1) │ │ Group Key: l_partkey │ │ -> Seq Scan on lineitem (cost=0.00..888925.96 rows=29999796 width=12) (actual rows=29999795 loops=1) │ │ Planning time: 0.210 ms │ │ Execution time: 20887.906 ms │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (9 rows) after: tpch[22342][1]=# EXPLAIN (ANALYZE, TIMING OFF) SELECT l_partkey, SUM(l_extendedprice) FROM lineitem GROUP BY 1 ORDER BY 2DESC LIMIT 10; ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Limit (cost=1069653.05..1069653.08 rows=10 width=12) (actual rows=10 loops=1) │ │ -> Sort (cost=1069653.05..1072083.33 rows=972112 width=12) (actual rows=10 loops=1) │ │ Sort Key: (sum(l_extendedprice)) DESC │ │ Sort Method: top-N heapsort Memory: 25kB │ │ -> HashAggregate (cost=1038924.94..1048646.06 rows=972112 width=12) (actual rows=1000000 loops=1) │ │ Group Key: l_partkey │ │ -> Seq Scan on lineitem (cost=0.00..888925.96 rows=29999796 width=12) (actual rows=29999795 loops=1) │ │ Planning time: 0.104 ms │ │ Execution time: 14617.481 ms │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (9 rows) The hash-table performance difference itself is bigger in both cases, but other things, notably slot deforming and expression evaluation, becomes the bottleneck. Does anybody have a better idea how to approach the hash-table problem? While it'd be nice not to resort to such macro-magic, I can't think of an alternative short of switching to C++ and using templates. Currently this is used like: #define SH_PREFIX pagetable #define SH_KEYTYPE BlockNumber #define SH_KEY blockno #define SH_CONTAINS PagetableEntry #define SH_HASH_KEY(tb, key) hash_blockno(key) #define SH_EQUAL(tb, a, b) a == b #define SH_SCOPE static #define SH_DEFINE #define SH_DECLARE #include "lib/simplehash.h" which then creates functions like pagetable_create/insert/lookup/delete/... I strongly suspect there are some other cases that could benefit from another hash-table implementation. Particularly catcache.c seems like a candidate (instead of it's hand-rolled stuff, which frequently shows up in profiles). Note that these patches are *NOT* in a shape for in-detail review. I'd first like to know what people think about the general approach, and about better alternatives. Also note that some queries in the regression test change their result, because the ordering is unspecified... Greetings, Andres Freund [1] http://archives.postgresql.org/message-id/20160714030616.dvlkboz6cw555ixb%40alap3.anarazel.de
Attachment
pgsql-hackers by date: