Re: BUG #18927: Poor query plan - HashAggregate is more than twice as fast but at a higher cost - Mailing list pgsql-bugs
From | Dilip Kumar |
---|---|
Subject | Re: BUG #18927: Poor query plan - HashAggregate is more than twice as fast but at a higher cost |
Date | |
Msg-id | CAFiTN-tBLWU8GN1x-oFU3YDXdrE=wBFjCcboN_ETOsvSUnSRaQ@mail.gmail.com Whole thread Raw |
In response to | BUG #18927: Poor query plan - HashAggregate is more than twice as fast but at a higher cost (PG Bug reporting form <noreply@postgresql.org>) |
Responses |
Re: Re: BUG #18927: Poor query plan - HashAggregate is more than twice as fast but at a higher cost
|
List | pgsql-bugs |
On Wed, May 14, 2025 at 5:55 PM 萧鸿骏 <23031212454@stu.xidian.edu.cn> wrote: > > The following method can also trigger a better plan: > > database0=# set hash_mem_multiplier = 4; > SET > database0=# EXPLAIN ANALYZE SELECT t0.c0, t2.c0, t5.c0, t4.c0 FROM ONLY t0, t4 CROSS JOIN ONLY t5 JOIN ONLY t2 ON lower_inc(((t2.c0)*(t5.c0)))GROUP BY t0.c0, t2.c0, t5.c0, t4.c0; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------ > HashAggregate (cost=8854.70..10204.70 rows=135000 width=53) (actual time=196.753..284.803 rows=184320 loops=1) > Group Key: t0.c0, t2.c0, t5.c0, t4.c0 > Batches: 5 Memory Usage: 16433kB Disk Usage: 11328kB > -> Nested Loop (cost=0.00..7504.70 rows=135000 width=53) (actual time=0.046..52.982 rows=288000 loops=1) > -> Nested Loop (cost=0.00..304.70 rows=4500 width=40) (actual time=0.039..3.358 rows=9600 loops=1) > -> Nested Loop (cost=0.00..34.70 rows=180 width=27) (actual time=0.032..0.444 rows=384 loops=1) > Join Filter: lower_inc((t2.c0 * t5.c0)) > Rows Removed by Join Filter: 156 > -> Seq Scan on t2 (cost=0.00..1.20 rows=20 width=14) (actual time=0.017..0.022 rows=20 loops=1) > -> Seq Scan on t5 (cost=0.00..1.27 rows=27 width=13) (actual time=0.002..0.004 rows=27 loops=20) > -> Seq Scan on t4 (cost=0.00..1.25 rows=25 width=13) (actual time=0.001..0.004 rows=25 loops=384) > -> Seq Scan on t0 (cost=0.00..1.30 rows=30 width=13) (actual time=0.001..0.002 rows=30 loops=9600) > Planning Time: 0.109 ms > Execution Time: 291.546 ms > (14 rows) By setting a higher value for hash_mem_multiplier, you're essentially informing the planner that more memory is available for hash-based operations. During cost estimation, the planner compares the costs of HashAggregate versus GroupAggregate. With a higher memory limit for building the hash table, the planner determines that HashAggregate is more cost-effective and therefore selects it over GroupAggregate. However, during further investigation, I hadn't increased hash_mem_multiplier. Instead, I forced the use of hash aggregation by disabling other options with SET enable_incremental_sort = off; and SET enable_sort = off;. Even without adjusting hash_mem_multiplier, the HashAggregate was significantly cheaper than GroupAggregate. Are these points that we are overestimating the cost of the hash aggregate? Not sure, might need to dig further down, but one thing to note is that the planner cost shows a hash aggregate with 2x costlier than the group aggregate[1][2], whereas the actual execution shows that the hash aggregate is 4x faster than the group aggregate. Another point to be noted is that as soon as we set the hash_mem_multiplier to 4, then the HashAggregate planner cost drops from 20k to 4k [3], then I suspect this could be related to we are overestimating for the disk access as our default random page cost is quite high and might not be suited well for the faster disk, and as soon as I changed the random_page_cost to 1 it chose HashAggregate by default. Summary: IMHO, this doesn't look like an issue, instead, we just need to adjust the random page cost, which is more suited for the system under test. [1] Explain, analyze with grouping, aggregate: Group (cost=332.55..13010.57 rows=135000 width=53) (actual time=104.887..2426.446 rows=184320.00 loops=1) Group Key: t0.c0, t2.c0, t5.c0, t4.c0 Buffers: shared hit=8 -> Incremental Sort (cost=332.55..11660.57 rows=135000 width=53) (actual time=104.884..2222.501 rows=288000.00 loops=1) Sort Key: t0.c0, t2.c0, t5.c0, t4.c0 Presorted Key: t0.c0 Full-sort Groups: 30 Sort Method: quicksort Average Memory: 29kB Peak Memory: 29kB Pre-sorted Groups: 30 Sort Method: quicksort Average Memory: 1051kB Peak Memory: 1054kB Buffers: shared hit=8 -> Nested Loop (cost=0.14..1780.87 rows=135000 width=53) (actual time=0.223..77.066 rows=288000.00 loops=1) Buffers: shared hit=5 -> Index Only Scan using i0 on t0 (cost=0.14..12.59 rows=30 width=13) (actual time=0.075..0.217 rows=30.00 loops=1) Heap Fetches: 30 Index Searches: 1 Buffers: shared hit=2 -> Materialize (cost=0.00..92.03 rows=4500 width=40) (actual time=0.005..0.819 rows=9600.00 loops=30) Storage: Memory Maximum Storage: 873kB Buffers: shared hit=3 -> Nested Loop (cost=0.00..69.53 rows=4500 width=40) (actual time=0.132..4.110 rows=9600.00 loops=1) Buffers: shared hit=3 -> Nested Loop (cost=0.00..11.97 rows=180 width=27) (actual time=0.108..0.812 rows=384.00 loops=1) Join Filter: lower_inc((t2.c0 * t5.c0)) Rows Removed by Join Filter: 156 Buffers: shared hit=2 -> Seq Scan on t5 (cost=0.00..1.27 rows=27 width=13) (actual time=0.026..0.034 rows=27.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.30 rows=20 width=14) (actual time=0.001..0.003 rows=20.00 loops=27) Storage: Memory Maximum Storage: 17kB Buffers: shared hit=1 -> Seq Scan on t2 (cost=0.00..1.20 rows=20 width=14) (actual time=0.009..0.013 rows=20.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.38 rows=25 width=13) (actual time=0.000..0.002 rows=25.00 loops=384) Storage: Memory Maximum Storage: 18kB Buffers: shared hit=1 -> Seq Scan on t4 (cost=0.00..1.25 rows=25 width=13) (actual time=0.019..0.023 rows=25.00 loops=1) Buffers: shared hit=1 Planning: Buffers: shared hit=104 Planning Time: 2.162 ms Execution Time: 2449.420 ms [2] Explain, analyze with Hash aggregate: (SET enable_incremental_sort = off; and SET enable_sort = off;) HashAggregate (cost=16355.28..20342.00 rows=135000 width=53) (actual time=485.060..690.215 rows=184320.00 loops=1) Group Key: t0.c0, t2.c0, t5.c0, t4.c0 Planned Partitions: 4 Batches: 5 Memory Usage: 8249kB Disk Usage: 15568kB Buffers: shared hit=4, temp read=1739 written=3234 -> Nested Loop (cost=0.00..1758.41 rows=135000 width=53) (actual time=0.191..56.042 rows=288000.00 loops=1) Buffers: shared hit=4 -> Nested Loop (cost=0.00..69.53 rows=4500 width=40) (actual time=0.164..2.533 rows=9600.00 loops=1) Buffers: shared hit=3 -> Nested Loop (cost=0.00..11.97 rows=180 width=27) (actual time=0.127..0.602 rows=384.00 loops=1) Join Filter: lower_inc((t2.c0 * t5.c0)) Rows Removed by Join Filter: 156 Buffers: shared hit=2 -> Seq Scan on t5 (cost=0.00..1.27 rows=27 width=13) (actual time=0.035..0.056 rows=27.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.30 rows=20 width=14) (actual time=0.001..0.003 rows=20.00 loops=27) Storage: Memory Maximum Storage: 17kB Buffers: shared hit=1 -> Seq Scan on t2 (cost=0.00..1.20 rows=20 width=14) (actual time=0.011..0.015 rows=20.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.38 rows=25 width=13) (actual time=0.000..0.001 rows=25.00 loops=384) Storage: Memory Maximum Storage: 18kB Buffers: shared hit=1 -> Seq Scan on t4 (cost=0.00..1.25 rows=25 width=13) (actual time=0.026..0.031 rows=25.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.45 rows=30 width=13) (actual time=0.000..0.001 rows=30.00 loops=9600) Storage: Memory Maximum Storage: 18kB Buffers: shared hit=1 -> Seq Scan on t0 (cost=0.00..1.30 rows=30 width=13) (actual time=0.019..0.024 rows=30.00 loops=1) Buffers: shared hit=1 Planning: Buffers: shared hit=103 Planning Time: 2.129 ms Execution Time: 698.681 ms [3] HashAggregate (cost=3108.41..4458.41 rows=135000 width=53) (actual time=489.975..612.891 rows=184320.00 loops=1) Group Key: t0.c0, t2.c0, t5.c0, t4.c0 Batches: 5 Memory Usage: 16441kB Disk Usage: 7504kB Buffers: shared hit=4, temp read=790 written=1523 -> Nested Loop (cost=0.00..1758.41 rows=135000 width=53) (actual time=0.334..55.897 rows=288000.00 loops=1) Buffers: shared hit=4 -> Nested Loop (cost=0.00..69.53 rows=4500 width=40) (actual time=0.302..2.675 rows=9600.00 loops=1) Buffers: shared hit=3 -> Nested Loop (cost=0.00..11.97 rows=180 width=27) (actual time=0.254..0.699 rows=384.00 loops=1) Join Filter: lower_inc((t2.c0 * t5.c0)) Rows Removed by Join Filter: 156 Buffers: shared hit=2 -> Seq Scan on t5 (cost=0.00..1.27 rows=27 width=13) (actual time=0.111..0.121 rows=27.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.30 rows=20 width=14) (actual time=0.002..0.004 rows=20.00 loops=27) Storage: Memory Maximum Storage: 17kB Buffers: shared hit=1 -> Seq Scan on t2 (cost=0.00..1.20 rows=20 width=14) (actual time=0.033..0.039 rows=20.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.38 rows=25 width=13) (actual time=0.000..0.001 rows=25.00 loops=384) Storage: Memory Maximum Storage: 18kB Buffers: shared hit=1 -> Seq Scan on t4 (cost=0.00..1.25 rows=25 width=13) (actual time=0.035..0.053 rows=25.00 loops=1) Buffers: shared hit=1 -> Materialize (cost=0.00..1.45 rows=30 width=13) (actual time=0.000..0.001 rows=30.00 loops=9600) Storage: Memory Maximum Storage: 18kB Buffers: shared hit=1 -> Seq Scan on t0 (cost=0.00..1.30 rows=30 width=13) (actual time=0.023..0.030 rows=30.00 loops=1) Buffers: shared hit=1 Planning Time: 0.763 ms Execution Time: 622.127 ms (31 rows) -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
pgsql-bugs by date: