Re: Speeding up aggregates - Mailing list pgsql-performance

From:           Joe Conway
Subject:        Re: Speeding up aggregates
Date:
Msg-id:         3DF39F8B.4050801@joeconway.com
In response to: Re: Speeding up aggregates (Tom Lane <tgl@sss.pgh.pa.us>)
Responses:      Re: Speeding up aggregates
List:           pgsql-performance
Tom Lane wrote:
> Note that even though there's no SORT, the sort_mem setting is used
> to determine the allowable hashtable size, so a too-small sort_mem
> might discourage the planner from selecting hashed aggregation.
> Use EXPLAIN to see which query plan gets chosen.

Just to follow up on my last post, I did indeed find that bumping up
sort_mem caused a switch back to HashAggregate, and a big improvement:

parts=# show sort_mem ;
 sort_mem
----------
 8192
(1 row)

parts=# set sort_mem to 32000;
SET
parts=# show sort_mem ;
 sort_mem
----------
 32000
(1 row)

parts=# explain analyze select i.part_id, sum(w.qty_oh) as total_oh
        from inv i, iwhs w
        where i.part_id = w.part_id
        group by i.part_id
        having sum(w.qty_oh) > 0;
                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=5254.46..5432.10 rows=35528 width=36) (actual time=1286.89..1399.36 rows=4189 loops=1)
   Filter: (sum(qty_oh) > 0::double precision)
   ->  Hash Join  (cost=1319.10..4710.31 rows=72553 width=36) (actual time=163.36..947.54 rows=72548 loops=1)
         Hash Cond: ("outer".part_id = "inner".part_id)
         ->  Seq Scan on iwhs w  (cost=0.00..2121.53 rows=72553 width=22) (actual time=0.01..266.20 rows=72553 loops=1)
         ->  Hash  (cost=1230.28..1230.28 rows=35528 width=14) (actual time=162.70..162.70 rows=0 loops=1)
               ->  Seq Scan on inv i  (cost=0.00..1230.28 rows=35528 width=14) (actual time=0.04..88.98 rows=35528 loops=1)
 Total runtime: 1443.93 msec
(8 rows)

parts=# set sort_mem to 8192;
SET
parts=# explain analyze select i.part_id, sum(w.qty_oh) as total_oh
        from inv i, iwhs w
        where i.part_id = w.part_id
        group by i.part_id
        having sum(w.qty_oh) > 0;
                                                             QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=11111.93..12015.10 rows=35528 width=36) (actual time=2836.98..3261.66 rows=4189 loops=1)
   Filter: (sum(qty_oh) > 0::double precision)
   ->  Sort  (cost=11111.93..11293.31 rows=72553 width=36) (actual time=2836.73..2937.78 rows=72548 loops=1)
         Sort Key: i.part_id
         ->  Hash Join  (cost=1319.10..5254.45 rows=72553 width=36) (actual time=155.42..1258.40 rows=72548 loops=1)
               Hash Cond: ("outer".part_id = "inner".part_id)
               ->  Seq Scan on iwhs w  (cost=0.00..2121.53 rows=72553 width=22) (actual time=0.01..308.57 rows=72553 loops=1)
               ->  Hash  (cost=1230.28..1230.28 rows=35528 width=14) (actual time=155.19..155.19 rows=0 loops=1)
                     ->  Seq Scan on inv i  (cost=0.00..1230.28 rows=35528 width=14) (actual time=0.02..86.82 rows=35528 loops=1)
 Total runtime: 3281.75 msec
(10 rows)

So when it gets used, HashAggregate has provided a factor of two
improvement on this test case at least. Nice work, Tom!

Joe
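For anyone wanting to try the same tuning, a minimal sketch of the workflow described above (the table and column names here are illustrative, not from a real schema; note also that in PostgreSQL 8.0 and later this setting was renamed work_mem):

```sql
-- Check the current per-sort/per-hash memory budget.
SHOW sort_mem;

-- See which plan the optimizer picks at the current setting;
-- look for HashAggregate vs. GroupAggregate at the top of the plan.
EXPLAIN
SELECT part_id, sum(qty) FROM stock GROUP BY part_id;

-- Raise the budget for this session only (value in kilobytes)
-- and re-check whether the planner now chooses HashAggregate.
SET sort_mem TO 32000;
EXPLAIN
SELECT part_id, sum(qty) FROM stock GROUP BY part_id;

-- Return to the server default when done experimenting.
RESET sort_mem;
```

SET only affects the current session; a server-wide change would go in postgresql.conf, at the cost of allowing every sort and hash in every backend to use that much memory.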