Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time - Mailing list pgsql-performance
From | Clem Dickey |
---|---|
Subject | Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time |
Date | |
Msg-id | iv30d7$2881$1@news.hub.org Whole thread Raw |
In response to | GROUP BY with reasonable timings in PLAN but unreasonable execution time (Clem Dickey <dickeycl@us.ibm.com>) |
Responses |
Re: GROUP BY with reasonable timings in PLAN but unreasonable execution
time
|
List | pgsql-performance |
On 07/05/2011 07:26 PM, Clem Dickey wrote: Updates after belatedly reading the "slow queries" guidelines: Version: PostgreSQL 8.4.8 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.5 20101112 (Red Hat 4.4.5-2), 64-bit The query has always been slow; the table for this test case is never updated. I don't run VACUUM but do run ANALYZE. Originally all database config parameters were the default. Since yesterday I have changed shared_buffers = 224MB effective_cache_size = 1024MB but seen no change in behavior. > Column | Type | Modifiers > --------+---------+----------- > y | integer | not null > x | integer | not null > k | integer | not null > j | integer | not null > z | integer | not null > Indexes: > "t_pkey" PRIMARY KEY, btree (j, k, x, y, z) > > The table population, in pseudocode, is this: > for x in 0..9 > for y in 0..9999 > for z in 0..29 > INSERT INTO t VALUES(y,x,0,0,z) > The query is: > > SELECT * > FROM ( > SELECT * FROM t GROUP BY j,k,x,z,y > ) AS f > NATURAL JOIN t; The EXPLAIN ANALYZE output is http://explain.depesz.com/s/KGk Notes on the analysis: 1. I see that the planner estimates that GROUP BY will reduce 300K rows to 30K, a bit odd because every row which the planner could examine is in a unique group. 2. The JOIN is expected to produce one row. I'm not sure how the planner came up with that estimate. > By contrast, placing an intermediate result in a table "u" provides a > result in about 3 seconds: => EXPLAIN ANALYZE CREATE TABLE u AS SELECT * FROM t GROUP BY j,k,x,z,y;EXPLAIN ANALYZE SELECT * FROM u NATURAL JOIN t;DROP TABLE u; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Group (cost=44508.90..49008.90 rows=30000 width=20) (actual time=1305.381..2028.385 rows=300000 loops=1) -> Sort (cost=44508.90..45258.90 rows=300000 width=20) (actual time=1305.374..1673.843 rows=300000 loops=1) Sort Key: j, k, x, z, y Sort Method: external merge Disk: 8792kB -> Seq Scan on t (cost=0.00..4911.00 rows=300000 width=20) (actual time=0.008..62.935 rows=300000 loops=1) Total runtime: 2873.590 ms (6 rows) QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=46229.86..72644.38 rows=1 width=20) (actual time=1420.527..2383.507 rows=300000 loops=1) Merge Cond: ((t.j = u.j) AND (t.k = u.k) AND (t.x = u.x) AND (t.y = u.y) AND (t.z = u.z)) -> Index Scan using t_pkey on t (cost=0.00..14877.18 rows=300000 width=20) (actual time=0.013..118.244 rows=300000 loops=1) -> Materialize (cost=46229.86..50123.52 rows=311493 width=20) (actual time=1420.498..1789.864 rows=300000 loops=1) -> Sort (cost=46229.86..47008.59 rows=311493 width=20) (actual time=1420.493..1692.988 rows=300000 loops=1) Sort Key: u.j, u.k, u.x, u.y, u.z Sort Method: external merge Disk: 8784kB -> Seq Scan on u (cost=0.00..5025.93 rows=311493 width=20) (actual time=0.018..78.850 rows=300000 loops=1) Total runtime: 2424.870 ms (9 rows) (Adding an "ANALYZE" on the temporary table improves the JOIN estimated fow count from 1 to about 299500, but does not change the plan.) Clem Dickey
pgsql-performance by date: