Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses - Mailing list pgsql-hackers
From | Thomas Mayer |
---|---|
Subject | Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses |
Date | |
Msg-id | 534AA430.9020105@student.kit.edu Whole thread Raw |
In response to | Window function optimisation, allow pushdowns of items matching PARTITION BY clauses (David Rowley <dgrowley@gmail.com>) |
Responses |
Re: Window function optimisation, allow pushdowns of items matching
PARTITION BY clauses
|
List | pgsql-hackers |
Hello David, thanks for your work. The results look promising. What I'm missing is a test case with multiple fields in the partition by clauses: -- should push down, because partid is part of all PARTITION BY clauses explain analyze select partid,n,m from ( select partid, count(*) over (partition by partid) n, count(*) over (partitionby partid, partid+0) m from winagg ) winagg where partid=1; current production 9.3.4 is returning QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SubqueryScan on winagg (cost=350955.11..420955.11 rows=20 width=20) (actual time=2564.360..3802.413 rows=20 loops=1) Filter: (winagg.partid = 1) Rows Removed by Filter: 1999980 -> WindowAgg (cost=350955.11..395955.11 rows=2000000 width=4) (actual time=2564.332..3657.051 rows=2000000 loops=1) -> Sort (cost=350955.11..355955.11 rows=2000000 width=4) (actual time=2564.320..2802.444 rows=2000000 loops=1) Sort Key: winagg_1.partid, ((winagg_1.partid + 0)) Sort Method: external sort Disk: 50840kB -> WindowAgg (cost=0.43..86948.43 rows=2000000 width=4) (actual time=0.084..1335.081 rows=2000000 loops=1) -> Index Only Scan using winagg_partid_idxon winagg winagg_1 (cost=0.43..51948.43 rows=2000000 width=4) (actual time=0.051..378.232 rows=2000000 loops=1) Heap Fetches: 0 "Index Only Scan" currently returns all rows (without pushdown) on current production 9.3.4. What happens with the patch you provided? -- Already Part of your tests: -- should NOT push down, because partid is NOT part of all PARTITION BY clauses explain analyze select partid,n,m from ( select partid, count(*) over (partition by partid) n, count(*) over (partitionby partid+0) m from winagg ) winagg where partid=1; Reordering the fields should also be tested: -- should push down, because partid is part of all PARTITION BY clauses -- here: partid at the end explain analyze select partid,n,m from ( select partid, count(*) over (partition by partid) n, count(*) over (partitionby partid+0, partid) m from winagg ) winagg where partid=1; -- should push down, because partid is part of all PARTITION BY clauses -- here: partid in the middle explain analyze select partid,n,m from ( select partid, count(*) over (partition by partid) n, count(*) over (partitionby partid+0, partid, partid+1) m from winagg ) winagg where partid=1; Best regards Thomas Am 13.04.2014 13:32, schrieb David Rowley: > On this thread > http://www.postgresql.org/message-id/52C6F712.6040804@student.kit.edu > there was some discussion around allowing push downs of quals that > happen to be in every window clause of the sub query. I've quickly put > together a patch which does this (see attached) > > I'm posting this just mainly to let Thomas know that I'm working on it, > per his request on the other thread. > > The patch seems to work with all my test cases, and I've not quite > gotten around to thinking of any more good cases to throw at it. > > Oh and I know that my > function var_exists_in_all_query_partition_by_clauses has no business in > allpaths.c, I'll move it out as soon as I find a better home for it. > > Here's my test case: > > drop table if exists winagg; > > create table winagg ( > id serial not null primary key, > partid int not null > ); > > insert into winagg (partid) select x.x % 100000 from > generate_series(1,2000000) x(x); > > > create index winagg_partid_idx on winagg(partid); > > > -- Should push: this should push WHERE partid=1 to the inner query as > partid is in the only parition by clause in the query. > explain analyze select partid,n from (select partid,count(*) over > (partition by partid) n from winagg) winagg where partid=1; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------- > WindowAgg (cost=4.58..82.23 rows=20 width=4) (actual > time=0.196..0.207 rows=20 loops=1) > -> Bitmap Heap Scan on winagg (cost=4.58..81.98 rows=20 width=4) > (actual time=0.102..0.170 rows=20 loops=1) > Recheck Cond: (partid = 1) > Heap Blocks: exact=20 > -> Bitmap Index Scan on winagg_partid_idx (cost=0.00..4.58 > rows=20 width=0) (actual time=0.084..0.084 rows=20 loops=1) > Index Cond: (partid = 1) > Planning time: 0.208 ms > Total runtime: 0.276 ms > (8 rows) > > -- Should not push: Added a +0 to partition by clause. > explain analyze select partid,n from (select partid,count(*) over > (partition by partid + 0) n from winagg) winagg where partid=1; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------- > Subquery Scan on winagg (cost=265511.19..330511.19 rows=20 width=12) > (actual time=2146.642..4257.267 rows=20 loops=1) > Filter: (winagg.partid = 1) > Rows Removed by Filter: 1999980 > -> WindowAgg (cost=265511.19..305511.19 rows=2000000 width=4) > (actual time=2146.614..4099.169 rows=2000000 loops=1) > -> Sort (cost=265511.19..270511.19 rows=2000000 width=4) > (actual time=2146.587..2994.993 rows=2000000 loops=1) > Sort Key: ((winagg_1.partid + 0)) > Sort Method: external merge Disk: 35136kB > -> Seq Scan on winagg winagg_1 (cost=0.00..28850.00 > rows=2000000 width=4) (actual time=0.025..418.306 rows=2000000 loops=1) > Planning time: 0.249 ms > Total runtime: 4263.933 ms > (10 rows) > > > -- Should not push: Add a window clause (which is not used) that has a > partition by clause that does not have partid > explain analyze select partid,n from (select partid,count(*) over > (partition by partid) n from winagg window stopPushDown as (partition by > id)) winagg where partid=1; > > -- Should not push: 1 window clause does not have partid > explain analyze select partid,n from (select partid,count(*) over > (partition by partid) n from winagg window stopPushDown as (order id)) > winagg where partid=1; > > -- Should not push: 1 window clause does not have partid > explain analyze select partid,n from (select partid,count(*) over > (partition by partid) n from winagg window stopPushDown as ()) winagg > where partid=1; > > As of now the patch is a couple of hours old, I've not even bothered to > run the regression tests yet, let alone add any new ones. > > Comments are welcome... > > Regards > > David Rowley > -- ====================================== Thomas Mayer Durlacher Allee 61 D-76131 Karlsruhe Telefon: +49-721-2081661 Fax: +49-721-72380001 Mobil: +49-174-2152332 E-Mail: thomas.mayer@student.kit.edu =======================================
pgsql-hackers by date: