Re: Parallel grouping sets - Mailing list pgsql-hackers
From: Jesse Zhang
Subject: Re: Parallel grouping sets
Date:
Msg-id: CAGf+fX5288z0Gt_UL9FYHvNcySvG1wrKxPQ7+zretbcaUDc5Jw@mail.gmail.com
In response to: Re: Parallel grouping sets (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Parallel grouping sets; Re: Parallel grouping sets
List: pgsql-hackers
On Thu, Jan 23, 2020 at 2:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Jan 19, 2020 at 2:23 PM Richard Guo <riguo@pivotal.io> wrote:
> >
> > I realized that there are two patches in this thread that are
> > implemented according to different methods, which causes confusion.
> >
>
> Both the idea seems to be different. Is the second approach [1]
> inferior for any case as compared to the first approach? Can we keep
> both approaches for parallel grouping sets, if so how? If not, then
> won't the code by the first approach be useless once we commit second
> approach?
>
>
> [1] - https://www.postgresql.org/message-id/CAN_9JTwtTTnxhbr5AHuqVcriz3HxvPpx1JWE--DCSdJYuHrLtA%40mail.gmail.com
>

I glanced over both patches. Just the opposite, I have a hunch that v3
is always better than v5. Here's my 6-minute understanding of both.

v5 (the one with a simple partial aggregate) works by pushing a little
bit of partial aggregation onto the workers and performing a grouping
aggregate above the gather. This has two interesting outcomes: we can
execute an unmodified partial aggregate on the workers, and execute an
almost unmodified rollup aggregate once the trans values are gathered.
A parallel plan for a query like

    SELECT count(*) FROM foo GROUP BY GROUPING SETS ((a), (b), (c), ());

can be

    Finalize GroupAggregate
      Output: count(*)
      Group Key: a
      Group Key: b
      Group Key: c
      Group Key: ()
      ->  Gather Merge
            ->  Partial GroupAggregate
                  Output: PARTIAL count(*)
                  Group Key: a, b, c
                  ->  Sort
                        Sort Key: a, b, c
                        ->  Parallel Seq Scan on foo

v3 ("the one with grouping set id") really turns the plan from a tree
into a multiplexed pipe: we can execute the grouping aggregate on the
workers, but only partially. When we emit the trans values, we also
tag each tuple with a grouping set id. After the gather, we finalize
the aggregates with a modified grouping aggregate. Unlike a non-split
grouping aggregate, the finalize grouping aggregate does not "flow"
the results from one rollup to the next. Instead, each group only
advances on partial inputs tagged for that group.

    Finalize HashAggregate
      Output: count(*)
      Dispatched by: (GroupingSetID())
      Group Key: a
      Group Key: b
      Group Key: c
      ->  Gather
            ->  Partial GroupAggregate
                  Output: PARTIAL count(*), GroupingSetID()
                  Group Key: a
                  Sort Key: b
                    Group Key: b
                  Sort Key: c
                    Group Key: c
                  ->  Sort
                        Sort Key: a
                        ->  Parallel Seq Scan on foo

Note that for the first approach to be viable, the partial aggregate
*has to* use a group key that is the union of all grouping sets. In
cases where the individual columns have low cardinality but the joint
cardinality is high (say columns a, b, c each have 16 distinct values,
but they are independent, so there are 4096 distinct values on
(a, b, c)), this results in fairly high traffic through the shm tuple
queue.

Cheers,
Jesse
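P.S. A minimal sketch for reproducing the low-individual / high-joint
cardinality scenario above; the data distribution, row count, and GUC
setting here are illustrative assumptions, not taken from either patch:

    CREATE TABLE foo (a int, b int, c int);

    -- a, b, c each take 16 distinct values and vary independently,
    -- so (a, b, c) has 16 * 16 * 16 = 4096 distinct combinations.
    INSERT INTO foo
    SELECT g % 16, (g / 16) % 16, (g / 256) % 16
    FROM generate_series(0, 999999) AS g;
    ANALYZE foo;

    SET max_parallel_workers_per_gather = 2;

    EXPLAIN (VERBOSE, COSTS OFF)
    SELECT count(*) FROM foo
    GROUP BY GROUPING SETS ((a), (b), (c), ());

With the v5 shape, each worker pushes all 4096 (a, b, c) partial groups
through the shared-memory tuple queue, only for the finalize step to
collapse them back down to 16 + 16 + 16 + 1 = 49 output rows.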