Re: [HACKERS] Partition-wise aggregation/grouping - Mailing list pgsql-hackers
From | Jeevan Chalke |
---|---|
Subject | Re: [HACKERS] Partition-wise aggregation/grouping |
Date | |
Msg-id | CAM2+6=W+uRuQYLnH1kaPirwMTDb-vFJKudwB=vLumvyqYX9RpQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Partition-wise aggregation/grouping (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] Partition-wise aggregation/grouping
Re: [HACKERS] Partition-wise aggregation/grouping Re: [HACKERS] Partition-wise aggregation/grouping Re: [HACKERS] Partition-wise aggregation/grouping |
List | pgsql-hackers |
Hi,
In this attached version, I have rebased my changes over new design of partially_grouped_rel. The preparatory changes of adding partially_grouped_rel are in 0001.
Also to minimize finalization code duplication, I have refactored them into two separate functions, finalize_sorted_partial_agg_path() and finalize_hashed_partial_agg_path(). I need to create these two functions as current path creation order in like,
Sort Agg Path
Sort Agg Path - Parallel Aware (Finalization needed here)
Hash Agg Path
Hash Agg Path - Parallel Aware (Finalization needed here)
And if we club those finalizations together, then path creation order will be changed and it may result in the existing plan changes.
Let me know if that's OK, I will merge them together as they are distinct anyways. These changes are part of 0002.
0003 - 0006 are refactoring patches as before.
0007 is the main patch per new design. I have removed create_partition_agg_paths() altogether as finalization code is reused. Also, renamed preferFullAgg with forcePartialAgg as we forcefully needed a partial path from nested level if the parent is doing a partial aggregation. add_single_path_to_append_rel() is no more exists and also there is no need to pass OtherUpperPathExtraData to add_paths_to_append_rel().
0008 - 0009, testcase and postgres_fdw changes.
Please have a look at new changes and let me know if I missed any.
Thanks
--
In this attached version, I have rebased my changes over new design of partially_grouped_rel. The preparatory changes of adding partially_grouped_rel are in 0001.
Also to minimize finalization code duplication, I have refactored them into two separate functions, finalize_sorted_partial_agg_path() and finalize_hashed_partial_agg_path(). I need to create these two functions as current path creation order in like,
Sort Agg Path
Sort Agg Path - Parallel Aware (Finalization needed here)
Hash Agg Path
Hash Agg Path - Parallel Aware (Finalization needed here)
And if we club those finalizations together, then path creation order will be changed and it may result in the existing plan changes.
Let me know if that's OK, I will merge them together as they are distinct anyways. These changes are part of 0002.
0003 - 0006 are refactoring patches as before.
0007 is the main patch per new design. I have removed create_partition_agg_paths() altogether as finalization code is reused. Also, renamed preferFullAgg with forcePartialAgg as we forcefully needed a partial path from nested level if the parent is doing a partial aggregation. add_single_path_to_append_rel() is no more exists and also there is no need to pass OtherUpperPathExtraData to add_paths_to_append_rel().
0008 - 0009, testcase and postgres_fdw changes.
Please have a look at new changes and let me know if I missed any.
Thanks
On Fri, Feb 2, 2018 at 7:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Feb 2, 2018 at 8:25 AM, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
>> The problem is that create_partition_agg_paths() is doing *exactly*
>> same thing that add_paths_to_grouping_rel() is already doing inside
>> the blocks that say if (grouped_rel->partial_pathlist). We don't need
>> two copies of that code. Both of those places except to take a
>> partial path that has been partially aggregated and produce a
>> non-partial path that is fully aggregated. We do not need or want two
>> copies of that code.
>
> OK. Got it.
>
> Will try to find a common place for them and will also check how it goes
> with your suggested design change.
>
>> Here's another way to look at it. We have four kinds of things.
>>
>> 1. Partially aggregated partial paths
>> 2. Partially aggregated non-partial paths
>> 3. Fully aggregated partial paths
>> 4. Fully aggregated non-partial paths
So in the new scheme I'm proposing, you've got a partially_grouped_rel
and a grouped_rel. So all paths of type #1 go into
partially_grouped_rel->partial_pathlist, paths of type #2 go into
partially_grouped_rel->pathlist, type #3 (if we have any) goes into
grouped_rel->partial_pathlist, and type #4 goes into
grouped_rel->pathlist.
> add_paths_to_append_rel() -> generate_mergeappend_paths() does not consider
> partial_pathlist. Thus we will never see MergeAppend over parallel scan
> given by partial_pathlist. And thus plan like:
> -> Gather Merge
> -> MergeAppend
> is not possible with current HEAD.
>
> Are you suggesting we should implement that here? I think that itself is a
> separate task.
Oh, I didn't realize that wasn't working already. I agree that it's a
separate task from this patch, but it's really too bad that it doesn't
already work.
--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: