Re: GROUP BY DISTINCT - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: GROUP BY DISTINCT |
Date | |
Msg-id | 35077b31-2d62-1e31-0e2e-ddb52d590b73@enterprisedb.com Whole thread Raw |
In response to | Re: GROUP BY DISTINCT (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: GROUP BY DISTINCT
|
List | pgsql-hackers |
On 3/18/21 6:25 PM, Tomas Vondra wrote: > On 3/16/21 3:52 PM, Tomas Vondra wrote: >> >> >> On 3/16/21 9:21 AM, Vik Fearing wrote: >>> On 3/13/21 12:33 AM, Tomas Vondra wrote: >>>> Hi Vik, >>>> >>>> The patch seems quite ready, I have just two comments. >>> >>> Thanks for taking a look. >>> >>>> 1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the >>>> documentation? Now the index points just to the SELECT DISTINCT part. >>> >>> Good idea; I never think about the index. >>> >>>> 2) The part in gram.y that wraps/unwraps the boolean flag as an integer, >>>> in order to stash it in the group lists is rather ugly, IMHO. It forces >>>> all the places handling the list to be aware of this (there are not >>>> many, but still ...). And there are no other places doing (bool) intVal >>>> so it's not like there's a precedent for this. >>> >>> There is kind of a precedent for it, I was copying off of TriggerEvents >>> and func_alias_clause. >>> >> >> I see. I was looking for "(bool) intVal" but you're right TriggerEvents >> code does something similar. >> >>>> I think the clean solution is to make group_clause produce a struct with >>>> two fields, and just use that. Not sure how invasive that will be >>>> outside gram.y, though. >>> >>> I didn't want to create a whole new parse node for it, but Andrew Gierth >>> pointed me towards SelectLimit so I did it like that and I agree it is >>> much cleaner. >>> >> >> I agree, that's much cleaner. >> >>>> Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I >>>> wonder if we can come up with some clearer names, describing the context >>>> of those types. >>> >>> I turned this into an enum for ALL/DISTINCT/default and the caller can >>> choose what it wants to do with default. I think that's a lot cleaner, >>> too. Maybe DISTINCT ON should be changed to fit in that? I left it >>> alone for now. >>> >> >> I think DISTINCT ON is a different kind of animal, because that is a >> list of expressions, not just a simple enum state. >> >>> I also snuck in something that all of us overlooked which is outputting >>> the DISTINCT in ruleutils.c. I didn't add a test for it but that would >>> have been an unfortunate bug. >>> >> >> Oh! >> >>> New patch attached, rebased on 15639d5e8f. >>> >> >> Thanks. At this point it seems fine to me, no further comments. >> > > Pushed. Thanks for the patch. > Hmmm, this seems to fail on lapwing with this error: parse_agg.c: In function 'expand_grouping_sets': parse_agg.c:1851:23: error: value computed is not used [-Werror=unused-value] cc1: all warnings being treated as errors That line is this: foreach_delete_current(result, cell); and I don't see how any of the values close by could be unused ... The only possibility I can think of is some sort of issue in the old-ish gcc release (4.7.2). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: