Thread: description of Aggregate Expressions
In PostgreSQL 12.1 Documentation chapter 4.2.7. Aggregate Expressions it says The syntax of an aggregate expression is one of the following: ... aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ] ... I believe this is incorrect in the case where the DISTINCT is on a comma-separated list of expressions. It would imply that this is legal select count(DISTINCT parent_id , name) from mytable but that is rejected with ERROR: function count(bigint, text) does not exist whereas select count(DISTINCT ( parent_id , name) ) from mytable is accepted. So I think to handle all cases the line in the doc should read aggregate_name (DISTINCT ( expression [ , ... ] ) [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ] I don't know how to indicate that those extra parentheses can be omitted if the list has only one expression. Cheers, John Lumby
On Thu, Dec 5, 2019 at 3:18 PM John Lumby <johnlumby@hotmail.com> wrote:
In PostgreSQL 12.1 Documentation chapter 4.2.7. Aggregate Expressions it says
The syntax of an aggregate expression is one of the following:
...
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
...
I believe this is incorrect in the case where the DISTINCT is on a comma-separated list of expressions.
It would imply that this is legal
It is...you didn't get a syntax error.
select count(DISTINCT parent_id , name) from mytable
but that is rejected with
ERROR: function count(bigint, text) does not exist
The error is that while the query is syntactically correct in order to execute it as written a function would need to exist that does not. As far as a general syntax diagram goes it has correctly communicated what is legal.
whereas
select count(DISTINCT ( parent_id , name) ) from mytable
is accepted.
Correct, converting the two individual columns into a "tuple" allows the default tuple distinct-making infrastructure to be used to execute the query.
So I think to handle all cases the line in the doc should read
aggregate_name (DISTINCT ( expression [ , ... ] ) [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
I don't know how to indicate that those extra parentheses can be omitted if the list has only one expression.
Then I would have to say the proposed solution to this edge case is worse than the problem. I also don't expect there to be a clean solution to dealing with the complexities of expressions at the syntax diagram level.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Thu, Dec 5, 2019 at 3:18 PM John Lumby <johnlumby@hotmail.com> wrote: >> whereas >> select count(DISTINCT ( parent_id , name) ) from mytable >> is accepted. > Correct, converting the two individual columns into a "tuple" allows the > default tuple distinct-making infrastructure to be used to execute the > query. Yeah. This might be more intelligible if it were written select count(DISTINCT ROW(parent_id, name) ) from mytable However, the SQL committee in their finite wisdom have decreed that the ROW keyword is optional. (As long as there's more than one column expression; the need for that special case is another reason why omitting ROW isn't really a nice thing to do.) regards, tom lane
On 12/05/19 18:06, David G. Johnston wrote:
Hmm, even though true, I think this is unhelpful.
If a reference document states that the syntax for a something-or-other construct is one of the following diagrams,
then I expect that the diagrams are valid for *every* kind of something-or-other, not just some.
Yet the diagram I quote always results in being rejected in the case of COUNT -
which I consider to be as good as saying it is invalid syntax.
Yes, I see what I suggested is not ideal either. But I think something needs to be changed.
How about replacing "expression [ , ... ]" by "parameter_list" in the description, and then stating that parameter_list can be either a single expression or , if the particular aggregate function accepts it (for which, consult that function's reference), a comma-separated list of expressions.
On Thu, Dec 5, 2019 at 3:18 PM John Lumby <johnlumby@hotmail.com> wrote:In PostgreSQL 12.1 Documentation chapter 4.2.7. Aggregate Expressions it says
The syntax of an aggregate expression is one of the following:
...
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
...
I believe this is incorrect in the case where the DISTINCT is on a comma-separated list of expressions.
It would imply that this is legalIt is...you didn't get a syntax error.
Hmm, even though true, I think this is unhelpful.
If a reference document states that the syntax for a something-or-other construct is one of the following diagrams,
then I expect that the diagrams are valid for *every* kind of something-or-other, not just some.
Yet the diagram I quote always results in being rejected in the case of COUNT -
which I consider to be as good as saying it is invalid syntax.
select count(DISTINCT parent_id , name) from mytable
but that is rejected with
ERROR: function count(bigint, text) does not existThe error is that while the query is syntactically correct in order to execute it as written a function would need to exist that does not. As far as a general syntax diagram goes it has correctly communicated what is legal.whereas
select count(DISTINCT ( parent_id , name) ) from mytable
is accepted.Correct, converting the two individual columns into a "tuple" allows the default tuple distinct-making infrastructure to be used to execute the query.
So I think to handle all cases the line in the doc should read
aggregate_name (DISTINCT ( expression [ , ... ] ) [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
I don't know how to indicate that those extra parentheses can be omitted if the list has only one expression.Then I would have to say the proposed solution to this edge case is worse than the problem. I also don't expect there to be a clean solution to dealing with the complexities of expressions at the syntax diagram level.
Yes, I see what I suggested is not ideal either. But I think something needs to be changed.
How about replacing "expression [ , ... ]" by "parameter_list" in the description, and then stating that parameter_list can be either a single expression or , if the particular aggregate function accepts it (for which, consult that function's reference), a comma-separated list of expressions.
David J.
John Lumby <johnlumby@hotmail.com> writes: > On 12/05/19 18:06, David G. Johnston wrote: >> Then I would have to say the proposed solution to this edge case is worse than the problem. I also don't expect thereto be a clean solution to dealing with the complexities of expressions at the syntax diagram level. > Yes, I see what I suggested is not ideal either. But I think something needs to be changed. > How about replacing "expression [ , ... ]" by "parameter_list" in the description, and then stating that parameter_listcan be either a single expression or , if the particular aggregate function accepts it (for which, consultthat function's reference), a comma-separated list of expressions. That's just as wrong. As we tried to explain before, the parenthesized-list syntax is a row constructor, so it only works in cases where the aggregate function can accept a composite argument type. Most don't. Moreover, the very same thing holds in *any* expression context, not only aggregate arguments. So if we took this seriously there would have to be a lot of other places plastered with equally confusing/unhelpful addenda. regards, tom lane
Tom Lane wrote > > How about replacing "expression [ , ... ]" by "parameter_list" in the description, > > and then stating that parameter_list can be either a single expression or , > > if the particular aggregate function accepts it (for which, consult that function's reference), > > a comma-separated list of expressions. > > That's just as wrong. As we tried to explain before, the > parenthesized-list syntax is a row constructor, so it only works > in cases where the aggregate function can accept a composite > argument type. Most don't. > But surely not *all* cases of a multi-expression parameter list of an aggregate function are row constructors are they? What about select parent_id, name, GROUPING(parent_id , name), count(*) FROM mytable GROUP BY ROLLUP(parent_id , name); In the piece "GROUPING(parent_id , name)" , is "(parent_id , name)" a row constructor or a list of two expressions? Or are you saying those are one and the same thing? Cheers, John
John Lumby wrote: < < Tom Lane wrote < < > > How about replacing "expression [ , ... ]" by "parameter_list" in the description, < > > and then stating that parameter_list can be either a single expression or , < > > if the particular aggregate function accepts it (for which, consult that function's reference), < > > a comma-separated list of expressions. < > < > That's just as wrong. As we tried to explain before, the < > parenthesized-list syntax is a row constructor, so it only works < > in cases where the aggregate function can accept a composite < > argument type. Most don't. < > < < But surely not *all* cases of a multi-expression parameter list of an aggregate function are row constructors are they? What about < < select parent_id, name, GROUPING(parent_id , name), count(*) FROM mytable GROUP BY ROLLUP(parent_id , name); < < In the piece "GROUPING(parent_id , name)" , < is "(parent_id , name)" a row constructor or a list of two expressions? < < Or are you saying those are one and the same thing? < I think I can answer my own question - No they are not the same - because : select parent_id, name, GROUPING(ROW(parent_id , name)), count(*) FROM mytable GROUP BY ROLLUP(parent_id , name); ERROR: arguments to GROUPING must be grouping expressions of the associated query level LINE 1: select parent_id, name, GROUPING(ROW(parent_id , name)), cou... ^