Re: Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement - Mailing list pgsql-sql
From | David D. Kilzer |
---|---|
Subject | Re: Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement |
Date | |
Msg-id | 20010514094215.C17879@elbonia.lubricants-oil.com Whole thread Raw |
In response to | Re: Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-sql |
Tom, Thanks for the reply, and for correcting my broken SQL. :^) A couple days after I sent my own follow-up message, I realized how silly my question was. I'll let future questions sit a bit longer next time. To make up for my silly question, here is a function + aggregate I created while working on a different aspect of my original problem. Perhaps someone will find this useful. The function, 'booland', returns the logical 'AND' of two input values. DROP FUNCTION "booland" (bool, bool); CREATE FUNCTION "booland" (bool, bool) RETURNS bool AS 'BEGIN RETURN $1 AND $2; END;' LANGUAGE 'PLPGSQL'; The aggregate, 'logical_and', returns the logical 'AND' of all values in a column in an aggregated query. DROP AGGREGATE logical_and bool; CREATE AGGREGATE logical_and ( SFUNC1 = booland, BASETYPE = bool, STYPE1 = bool, INITCOND1 = 't' ); Obviously, this concept could be extended for a logical 'OR' function and aggregate. Dave On Fri, May 11, 2001 at 12:33:52PM -0400, Tom Lane wrote: > "David D. Kilzer" <ddkilzer@lubricants-oil.com> writes: > > [ wants to write an aggregate function that returns its last input ] > > The SQL model of query processing has a very definite view of the stages > of processing: first group by, then aggregate, and last order by. Tuple > ordering is irrelevant according to the basic semantics of the language. > Probably the SQL authors would have left out ORDER BY entirely if they > could have got away with it, but instead they made it a vestigial > appendage that is only allowed at the very last instant before query > outputs are forwarded to a client application. > > Thus, it is very bad form to write an aggregate that depends on the > order it sees its inputs in. This won't be changed, because it's part > of the nature of the language. > > In PG 7.1 it's possible to hack around this by ordering the result of > a subselect-in-FROM: > > SELECT orderedagg(ss.x) FROM (select x from tab order by y) ss; > > which is a gross violation of the letter and spirit of the spec, and > should not be expected to be portable to other DBMSes; but it gets the > job done if you are intent on writing an ordering-dependent aggregate. > > However, I don't see any good way to combine this with grouping, since > if you apply GROUP BY to the output of the subselect you'll lose the > ordering again. > > > SELECT r.personid AS personid > > ,SUM(r.laps) AS laps > > ,COUNT(DISTINCT r.id) AS nightsraced > > ,(SELECT r.carid > > FROM race r > > WHERE r.personid = 14 > > ORDER BY r.date DESC > > LIMIT 1) AS carid > > FROM race r > > WHERE r.personid = 14 > > GROUP BY r.personid > > ORDER BY r.date; > > This is likely to be reasonably efficient, actually, since the subselect > will be evaluated only once per output group --- in fact, as you've > written it it'll only be evaluated once, period, since it has no > dependencies on the outer query. More usually you'd probably do > > ,(SELECT r2.carid > FROM race r2 > WHERE r2.personid = r.personid > ORDER BY r2.date DESC > LIMIT 1) AS carid > > so that the result tracks the outer query, and in this form it'd be > redone once per output row. > > regards, tom lane