Thread: aliases, &c in HAVING clause?

aliases, &c in HAVING clause?

From

david@fetter.org (David Fetter)

Date:

23 February 2004, 15:48:59

Kind people,

I bumped across this several times, and am wondering what SQL99 and
SQL200x have to say about column numbers or aliases in HAVING. SQL92
is fairly clear (no) but also somewhat out of date.

Here's a scenario:

SELECT foo_name, count(*) as foo_count
FROM foo
GROUP BY foo_name
HAVING foo_count > 2
ORDER BY foo_count DESC, foo_name;

Another:

SELECT foo_name, count(*) as foo_count
FROM foo
GROUP BY foo_name
HAVING 2 > 2 -- OK, so this is a little weird, but it makes sense in context.
ORDER BY 2 DESC, 1;

I know pg throws some kind of parse error when I ask for that sort of
thing, and that replacing the alias/number with its referent clears
this up, but what do later standards have to say about doing or not
doing the above?  If they're ambiguous, are there good reasons why the
above shouldn't work?

TIA for any pointers on this :)

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100    mobile: +1 415 235 3778

Transported to a surreal landscape, a young girl kills the first
woman she meets and then teams up with three complete strangers
to kill again.         Marin County newspaper's TV listing for The Wizard of Oz

Re: aliases, &c in HAVING clause?

From

Tom Lane

Date:

23 February 2004, 19:36:57

david@fetter.org (David Fetter) writes:
> I bumped across this several times, and am wondering what SQL99 and
> SQL200x have to say about column numbers or aliases in HAVING.

SQL99 not only does not allow them in GROUP BY or HAVING, but it doesn't
allow them in ORDER BY either, thereby eliminating the entire wart from
the language.  I doubt that SQL200x will reverse field on this decision.

In retrospect it was an error for us to allow aliases in GROUP BY, as
this has caused so much confusion about where they are legal.  If it
weren't for backwards-compatibility concerns, I'd vote for adopting the
SQL99 definition (no aliases in any of these clauses).

> are there good reasons why the above shouldn't work?

Well, you showed one: interpreting "2 > 2" as anything other than a
constant expression is just plain weird.  But the real reason why this
is bogus is that it violates the fundamental conceptual model of how
SELECT works.  The SELECT output list is not supposed to be computed
until after all the other steps are complete, and therefore it's
improper to assume its results are available in GROUP BY or HAVING.
        regards, tom lane

Re: aliases, &c in HAVING clause?

From

David Fetter

Date:

23 February 2004, 19:57:17

On Mon, Feb 23, 2004 at 06:36:48PM -0500, Tom Lane wrote:
> david@fetter.org (David Fetter) writes:
> > I bumped across this several times, and am wondering what SQL99
> > and SQL200x have to say about column numbers or aliases in HAVING.
> 
> SQL99 not only does not allow them in GROUP BY or HAVING, but it
> doesn't allow them in ORDER BY either, thereby eliminating the
> entire wart from the language.  I doubt that SQL200x will reverse
> field on this decision.

Right.

> In retrospect it was an error for us to allow aliases in GROUP BY,
> as this has caused so much confusion about where they are legal.  If
> it weren't for backwards-compatibility concerns, I'd vote for
> adopting the SQL99 definition (no aliases in any of these clauses).

Hmm.  If I were going to flog for the other side, it would be on the
grounds of error prevention with aliases.

Just in general, isn't it better to write a piece of code (here, a
possibly-complicated aggregate) just once and refer to it elsewhere
rather than have to write a separate copy of it everywhere it's used?
That's one of the fundamental design principles on which structured
programming, subroutines and objects are based.

> > are there good reasons why the above shouldn't work?
> 
> Well, you showed one: interpreting "2 > 2" as anything other than a
> constant expression is just plain weird.

Um, and the rest of SQL is...normal?!? ;)

> But the real reason why this is bogus is that it violates the
> fundamental conceptual model of how SELECT works.  The SELECT output
> list is not supposed to be computed until after all the other steps
> are complete, and therefore it's improper to assume its results are
> available in GROUP BY or HAVING.

Maybe I'm missing something, but if I recall right, the "fundamental
principle" of a SELECT is that it's a PROJECTion & RESTRICTion of the
full cross-product of the relations in question.  I haven't checked,
but I'm pretty certain that PostgreSQL doesn't do things that way at
the implementation level.

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

Re: aliases, &c in HAVING clause?

From

Tom Lane

Date:

23 February 2004, 20:09:26

David Fetter <david@fetter.org> writes:
> Just in general, isn't it better to write a piece of code (here, a
> possibly-complicated aggregate) just once and refer to it elsewhere
> rather than have to write a separate copy of it everywhere it's used?

In general, you do that with subselects.  Having inconsistent scoping
rules for ORDER BY is a much less desirable way to attack it.
SELECT * FROM  (SELECT foobar(baz) AS x FROM ...) ssGROUP BY x HAVING x > 44;

>> But the real reason why this is bogus is that it violates the
>> fundamental conceptual model of how SELECT works.  The SELECT output
>> list is not supposed to be computed until after all the other steps
>> are complete, and therefore it's improper to assume its results are
>> available in GROUP BY or HAVING.

> but I'm pretty certain that PostgreSQL doesn't do things that way at
> the implementation level.

It does anywhere that you can tell the difference.  Try a SELECT with
side-effect-producing output expressions.  As an example, would you be
happy if the following were prone to getting divide-by-zero errors?
SELECT x, 1.0 / sum(x) FROM t GROUP BY x HAVING sum(x) != 0;
        regards, tom lane