Thread: aliases break my query
These two queries are exactly alike. The first one uses aliases except for the order by. The second uses aliases also for the order by. The third uses whole names. The third has the behavior I want. Someone please tell me what I am doing wrong. I don't want to have to use whole names for my query. The data for the tables are at the end. playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = ta.a) from tablea ta, tableb tb order by tablea.a; a|b|c|?column? -+-+-+-------- 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 1 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 1 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 1 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 1 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 1 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 1 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 1 3|4|5| 0 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 1 4|5|4| 0 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 1 1|2| | 0 2|3|4| 0 3|4|5| 0 4|5|4| 0 (80 rows) playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = ta.a) from tablea ta, tableb tb order by ta.a; a|b|c|?column? -+-+-+-------- 1|2| | 0 1|2| | 0 1|2| | 0 1|2| | 0 1|2| | 0 2|3|4| 1 2|3|4| 1 2|3|4| 0 2|3|4| 0 2|3|4| 0 3|4|5| 0 3|4|5| 0 3|4|5| 1 3|4|5| 0 3|4|5| 0 4|5|4| 0 4|5|4| 0 4|5|4| 0 4|5|4| 1 4|5|4| 0 (20 rows) playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz) where tableb.yy = tablea.a) order by tablea.a; a|b|c|?column? -+-+-+-------- 1|2| | 0 2|3|4| 2 3|4|5| 1 4|5|4| 1 (4 rows) playpen=> playpen=> select * from tablea; a|b|c -+-+- 1|2| 2|3|4 3|4|5 4|5|4 (4 rows) playpen=> select * from tableb; yy|zz --+--2| 42| 53| 94|145|15 (5 rows)
Joseph Shraibman <jks@selectacast.net> writes: > These two queries are exactly alike. The first one uses aliases except > for the order by. The second uses aliases also for the order by. The > third uses whole names. The third has the behavior I want. I think you are confusing yourself by leaving out FROM clauses. In particular, with no FROM for the inner SELECT it's not real clear what should happen there. I can tell you what *is* happening, but who's to say if it's right or wrong? > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > ta.a) from tablea ta, tableb tb order by tablea.a; [ produces 80 rows ] > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > ta.a) from tablea ta, tableb tb order by ta.a; [ produces 20 rows ] The difference between these two is that by explicitly specifying "tablea" in the order-by clause, you've created a three-way join, as if you had written "from tablea ta, tableb tb, tablea tablea". Once you write an alias in a from-clause entry, you must refer to that from-clause entry by its alias, not by its true table name. Meanwhile, what of the inner select? It has no FROM clause *and* no valid table names. The only way to interpret the names in it is as references to the outer select. So, on any given iteration of the outer select, the inner select collapses to constants. It looks like "SELECT count(constant1) WHERE constant2 = constant3" and so you get either 0 or 1 depending on whether tb.yy and ta.a from the outer scan are different or equal. > playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz) > where tableb.yy = tablea.a) order by tablea.a; [ produces 4 rows ] Here the outer select is not a join at all --- it mentions only tablea, so you are going to get one output for each tablea row. The inner select looks like "select count (zz) FROM tableb WHERE yy = <constant>", so you get an actual scan of tableb for each iteration of the outer scan. It's not very clear from these examples what you actually wanted to have happen, but I suggest that you will have better luck if you specify explicit FROM lists in both the inner and outer selects, and be careful that each variable you use clearly refers to exactly one of the FROM-list entries. regards, tom lane
> > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > > ta.a) from tablea ta, tableb tb order by tablea.a; > [ produces 80 rows ] > > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > > ta.a) from tablea ta, tableb tb order by ta.a; > [ produces 20 rows ] > > playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz) > > where tableb.yy = tablea.a) order by tablea.a; > [ produces 4 rows ] Once again, I think that we *really* need to discuss whether implicit range table entries in SELECT are a good idea. We invariably get a question like this every week and invariably the answer is "if you give a table an alias you *must* refer to it by that alias". (I'm sure Tom has this reply automated by now.) I claim the only thing that buys is confusion for very little convenience at the other end. Stop the madness! :) -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Peter Eisentraut <e99re41@DoCS.UU.SE> writes: > Once again, I think that we *really* need to discuss whether implicit > range table entries in SELECT are a good idea. We invariably get a > question like this every week and invariably the answer is "if you give a > table an alias you *must* refer to it by that alias". (I'm sure Tom has > this reply automated by now.) No, this one was actually a pretty original way of shooting oneself in the foot ;-). I thought the interesting point was the confusion between whether variables in the inner select were supposed to be local to the inner select or references to the outer select. I'm not sure getting rid of implicit rangetable entries would've helped prevent that. > I claim the only thing that buys is > confusion for very little convenience at the other end. > > Stop the madness! :) I doubt that it's worth breaking a lot of existing applications for. At one time Bruce had made some patches to emit informative notice messages about implicit FROM entries, but that got turned off again for reasons that I forget... regards, tom lane
Peter Eisentraut wrote: > > > > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > > > ta.a) from tablea ta, tableb tb order by tablea.a; > > [ produces 80 rows ] > > > > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > > > ta.a) from tablea ta, tableb tb order by ta.a; > > [ produces 20 rows ] > > > > playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz) > > > where tableb.yy = tablea.a) order by tablea.a; > > [ produces 4 rows ] > > Once again, I think that we *really* need to discuss whether implicit > range table entries in SELECT are a good idea. What is an "implicit range table entry"? We invariably get a > question like this every week and invariably the answer is "if you give a > table an alias you *must* refer to it by that alias". Hey, I *did* do that in the second query, and that still produced extra results. I tried putting the aliases in the inner select too but that didn't help. In fact the inner select always is 4 in that case. Unless I only alias tableb in the inner query, and let it get the definition of tablea from the outer query. (I'm sure Tom has > this reply automated by now.) I claim the only thing that buys is > confusion for very little convenience at the other end. > > Stop the madness! :) > > -- > Peter Eisentraut Sernanders väg 10:115 > peter_e@gmx.net 75262 Uppsala > http://yi.org/peter-e/ Sweden
Tom Lane wrote: > > Joseph Shraibman <jks@selectacast.net> writes: > > These two queries are exactly alike. The first one uses aliases except > > for the order by. The second uses aliases also for the order by. The > > third uses whole names. The third has the behavior I want. > > I think you are confusing yourself by leaving out FROM clauses. > In particular, with no FROM for the inner SELECT it's not real clear > what should happen there. I can tell you what *is* happening, but > who's to say if it's right or wrong? > Well I assumed that the aliases would be inerited from the outer query. > > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > > ta.a) from tablea ta, tableb tb order by tablea.a; > [ produces 80 rows ] > > > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy = > > ta.a) from tablea ta, tableb tb order by ta.a; > [ produces 20 rows ] > > The difference between these two is that by explicitly specifying > "tablea" in the order-by clause, you've created a three-way join, > as if you had written "from tablea ta, tableb tb, tablea tablea". > Once you write an alias in a from-clause entry, you must refer to > that from-clause entry by its alias, not by its true table name. I guess I made the mistake of assuming that SQL is logical. I don't know what I was thinking. ;) > > Meanwhile, what of the inner select? It has no FROM clause *and* > no valid table names. The only way to interpret the names in it > is as references to the outer select. So, on any given iteration > of the outer select, the inner select collapses to constants. > It looks like "SELECT count(constant1) WHERE constant2 = constant3" > and so you get either 0 or 1 depending on whether tb.yy and ta.a > from the outer scan are different or equal. OK that sorta makes sense to be. What I want is the behavior I got with the third query (below). I want the values in table a, and then a count of how many entries in tableb have the yy field of tableb that matches that entry in tablea's a field. playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) from tableb tb where tb.yy = ta.a) from tablea ta, tableb tb group by ta.a, ta.b, ta.c order by ta.a; a|b|c|?column? -+-+-+-------- 1|2| | 0 2|3|4| 2 3|4|5| 1 4|5|4| 1 (4 rows) ... which is what I want. Thanks. > > > playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz) > > where tableb.yy = tablea.a) order by tablea.a; > [ produces 4 rows ] > > Here the outer select is not a join at all --- it mentions only tablea, > so you are going to get one output for each tablea row. The inner > select looks like "select count (zz) FROM tableb WHERE yy = <constant>", > so you get an actual scan of tableb for each iteration of the outer > scan. > > It's not very clear from these examples what you actually wanted to have > happen, but I suggest that you will have better luck if you specify > explicit FROM lists in both the inner and outer selects, and be careful > that each variable you use clearly refers to exactly one of the > FROM-list entries. > > regards, tom lane
> At one time Bruce had made some patches to emit informative notice > messages about implicit FROM entries, but that got turned off again > for reasons that I forget... It was triggered with common cases from the "outer join" syntax. It took a while to track down since it was introduced while I was working on the syntax feature :( If it *really* needs to be put back in, then we should do so with a flag so we can disable the warning at compile time, run time, and/or in the outer join parser area. But imho sprinkling the parser with warnings for allowed syntax is heading the wrong direction. If it is legal, allow it. If it is illegal, disallow it. If it is confusing for some, but works fine for others, it shouldn't become "sort of legal" with a warning. - Thomas