Thread: Unexpected Name Resolution Behavior with ORDER BY COLLATE Clause in PostgreSQL 16.2
Unexpected Name Resolution Behavior with ORDER BY COLLATE Clause in PostgreSQL 16.2
In PostgreSQL 16.2 running on aarch64-unknown-linux-musl, an unexpected error occurs when executing a SQL query that orders the results by a column with a specified collation (using the COLLATE clause) after performing a join. Specifically, when attempting to order the results by a column that exists in both tables involved in the join, the query fails with an "ambiguous column reference" error if the ORDER BY clause includes a COLLATE specification. This behavior diverges from the expected and standard SQL behavior, where the column reference in the ORDER BY clause should first resolve to the columns listed in the SELECT statement before considering the source tables.
### Versions used
PostgreSQL 16.2 on aarch64-unknown-linux-musl, compiled by gcc (Alpine 13.2.1_git20231014) 13.2.1 20231014, 64-bit
### Steps to reproduce
1.start `psql`
2.create two tables, having one element in common:
```sql
CREATE TABLE Foo (id SERIAL PRIMARY KEY, descr TEXT);
CREATE TABLE Bar (id SERIAL PRIMARY KEY, descr TEXT, foo_id INT);
-- optionally, insert some data
INSERT INTO BAR (DESCR, FOO_ID) VALUES ('Description for Bar 1', 1);
INSERT INTO FOO (DESCR) VALUES ('Description for Foo 1');
```
3.execute a query, which references a column of the query in an order by with a collation:
```sql
SELECT
BAR.descr as "descr"
from BAR left join FOO on BAR.foo_id = FOO.id
order by descr COLLATE "en-x-icu" ASC;
```
4.observe the error message:
```log
ERROR: column reference "descr" is ambiguous
LINE 1: ...BAR left join FOO on BAR.foo_id = FOO.id order by descr COLL...
```
5. If you remove the collate, the query will work.
### Expected behavior
In the ANSI sql standard, the reference in the order by should first be looked up in the queries columns, and then in the tables.
This works if we omit the "COLLATE" clause. I'd expect the lookup to work the same way with the "COLLATE" clause.
Re: Unexpected Name Resolution Behavior with ORDER BY COLLATE Clause in PostgreSQL 16.2
"Bender, Patrice" <patrice.bender@sap.com> writes: > In PostgreSQL 16.2 running on aarch64-unknown-linux-musl, an > unexpected error occurs when executing a SQL query that orders the > results by a column with a specified collation (using the COLLATE > clause) after performing a join. Specifically, when attempting to > order the results by a column that exists in both tables involved in > the join, the query fails with an "ambiguous column reference" error > if the ORDER BY clause includes a COLLATE specification. This > behavior diverges from the expected and standard SQL behavior, where > the column reference in the ORDER BY clause should first resolve to > the columns listed in the SELECT statement before considering the > source tables. This behavior appears to me to be consistent with our documentation [1]: Each expression can be the name or ordinal number of an output column (SELECT list item), or it can be an arbitrary expression formed from input-column values. ... If an ORDER BY expression is a simple name that matches both an output column name and an input column name, ORDER BY will interpret it as the output column name. This is the opposite of the choice that GROUP BY will make in the same situation. This inconsistency is made to be compatible with the SQL standard. As soon as you add COLLATE, it's not a simple name anymore but an expression, and so it is not considered to possibly match an output column name. As for the question of SQL standard compliance, please cite chapter and verse supporting your position. I believe that the spec changed between SQL92 and SQL99. The SQL92 standard said that ORDER BY items could be output column names or numbers (and nothing else). Later editions say that ORDER BY expressions are just like any others, ie names in them are *input* column names. Our interpretation is a bit of a mess but is intended to preserve some compatibility with SQL92, while following the modern spec in any case that isn't strictly SQL92 compliant. (I suppose the para quoted above should be modified to explain that a little more clearly. The compatibility section at the bottom of the page has the details, but this bit is kind of misleading by itself.) regards, tom lane [1] https://www.postgresql.org/docs/current/sql-select.html#SQL-ORDERBY
Re: Unexpected Name Resolution Behavior with ORDER BY COLLATE Clause in PostgreSQL 16.2
On Mon, 2024-03-25 at 15:52 +0000, Bender, Patrice wrote: > PostgreSQL16.2 > on aarch64-unknown-linux-musl, compiled by gcc (Alpine 13.2.1_git20231014)13.2.120231014,64-bit > > CREATE TABLE Foo (id SERIAL PRIMARY KEY, descr TEXT); > CREATE TABLE Bar (id SERIAL PRIMARY KEY, descr TEXT, foo_id INT); > -- optionally, insert some data > INSERT INTO BAR (DESCR, FOO_ID) VALUES ('Description for Bar 1', 1); > INSERT INTO FOO (DESCR) VALUES ('Description for Foo 1'); > > SELECT > BAR.descr as "descr" > from BAR left join FOO on BAR.foo_id = FOO.id > order by descr COLLATE "en-x-icu" ASC; > > ERROR: column reference "descr" is ambiguous > LINE 1: ...BAR left join FOO on BAR.foo_id = FOO.id order by descr COLL... > > 5.If you remove the collate, the query will work. > > ### Expected behavior > > In the ANSI sql > standard, the reference in the order by should first be looked up in the queries columns, and > then > in the tables. > This worksif we omit the "COLLATE" clause. I'd expect the lookup to work the same way with the "COLLATE" clause. This is not a bug. If you say "ORDER BY descr", you are referencing the result set column with that alias. If you say "ORDER BY descr COLLATE "en-x-icu"" or "ORDER BY descr || ' '" or anything else that is not a plain column reference, but an expression, "descr" is *not* understood to be a result set column, but a column of one of the involved tables, and that reference is ambiguous. There is little sense in quoting the SQL standard here, because as far as I can tell it only supports column names, no expressions, in the ORDER BY clause. Yours, Laurenz Albe