Re: SELECT DISTINCT ON removes results - Mailing list pgsql-general

From David G. Johnston
Subject Re: SELECT DISTINCT ON removes results
Date
Msg-id CAKFQuwZP0nFa1BuaZYD64V3kvXJET_7D6kwtJmdW8QR-p5uuxg@mail.gmail.com
Whole thread Raw
In response to SELECT DISTINCT ON removes results  (Guyren Howe <guyren@gmail.com>)
Responses Re: SELECT DISTINCT ON removes results
Re: SELECT DISTINCT ON removes results
List pgsql-general
On Fri, Oct 28, 2016 at 1:39 PM, Guyren Howe <guyren@gmail.com> wrote:
Using 9.5, this query:

SELECT o.id,
a.number AS awb
FROM pt.orders o
LEFT JOIN (
SELECT DISTINCT ON ((string_agg(air_way_bills.number::text, ','::text)))
string_agg(air_way_bills.number::text, ','::text) AS number,
air_way_bills.order_id
FROM pt.air_way_bills
GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
gives me null for awb. Removing the DISTINCT ON clause:

SELECT o.id,
a.number AS awb
FROM pt.orders o
LEFT JOIN (
SELECT string_agg(air_way_bills.number::text, ','::text) AS number,
air_way_bills.order_id
FROM pt.air_way_bills
GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
where o.id = 2792;
gives me an awb. I'm confused about how this can be.


​On its face the statement "DISTINCT ON removes results" is not at all surprising given its definition.

Given a self-contained query exhibiting the desired behavior I might be willing to figure out and explain exactly why its happening in that particular circumstance.

Oh, and on its face your DISTINCT ON query doesn't make any sense to me.  Using DISTINCT ON on one column but then joining on the discard-able ID column is...unusual.

And it also lacks an ORDER BY for deterministic discarding of duplicate rows.

David J.

pgsql-general by date:

Previous
From: Guyren Howe
Date:
Subject: SELECT DISTINCT ON removes results
Next
From: Steve Clark
Date:
Subject: Re: deadlock error - version 8.4 on CentOS 6