Re: Status of DISTINCT-by-hashing work - Mailing list pgsql-hackers

From Asko Oja
Subject Re: Status of DISTINCT-by-hashing work
Msg-id ecd779860808050835g1107b22ar73a77e6b09c2b6fb@mail.gmail.com
In response to Status of DISTINCT-by-hashing work  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Sounds very much like the 80/20 story: the 80% that was easy to do is done, and what is left is the 20% that is complex and where progress is slow. Sounds very familiar from the comment in plan cache invalidation :)

On Tue, Aug 5, 2008 at 5:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I've pretty much finished the project I got a bee in my bonnet about
last week, which is to teach SELECT DISTINCT how to (optionally) use
hashing for grouping in the same way that GROUP BY has been able to do
for a while.
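
For readers unfamiliar with the two strategies, here is a minimal sketch of duplicate elimination by sorting versus by hashing. It is plain Python, not PostgreSQL executor code; the function names `distinct_by_sorting` and `distinct_by_hashing` and the sample rows are made up for illustration.

```python
from typing import Hashable, Iterable, List


def distinct_by_sorting(rows: Iterable[Hashable]) -> List[Hashable]:
    """Sort the input, then drop every row equal to the one before it."""
    result: List[Hashable] = []
    for row in sorted(rows):
        if not result or result[-1] != row:
            result.append(row)
    return result


def distinct_by_hashing(rows: Iterable[Hashable]) -> List[Hashable]:
    """Keep the first occurrence of each row, tracked in a hash set; no sort."""
    seen = set()
    result: List[Hashable] = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


rows = [(2, "b"), (1, "a"), (2, "b"), (3, "c"), (1, "a")]
print(distinct_by_sorting(rows))   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(distinct_by_hashing(rows))   # [(2, 'b'), (1, 'a'), (3, 'c')]
```

Both functions return the same set of rows, but in this sketch only the sort-based one returns them in sorted order; the hashed one needs a single pass and memory proportional to the number of distinct rows.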

There are still two places in the system that hard-wire the use of
sorting for duplicate elimination:

* Set operations (UNION/INTERSECT/EXCEPT)

* Aggregate functions with DISTINCT
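
To make the first item concrete, this is a rough sketch of how the duplicate-eliminating forms of UNION, INTERSECT and EXCEPT could be computed with hash sets instead of a sort. It is an assumption-laden illustration in Python, not a description of how the PostgreSQL executor does or will do it; the helper names are hypothetical, and the ALL variants are not covered.

```python
from typing import Hashable, Iterable, List


def hashed_union(left: Iterable[Hashable], right: Iterable[Hashable]) -> List[Hashable]:
    """UNION: every row seen on either side, each emitted once."""
    seen = set()
    out: List[Hashable] = []
    for side in (left, right):
        for row in side:
            if row not in seen:
                seen.add(row)
                out.append(row)
    return out


def hashed_intersect(left: Iterable[Hashable], right: Iterable[Hashable]) -> List[Hashable]:
    """INTERSECT: rows of the left input that also occur in the right input."""
    right_rows = set(right)   # hash the right input once
    emitted = set()
    out: List[Hashable] = []
    for row in left:
        if row in right_rows and row not in emitted:
            emitted.add(row)
            out.append(row)
    return out


def hashed_except(left: Iterable[Hashable], right: Iterable[Hashable]) -> List[Hashable]:
    """EXCEPT: rows of the left input that do not occur in the right input."""
    right_rows = set(right)
    emitted = set()
    out: List[Hashable] = []
    for row in left:
        if row not in right_rows and row not in emitted:
            emitted.add(row)
            out.append(row)
    return out


left, right = [1, 2, 2, 3], [2, 3, 3, 4]
print(hashed_union(left, right))      # [1, 2, 3, 4]
print(hashed_intersect(left, right))  # [2, 3]
print(hashed_except(left, right))     # [1]
```

None of the three needs its inputs sorted; the cost is holding a hash table of distinct rows in memory.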

I'm thinking of trying to fix set operations before I leave this topic,
but I'm not sure it's worth the trouble to change DISTINCT aggregates.
They'd be a lot more work (since there's no executor infrastructure
in place that could be used) and the return on investment seems low.

Comments?

                       regards, tom lane
