Thread: DISTINCT -> GROUP BY
Folks, I've noticed that queries of the form SELECT DISTNCT foo, bar, baz FROM quux WHERE ... perform significantly worse than the equivalent using GROUP BY. SELECT foo, bar, baz FROM quux WHERE ... GROUP BY foo, bar, baz Where would I start looking in order to make them actually equivalent from the planner's point of view? Also, would back-patching this make sense? It doesn't change any APIs, but it does make some queries go faster. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Hello David http://www.postgresql.org/docs/faqs.TODO.html Consider using hash buckets to do DISTINCT, rather than sorting This would be beneficial when there are few distinct values. This is already used by GROUP BY. Regards Pavel Stehule 2008/6/3 David Fetter <david@fetter.org>: > Folks, > > I've noticed that queries of the form > > SELECT DISTNCT foo, bar, baz > FROM quux > WHERE ... > > perform significantly worse than the equivalent using GROUP BY. > > SELECT foo, bar, baz > FROM quux > WHERE ... > GROUP BY foo, bar, baz > > Where would I start looking in order to make them actually equivalent > from the planner's point of view? Also, would back-patching this make > sense? It doesn't change any APIs, but it does make some queries go > faster. > > Cheers, > David. > -- > David Fetter <david@fetter.org> http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fetter@gmail.com > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
On Tue, Jun 03, 2008 at 03:36:44PM +0200, Pavel Stehule wrote: > Hello David > > http://www.postgresql.org/docs/faqs.TODO.html > > Consider using hash buckets to do DISTINCT, rather than sorting This > would be beneficial when there are few distinct values. This is > already used by GROUP BY. It's nice to see that this is kinda on the TODO, but it doesn't address the question I asked, which is, "how would I get the planner to rewrite DISTINCTs as the equivalent GROUP BYs?" :) Any hints? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
2008/6/3 David Fetter <david@fetter.org>: > On Tue, Jun 03, 2008 at 03:36:44PM +0200, Pavel Stehule wrote: >> Hello David >> >> http://www.postgresql.org/docs/faqs.TODO.html >> >> Consider using hash buckets to do DISTINCT, rather than sorting This >> would be beneficial when there are few distinct values. This is >> already used by GROUP BY. > > It's nice to see that this is kinda on the TODO, but it doesn't > address the question I asked, which is, "how would I get the planner > to rewrite DISTINCTs as the equivalent GROUP BYs?" :) you can't to do it :( Pavel > > Any hints? > > Cheers, > David. > -- > David Fetter <david@fetter.org> http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fetter@gmail.com > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate >
On Jun 3, 2008, at 9:03 AM, David Fetter wrote: > On Tue, Jun 03, 2008 at 03:36:44PM +0200, Pavel Stehule wrote: >> Hello David >> >> http://www.postgresql.org/docs/faqs.TODO.html >> >> Consider using hash buckets to do DISTINCT, rather than sorting This >> would be beneficial when there are few distinct values. This is >> already used by GROUP BY. > > It's nice to see that this is kinda on the TODO, but it doesn't > address the question I asked, which is, "how would I get the planner > to rewrite DISTINCTs as the equivalent GROUP BYs?" :) IIRC, the patch to allow GROUP BY to do hashing is fairly recent... you could probably find it in archives. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828