Re: [PATCH] add CLUSTER table ORDER BY index - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: [PATCH] add CLUSTER table ORDER BY index
Date
Msg-id 460A30D4.7030803@enterprisedb.com
In response to Re: [PATCH] add CLUSTER table ORDER BY index  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCH] add CLUSTER table ORDER BY index
List pgsql-patches
Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> "Holger Schurig" <holgerschurig@gmx.de> writes:
>>> * psql tab-completion, it now favours "CLUSTER table ORDER BY index"
>
>> It occurs to me (sorry that I didn't think of this earlier) that if we're
>> going to use "ORDER BY" it really ought to take a list of columns.
>
> Surely you jest.  The point is to be ordered the same as the index, no?

There are a few narrow corner cases where it makes sense to CLUSTER
without an index:

* You're going to build an index with the same order after clustering.
It's cheaper to sort the data first and then create the index than to
build the index, sort the data, and rebuild the index.

* You're doing a lot of large sort + merge joins. Sorts are cheaper if
the data is already in order. One might ask, though, why don't you just
create an index then...

* You're using CLUSTER as a VACUUM FULL replacement, and there's no
handy index to sort with. (It'd be better if we had a VACUUM FULL that
rewrites the table like CLUSTER, though.)
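
To make the first case concrete, here is a rough sketch of "sort first,
then index" using plain SQL that exists today, not an index-less CLUSTER.
The table "orders" and column "customer_id" are made up for illustration,
and I'm assuming the copy ends up on disk in the sorted order, which is
what happens in practice but isn't something the system promises:

```sql
-- Hypothetical table "orders" and column "customer_id"; not from the patch.
-- Sort once while copying the data, then build the index a single time.
CREATE TABLE orders_sorted AS
    SELECT * FROM orders ORDER BY customer_id;

CREATE INDEX orders_sorted_customer_id_idx
    ON orders_sorted (customer_id);
```

Note that CREATE TABLE AS doesn't carry over constraints, defaults or
other indexes, so this is only a stand-in for what a CLUSTER without an
index would do in place.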

Though I don't think we're implementing "CLUSTER table ORDER BY col1,
col2" anytime soon, ORDER BY does imply that a list of columns is to
follow. How about "CLUSTER table USING index"?
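
For comparison, the three spellings side by side. "tablename" and
"indexname" are placeholders, and the last two are only proposals in
this thread, so treat this as a sketch of the syntax rather than
something the server accepts:

```sql
-- Current syntax: index name first.
CLUSTER indexname ON tablename;

-- As proposed in the patch under discussion.
CLUSTER tablename ORDER BY indexname;

-- Alternative suggested here: no implication that a column list follows.
CLUSTER tablename USING indexname;
```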

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
