Thread: Cluster using tablespaces?

Cluster using tablespaces?

From

Rainer Bauer

Date:

28 November 2007, 18:48:00

Hello,

is there a way to instruct cluster to store the temporary created data on a
different tablespace (i.e. drive)? If not, wouldn't that have a decent
performance impact or is most of the time spend retrieving the data in index
order?

Rainer

Re: Cluster using tablespaces?

From

Erik Jones

Date:

28 November 2007, 18:59:45

On Nov 28, 2007, at 4:47 PM, Rainer Bauer wrote:

> Hello,
>
> is there a way to instruct cluster to store the temporary created
> data on a
> different tablespace (i.e. drive)? If not, wouldn't that have a decent
> performance impact or is most of the time spend retrieving the data
> in index
> order?
>
> Rainer

What temporary created data are you referring to?  Do you mean the
contents of the the other cluster directories such as pg_xlog?  If
so, there's no need.  Just make it a symlink to a directory on other
disks and you're done.  If that's not what you mean, can you explain
further?

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com

Re: Cluster using tablespaces?

From

"Martin Gainty"

Date:

28 November 2007, 19:12:14

start here
http://www.postgresql.org/docs/8.1/static/creating-cluster.html

M-
----- Original Message -----
From: "Erik Jones" <erik@myemma.com>
To: "Rainer Bauer" <usenet@munnin.com>
Cc: <pgsql-general@postgresql.org>
Sent: Wednesday, November 28, 2007 5:59 PM
Subject: Re: [GENERAL] Cluster using tablespaces?

On Nov 28, 2007, at 4:47 PM, Rainer Bauer wrote:

> Hello,
>
> is there a way to instruct cluster to store the temporary created
> data on a
> different tablespace (i.e. drive)? If not, wouldn't that have a decent
> performance impact or is most of the time spend retrieving the data
> in index
> order?
>
> Rainer

What temporary created data are you referring to?  Do you mean the
contents of the the other cluster directories such as pg_xlog?  If
so, there's no need.  Just make it a symlink to a directory on other
disks and you're done.  If that's not what you mean, can you explain
further?

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: Cluster using tablespaces?

From

Rainer Bauer

Date:

28 November 2007, 19:18:00

Erik Jones wrote:

>> is there a way to instruct cluster to store the temporary created
>> data on a
>> different tablespace (i.e. drive)? If not, wouldn't that have a decent
>> performance impact or is most of the time spend retrieving the data
>> in index
>> order?
>
>What temporary created data are you referring to?

The one described in the manual
<http://www.postgresql.org/docs/8.2/static/sql-cluster.html>:

"During the cluster operation, a temporary copy of the table is created that
contains the table data in the index order. Temporary copies of each index on
the table are created as well."

Rainer

Re: Cluster using tablespaces?

From

Tom Lane

Date:

28 November 2007, 19:30:30

Rainer Bauer <usenet@munnin.com> writes:
> Erik Jones wrote:
>> What temporary created data are you referring to?

> The one described in the manual
> <http://www.postgresql.org/docs/8.2/static/sql-cluster.html>:

> "During the cluster operation, a temporary copy of the table is created that
> contains the table data in the index order. Temporary copies of each index on
> the table are created as well."

That's probably a bit misleading.  There is no "temporary" copy of the
table, just the new permanent copy.  The document is trying to point out
to you that the transient disk space requirement will be 2X the table
size, but maybe we could phrase it better.

For btree indexes, there is a temporary copy of the index data, which
will go wherever you have arranged for temp files to go.  (I think that
easy user control of this may be new for 8.3, though.)

            regards, tom lane

Re: Cluster using tablespaces?

From

Rainer Bauer

Date:

28 November 2007, 20:18:15

Tom Lane wrote:

>Rainer Bauer <usenet@munnin.com> writes:
>
>> "During the cluster operation, a temporary copy of the table is created that
>> contains the table data in the index order. Temporary copies of each index on
>> the table are created as well."
>
>That's probably a bit misleading.  There is no "temporary" copy of the
>table, just the new permanent copy.  The document is trying to point out
>to you that the transient disk space requirement will be 2X the table
>size, but maybe we could phrase it better.

Ok, I expected that. Does this work:
ALTER TABLE foo CLUSTER ON index_name SET TABLESPACE new_tablespace

I.e. is the table moved to the other tablespace and clustered at the same time
or are these independant operations?

What I am trying to achieve is cutting down the time the cluster command
takes. I thought the most promising way would be if the new data is written to
different drive.

>For btree indexes, there is a temporary copy of the index data, which
>will go wherever you have arranged for temp files to go.  (I think that
>easy user control of this may be new for 8.3, though.)

Could you give me a hint where that would be on Windows? I guess this might be
worth a try since there are a couple of btree indexes in the database.

Rainer

Re: Cluster using tablespaces?

From

Alvaro Herrera

Date:

28 November 2007, 21:01:48

Rainer Bauer wrote:
> Tom Lane wrote:
>
> >Rainer Bauer <usenet@munnin.com> writes:
> >
> >> "During the cluster operation, a temporary copy of the table is created that
> >> contains the table data in the index order. Temporary copies of each index on
> >> the table are created as well."
> >
> >That's probably a bit misleading.  There is no "temporary" copy of the
> >table, just the new permanent copy.  The document is trying to point out
> >to you that the transient disk space requirement will be 2X the table
> >size, but maybe we could phrase it better.
>
> Ok, I expected that. Does this work:
> ALTER TABLE foo CLUSTER ON index_name SET TABLESPACE new_tablespace
>
> I.e. is the table moved to the other tablespace and clustered at the same time
> or are these independant operations?

No, the ALTER TABLE/CLUSTER ON only defines what index will the table be
clustered on in the future, but it doesn't cluster it at that time.
Perhaps it could be improved so that if a table rewrite is going to be
done anyway for some other reason, then make sure the rewrite uses the
cluster order.  I think it's far from trivial though.

> What I am trying to achieve is cutting down the time the cluster command
> takes. I thought the most promising way would be if the new data is written to
> different drive.

It has been theorized that cluster would be faster in general if instead
of doing an indexscan we would instead use a seqscan + sort step.  It
would be good to measure it.

> >For btree indexes, there is a temporary copy of the index data, which
> >will go wherever you have arranged for temp files to go.  (I think that
> >easy user control of this may be new for 8.3, though.)
>
> Could you give me a hint where that would be on Windows? I guess this might be
> worth a try since there are a couple of btree indexes in the database.

I think Tom is referring to the new temp_tablespaces config variable.

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/DXLWNGRJD34J
"We are who we choose to be", sang the goldfinch
when the sun is high (Sandman)

Btree indexes temp copy Was [Re: Cluster using tablespaces?]

From

Ow Mun Heng

Date:

28 November 2007, 22:03:12

On Wed, 2007-11-28 at 18:30 -0500, Tom Lane wrote:

> For btree indexes, there is a temporary copy of the index data, which
> will go wherever you have arranged for temp files to go.  (I think that
> easy user control of this may be new for 8.3, though.)

In 8.2.5 is there a way to control this? I noticed that when I create a
new index etc, it will hit the os disk (that's where the main tablespace
is located but no data is actually being stored there except for PG's
template1/0 etc table)

I would rather it hit the Raid Array. I looked at the postgres.conf file
but didn't see anything obvious to point out the temp location..

Re: Btree indexes temp copy Was [Re: Cluster using tablespaces?]

From

Tom Lane

Date:

29 November 2007, 01:25:54

Ow Mun Heng <Ow.Mun.Heng@wdc.com> writes:
> On Wed, 2007-11-28 at 18:30 -0500, Tom Lane wrote:
>> For btree indexes, there is a temporary copy of the index data, which
>> will go wherever you have arranged for temp files to go.  (I think that
>> easy user control of this may be new for 8.3, though.)

> In 8.2.5 is there a way to control this?

You can replace the pgsql_tmp subdirectory with a symlink to someplace.

This is outside the purview of the database, so for instance it won't
survive a dump/reload, but it can be a workable hack ...

            regards, tom lane

Re: Cluster using tablespaces?

From

Rainer Bauer

Date:

01 December 2007, 12:18:43

Alvaro Herrera wrote:

>Rainer Bauer wrote:
>> Ok, I expected that. Does this work:
>> ALTER TABLE foo CLUSTER ON index_name SET TABLESPACE new_tablespace
>>
>> I.e. is the table moved to the other tablespace and clustered at the same time
>> or are these independant operations?
>
>No, the ALTER TABLE/CLUSTER ON only defines what index will the table be
>clustered on in the future, but it doesn't cluster it at that time.
>Perhaps it could be improved so that if a table rewrite is going to be
>done anyway for some other reason, then make sure the rewrite uses the
>cluster order.  I think it's far from trivial though.

Yeah that is what I was originally looking for.

>> What I am trying to achieve is cutting down the time the cluster command
>> takes. I thought the most promising way would be if the new data is written to
>> different drive.
>
>It has been theorized that cluster would be faster in general if instead
>of doing an indexscan we would instead use a seqscan + sort step.  It
>would be good to measure it.

Could a reindex on the clustered index speed up the clustering (when executed
immediatelly before the cluster command)? As I understand it, this index is
used to fetch the table data in the correct order. Or is most of the time
spend fetching the table data?

Also, would it make sense to increase <shared_buffers> for the cluster
operation. This is set to 32MB here on my Windows box as was recommended.

From my questions you can see that I don't know how the clustering is working
internally. I.e. I don't have a concrete idea how to make cluster any faster.

>> >For btree indexes, there is a temporary copy of the index data, which
>> >will go wherever you have arranged for temp files to go.  (I think that
>> >easy user control of this may be new for 8.3, though.)
>>
>> Could you give me a hint where that would be on Windows? I guess this might be
>> worth a try since there are a couple of btree indexes in the database.
>
>I think Tom is referring to the new temp_tablespaces config variable.

I moved the pgsql_tmp directory to another disk, but that didn't speed up the
cluster command.

Rainer

Re: Cluster using tablespaces?

From

Alvaro Herrera

Date:

02 December 2007, 09:56:45

Rainer Bauer wrote:
> Alvaro Herrera wrote:

> >It has been theorized that cluster would be faster in general if instead
> >of doing an indexscan we would instead use a seqscan + sort step.  It
> >would be good to measure it.
>
> Could a reindex on the clustered index speed up the clustering (when executed
> immediatelly before the cluster command)? As I understand it, this index is
> used to fetch the table data in the correct order. Or is most of the time
> spend fetching the table data?

I haven't measured it, but my guess is that most of the time is in
fetching heap pages in random order.

> Also, would it make sense to increase <shared_buffers> for the cluster
> operation. This is set to 32MB here on my Windows box as was recommended.

Not sure.  In general yes, but on Windows things are different.


> >> >For btree indexes, there is a temporary copy of the index data, which
> >> >will go wherever you have arranged for temp files to go.  (I think that
> >> >easy user control of this may be new for 8.3, though.)
> >>
> >> Could you give me a hint where that would be on Windows? I guess this might be
> >> worth a try since there are a couple of btree indexes in the database.
> >
> >I think Tom is referring to the new temp_tablespaces config variable.
>
> I moved the pgsql_tmp directory to another disk, but that didn't speed up the
> cluster command.

Probably most of the time is going into creating the new table then.

If you are looking for a short-term solution to your problem, maybe the
best is to follow the recommendation on CLUSTER ref page:

    There is another way to cluster data. The CLUSTER command
    reorders the original table by scanning it using the index you
    specify. This can be slow on large tables because the rows are
    fetched from the table in index order, and if the table is
    disordered, the entries are on random pages, so there is one
    disk page retrieved for every row moved. (PostgreSQL has a
    cache, but the majority of a big table will not fit in the
    cache.) The other way to cluster a table is to use

    CREATE TABLE newtable AS
        SELECT * FROM table ORDER BY columnlist;

    which uses the PostgreSQL sorting code to produce the desired
    order; this is usually much faster than an index scan for
    disordered data. Then you drop the old table, use ALTER TABLE
    ... RENAME to rename newtable to the old name, and recreate the
    table's indexes. The big disadvantage of this approach is that
    it does not preserve OIDs, constraints, foreign key
    relationships, granted privileges, and other ancillary
    properties of the table — all such items must be manually
    recreated. Another disadvantage is that this way requires a sort
    temporary file about the same size as the table itself, so peak
    disk usage is about three times the table size instead of twice
    the table size.

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La experiencia nos dice que el hombre peló millones de veces las patatas,
pero era forzoso admitir la posibilidad de que en un caso entre millones,
las patatas pelarían al hombre" (Ijon Tichy)

Re: Cluster using tablespaces?

From

Rainer Bauer

Date:

02 December 2007, 17:18:58

Alvaro Herrera wrote:

> Alvaro Herrera wrote:
>Probably most of the time is going into creating the new table then.
>
>If you are looking for a short-term solution to your problem, maybe the
>best is to follow the recommendation on CLUSTER ref page:

I've read that section before, but I have lots of foreign key relationships
between the tables.

Thanks Alvaro and Tom, but it seems that I will have to live with that
behaviour, until ...

>No, the ALTER TABLE/CLUSTER ON only defines what index will the table be
>clustered on in the future, but it doesn't cluster it at that time.
>Perhaps it could be improved so that if a table rewrite is going to be
>done anyway for some other reason, then make sure the rewrite uses the
>cluster order.  I think it's far from trivial though.

... this has been tried.

Rainer