Thread: Cluster using tablespaces?
Hello, is there a way to instruct cluster to store the temporary created data on a different tablespace (i.e. drive)? If not, wouldn't that have a decent performance impact or is most of the time spend retrieving the data in index order? Rainer
On Nov 28, 2007, at 4:47 PM, Rainer Bauer wrote: > Hello, > > is there a way to instruct cluster to store the temporary created > data on a > different tablespace (i.e. drive)? If not, wouldn't that have a decent > performance impact or is most of the time spend retrieving the data > in index > order? > > Rainer What temporary created data are you referring to? Do you mean the contents of the the other cluster directories such as pg_xlog? If so, there's no need. Just make it a symlink to a directory on other disks and you're done. If that's not what you mean, can you explain further? Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
start here http://www.postgresql.org/docs/8.1/static/creating-cluster.html M- ----- Original Message ----- From: "Erik Jones" <erik@myemma.com> To: "Rainer Bauer" <usenet@munnin.com> Cc: <pgsql-general@postgresql.org> Sent: Wednesday, November 28, 2007 5:59 PM Subject: Re: [GENERAL] Cluster using tablespaces? On Nov 28, 2007, at 4:47 PM, Rainer Bauer wrote: > Hello, > > is there a way to instruct cluster to store the temporary created > data on a > different tablespace (i.e. drive)? If not, wouldn't that have a decent > performance impact or is most of the time spend retrieving the data > in index > order? > > Rainer What temporary created data are you referring to? Do you mean the contents of the the other cluster directories such as pg_xlog? If so, there's no need. Just make it a symlink to a directory on other disks and you're done. If that's not what you mean, can you explain further? Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Erik Jones wrote: >> is there a way to instruct cluster to store the temporary created >> data on a >> different tablespace (i.e. drive)? If not, wouldn't that have a decent >> performance impact or is most of the time spend retrieving the data >> in index >> order? > >What temporary created data are you referring to? The one described in the manual <http://www.postgresql.org/docs/8.2/static/sql-cluster.html>: "During the cluster operation, a temporary copy of the table is created that contains the table data in the index order. Temporary copies of each index on the table are created as well." Rainer
Rainer Bauer <usenet@munnin.com> writes: > Erik Jones wrote: >> What temporary created data are you referring to? > The one described in the manual > <http://www.postgresql.org/docs/8.2/static/sql-cluster.html>: > "During the cluster operation, a temporary copy of the table is created that > contains the table data in the index order. Temporary copies of each index on > the table are created as well." That's probably a bit misleading. There is no "temporary" copy of the table, just the new permanent copy. The document is trying to point out to you that the transient disk space requirement will be 2X the table size, but maybe we could phrase it better. For btree indexes, there is a temporary copy of the index data, which will go wherever you have arranged for temp files to go. (I think that easy user control of this may be new for 8.3, though.) regards, tom lane
Tom Lane wrote: >Rainer Bauer <usenet@munnin.com> writes: > >> "During the cluster operation, a temporary copy of the table is created that >> contains the table data in the index order. Temporary copies of each index on >> the table are created as well." > >That's probably a bit misleading. There is no "temporary" copy of the >table, just the new permanent copy. The document is trying to point out >to you that the transient disk space requirement will be 2X the table >size, but maybe we could phrase it better. Ok, I expected that. Does this work: ALTER TABLE foo CLUSTER ON index_name SET TABLESPACE new_tablespace I.e. is the table moved to the other tablespace and clustered at the same time or are these independant operations? What I am trying to achieve is cutting down the time the cluster command takes. I thought the most promising way would be if the new data is written to different drive. >For btree indexes, there is a temporary copy of the index data, which >will go wherever you have arranged for temp files to go. (I think that >easy user control of this may be new for 8.3, though.) Could you give me a hint where that would be on Windows? I guess this might be worth a try since there are a couple of btree indexes in the database. Rainer
Rainer Bauer wrote: > Tom Lane wrote: > > >Rainer Bauer <usenet@munnin.com> writes: > > > >> "During the cluster operation, a temporary copy of the table is created that > >> contains the table data in the index order. Temporary copies of each index on > >> the table are created as well." > > > >That's probably a bit misleading. There is no "temporary" copy of the > >table, just the new permanent copy. The document is trying to point out > >to you that the transient disk space requirement will be 2X the table > >size, but maybe we could phrase it better. > > Ok, I expected that. Does this work: > ALTER TABLE foo CLUSTER ON index_name SET TABLESPACE new_tablespace > > I.e. is the table moved to the other tablespace and clustered at the same time > or are these independant operations? No, the ALTER TABLE/CLUSTER ON only defines what index will the table be clustered on in the future, but it doesn't cluster it at that time. Perhaps it could be improved so that if a table rewrite is going to be done anyway for some other reason, then make sure the rewrite uses the cluster order. I think it's far from trivial though. > What I am trying to achieve is cutting down the time the cluster command > takes. I thought the most promising way would be if the new data is written to > different drive. It has been theorized that cluster would be faster in general if instead of doing an indexscan we would instead use a seqscan + sort step. It would be good to measure it. > >For btree indexes, there is a temporary copy of the index data, which > >will go wherever you have arranged for temp files to go. (I think that > >easy user control of this may be new for 8.3, though.) > > Could you give me a hint where that would be on Windows? I guess this might be > worth a try since there are a couple of btree indexes in the database. I think Tom is referring to the new temp_tablespaces config variable. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "We are who we choose to be", sang the goldfinch when the sun is high (Sandman)
On Wed, 2007-11-28 at 18:30 -0500, Tom Lane wrote: > For btree indexes, there is a temporary copy of the index data, which > will go wherever you have arranged for temp files to go. (I think that > easy user control of this may be new for 8.3, though.) In 8.2.5 is there a way to control this? I noticed that when I create a new index etc, it will hit the os disk (that's where the main tablespace is located but no data is actually being stored there except for PG's template1/0 etc table) I would rather it hit the Raid Array. I looked at the postgres.conf file but didn't see anything obvious to point out the temp location..
Ow Mun Heng <Ow.Mun.Heng@wdc.com> writes: > On Wed, 2007-11-28 at 18:30 -0500, Tom Lane wrote: >> For btree indexes, there is a temporary copy of the index data, which >> will go wherever you have arranged for temp files to go. (I think that >> easy user control of this may be new for 8.3, though.) > In 8.2.5 is there a way to control this? You can replace the pgsql_tmp subdirectory with a symlink to someplace. This is outside the purview of the database, so for instance it won't survive a dump/reload, but it can be a workable hack ... regards, tom lane
Alvaro Herrera wrote: >Rainer Bauer wrote: >> Ok, I expected that. Does this work: >> ALTER TABLE foo CLUSTER ON index_name SET TABLESPACE new_tablespace >> >> I.e. is the table moved to the other tablespace and clustered at the same time >> or are these independant operations? > >No, the ALTER TABLE/CLUSTER ON only defines what index will the table be >clustered on in the future, but it doesn't cluster it at that time. >Perhaps it could be improved so that if a table rewrite is going to be >done anyway for some other reason, then make sure the rewrite uses the >cluster order. I think it's far from trivial though. Yeah that is what I was originally looking for. >> What I am trying to achieve is cutting down the time the cluster command >> takes. I thought the most promising way would be if the new data is written to >> different drive. > >It has been theorized that cluster would be faster in general if instead >of doing an indexscan we would instead use a seqscan + sort step. It >would be good to measure it. Could a reindex on the clustered index speed up the clustering (when executed immediatelly before the cluster command)? As I understand it, this index is used to fetch the table data in the correct order. Or is most of the time spend fetching the table data? Also, would it make sense to increase <shared_buffers> for the cluster operation. This is set to 32MB here on my Windows box as was recommended. From my questions you can see that I don't know how the clustering is working internally. I.e. I don't have a concrete idea how to make cluster any faster. >> >For btree indexes, there is a temporary copy of the index data, which >> >will go wherever you have arranged for temp files to go. (I think that >> >easy user control of this may be new for 8.3, though.) >> >> Could you give me a hint where that would be on Windows? I guess this might be >> worth a try since there are a couple of btree indexes in the database. > >I think Tom is referring to the new temp_tablespaces config variable. I moved the pgsql_tmp directory to another disk, but that didn't speed up the cluster command. Rainer
Rainer Bauer wrote: > Alvaro Herrera wrote: > >It has been theorized that cluster would be faster in general if instead > >of doing an indexscan we would instead use a seqscan + sort step. It > >would be good to measure it. > > Could a reindex on the clustered index speed up the clustering (when executed > immediatelly before the cluster command)? As I understand it, this index is > used to fetch the table data in the correct order. Or is most of the time > spend fetching the table data? I haven't measured it, but my guess is that most of the time is in fetching heap pages in random order. > Also, would it make sense to increase <shared_buffers> for the cluster > operation. This is set to 32MB here on my Windows box as was recommended. Not sure. In general yes, but on Windows things are different. > >> >For btree indexes, there is a temporary copy of the index data, which > >> >will go wherever you have arranged for temp files to go. (I think that > >> >easy user control of this may be new for 8.3, though.) > >> > >> Could you give me a hint where that would be on Windows? I guess this might be > >> worth a try since there are a couple of btree indexes in the database. > > > >I think Tom is referring to the new temp_tablespaces config variable. > > I moved the pgsql_tmp directory to another disk, but that didn't speed up the > cluster command. Probably most of the time is going into creating the new table then. If you are looking for a short-term solution to your problem, maybe the best is to follow the recommendation on CLUSTER ref page: There is another way to cluster data. The CLUSTER command reorders the original table by scanning it using the index you specify. This can be slow on large tables because the rows are fetched from the table in index order, and if the table is disordered, the entries are on random pages, so there is one disk page retrieved for every row moved. (PostgreSQL has a cache, but the majority of a big table will not fit in the cache.) The other way to cluster a table is to use CREATE TABLE newtable AS SELECT * FROM table ORDER BY columnlist; which uses the PostgreSQL sorting code to produce the desired order; this is usually much faster than an index scan for disordered data. Then you drop the old table, use ALTER TABLE ... RENAME to rename newtable to the old name, and recreate the table's indexes. The big disadvantage of this approach is that it does not preserve OIDs, constraints, foreign key relationships, granted privileges, and other ancillary properties of the table — all such items must be manually recreated. Another disadvantage is that this way requires a sort temporary file about the same size as the table itself, so peak disk usage is about three times the table size instead of twice the table size. -- Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4 "La experiencia nos dice que el hombre peló millones de veces las patatas, pero era forzoso admitir la posibilidad de que en un caso entre millones, las patatas pelarían al hombre" (Ijon Tichy)
Alvaro Herrera wrote: > Alvaro Herrera wrote: >Probably most of the time is going into creating the new table then. > >If you are looking for a short-term solution to your problem, maybe the >best is to follow the recommendation on CLUSTER ref page: I've read that section before, but I have lots of foreign key relationships between the tables. Thanks Alvaro and Tom, but it seems that I will have to live with that behaviour, until ... >No, the ALTER TABLE/CLUSTER ON only defines what index will the table be >clustered on in the future, but it doesn't cluster it at that time. >Perhaps it could be improved so that if a table rewrite is going to be >done anyway for some other reason, then make sure the rewrite uses the >cluster order. I think it's far from trivial though. ... this has been tried. Rainer