Thread: Storage location of temporary files
Hi list, I want to optimize the performance of our PostgreSQL 8.2 server. Up to now the server has a raid1 where the whole database is located (including tha WAL files). We will now move the database to a raid5 (which should be faster than the raid1) and will also move the WAL to a separate disk (which should, according to the docs, also increase the performance). But I see the temporary files as another important performance bottleneck. From the docs (chapter 17.4.1): work_mem (integer) Specifies the amount of memory to be used by internal sort operations and hash tables before switching to temporary disk files. [...] We have rather complex queries and as I far as I see from the disk usage patterns the system makes use of temporary disk files. So I would like to use a faster disk for these temporary files, too, but I could not find where the temporary files are located. Is there a separate directory? I have found a "pgsql_tmp" directory inside of the database directories ("base/<oid>/pgsql_tmp"). Is this what I'm looking for? Thanks for your help! Regards, Christian -- Deriva GmbH Tel.: +49 551 489500-42 Financial IT and Consulting Fax: +49 551 489500-91 Hans-Böckler-Straße 2 http://www.deriva.de D-37079 Göttingen Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer
Christian Schröder wrote: > So I would like to use a faster disk for these temporary files, too, > but I could not find where the temporary files are located. Is there a > separate directory? I have found a "pgsql_tmp" directory inside of the > database directories ("base/<oid>/pgsql_tmp"). Is this what I'm > looking for? Just one addition: I have found the following in the 8.3 docs: temp_tablespaces (string) This variable specifies tablespace(s) in which to create temporary objects (temp tables and indexes on temp tables) when a CREATE command does not explicitly specify a tablespace. Temporary files for purposes such as sorting large data sets are also created in these tablespace(s). So my problem seems to have been addressed in the 8.3 release. Maybe we can upgrade our database, but until that happens I will need another solution, so my question remains ... Regards, Christian -- Deriva GmbH Tel.: +49 551 489500-42 Financial IT and Consulting Fax: +49 551 489500-91 Hans-Böckler-Straße 2 http://www.deriva.de D-37079 Göttingen Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer
On Fri, Oct 31, 2008 at 09:01:29AM +0100, Christian Schrrrder wrote: > So I would like > to use a faster disk for these temporary files, too, but I could not > find where the temporary files are located. Is there a separate > directory? I have found a "pgsql_tmp" directory inside of the database > directories ("base/<oid>/pgsql_tmp"). Is this what I'm looking for? Yes, I believe it's accepted practice to replace this directory with a symlink. I've done this before and it's behaved as I'd expect. Sam
I too have used a symlink for some time (years) to put temp onto dedicated disks without any problems. I am not sure if 8.3 is different but I symlink the directory: base/pgsql_tmp Aaron Thul http://www.chasingnuts.com On Fri, Oct 31, 2008 at 8:11 AM, Sam Mason <sam@samason.me.uk> wrote: > On Fri, Oct 31, 2008 at 09:01:29AM +0100, Christian Schrrrder wrote: >> So I would like >> to use a faster disk for these temporary files, too, but I could not >> find where the temporary files are located. Is there a separate >> directory? I have found a "pgsql_tmp" directory inside of the database >> directories ("base/<oid>/pgsql_tmp"). Is this what I'm looking for? > > Yes, I believe it's accepted practice to replace this directory with a > symlink. I've done this before and it's behaved as I'd expect. > > > Sam > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On 2008-10-31 09:01, Christian Schröder wrote: > We will now move the database to a raid5 > (which should be faster than the raid1) This is wrong. RAID5 is slower than RAID1. You should go for RAID1+0 for fast and reliable storage. Or RAID0 for even faster but unreliable. Regards Tometzky -- ...although Eating Honey was a very good thing to do, there was a moment just before you began to eat it which was better than when you were... Winnie the Pooh
Tomasz Ostrowski wrote: > This is wrong. RAID5 is slower than RAID1. > You should go for RAID1+0 for fast and reliable storage. Or RAID0 for > even faster but unreliable. > I did not find a clear statement about this. I agree that RAID10 would be better than RAID5, but in some situations RAID5 at least seems to be faster than RAID1. If I have 5 disks available, how should I use them to get best performance without the risk of severe data loss? If I use 4 of the disks to build a RAID10 then I will have only 1 remaining drive, e.g. to put the pgsql_tmp directories there. In this scenario I would not have the WAL on a separate disk. Or should I use 3 disks to build a RAID5, 1 disk for tempspace and 1 disk for WAL? How important is data integrity for the WAL? If the WAL disk fails, can this corrupt my data? Or would I just lose the data after the last checkpoint? Or maybe I should use 2 disks as RAID1 for the database, 2 disks as RAID1 for the WAL and the remaining disk for the tempspace? Regards, Christian -- Deriva GmbH Tel.: +49 551 489500-42 Financial IT and Consulting Fax: +49 551 489500-91 Hans-Böckler-Straße 2 http://www.deriva.de D-37079 Göttingen Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer
On 2008-11-05 08:13, Christian Schröder wrote: > If I have 5 disks available, how should I use them to get best > performance without the risk of severe data loss? What percentage of your usage are writes? What do you need the most: high throughput or minimal latency? > How important is data integrity for the WAL? If the WAL > disk fails, can this corrupt my data? Or would I just lose the data > after the last checkpoint? It is important. With corrupted WAL IMHO your database will not get up. > Or maybe I should use 2 disks as RAID1 for the database, 2 disks as > RAID1 for the WAL and the remaining disk for the tempspace? I'd go for 3 disks as RAID1 for database, OS and tempspace, 2 disks as RAID1 for WAL. The reasoning is that on a dedicated server OS will rarely write or read. A commit will only sync WAL, and checkpoints can be smoothed so they will not cripple reads. Reads by separate clients will not wait for each other, as they can use 3 disks concurrently, so read latency will be very good. If you do not have a disk controller with battery backed cache remember to disable hardware write cache on all disks ("hdparm -W 0 /dev/sd[a,b,c,d]" for ATA on Linux). And use smartd for periodic checking health of drives (for example in /etc/smartd.conf for ATA on Linux): /dev/sda -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03) /dev/sdb -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03) /dev/sdc -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03) /dev/sdd -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03) Regards Tometzky -- ...although Eating Honey was a very good thing to do, there was a moment just before you began to eat it which was better than when you were... Winnie the Pooh
On Wed, Nov 05, 2008 at 08:13:10AM +0100, Christian Schröder wrote: > Tomasz Ostrowski wrote: > >This is wrong. RAID5 is slower than RAID1. > >You should go for RAID1+0 for fast and reliable storage. Or RAID0 for > >even faster but unreliable. > > > I did not find a clear statement about this. I agree that RAID10 would > be better than RAID5, but in some situations RAID5 at least seems to be > faster than RAID1. The basic problem is that RAID5 has a checksum disk. So if you update a block, you need to update the checksum. No matter how you do it you need to read one or more of the parallel blocks. Clever disk controllers can reduce the cost, but not eliminate it. None of RAID 0, 1 or 10 have this problem. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Attachment
2008/11/5 Christian Schröder <cs@deriva.de>: > Tomasz Ostrowski wrote: >> >> This is wrong. RAID5 is slower than RAID1. >> You should go for RAID1+0 for fast and reliable storage. Or RAID0 for >> even faster but unreliable. >> > > I did not find a clear statement about this. I agree that RAID10 would be > better than RAID5, but in some situations RAID5 at least seems to be faster > than RAID1. For certain read heavy loads RAID-5 will beat RAID-1 handily. After all, from a read only perspective, a healthy RAID-5 with n disks is equal to a healthy RAID-0 with n-1 disks. However, writes are much more costly on RAID-5 > If I have 5 disks available, how should I use them to get best performance > without the risk of severe data loss? If I use 4 of the disks to build a > RAID10 then I will have only 1 remaining drive, e.g. to put the pgsql_tmp > directories there. In this scenario I would not have the WAL on a separate > disk. How you use your disks depends very much on your RAID controller and your workload. On a fast controller with battery backed cache, I'd normally create a 4 disk RAID-10 with one disk as a hot spare. If you don't have a battery backed caching controller, then you'd probably be better off with two 2 disk RAID-1 arrays and the 5th drive as a hot spare. Note that a 4 disk RAID-10 or 2 2 disk RAID-1 with a hot spare is much more reliable than a 5 disk RAID-5. > Or should I use 3 disks to build a RAID5, 1 disk for tempspace and 1 disk > for WAL? How important is data integrity for the WAL? If the WAL disk fails, > can this corrupt my data? Or would I just lose the data after the last > checkpoint? If your data is important, then you will put it on redundant arrays of independent disks (RAID). WAL is as important as any other part of the db. It should be on a mirror set at a minimum. > Or maybe I should use 2 disks as RAID1 for the database, 2 disks as RAID1 > for the WAL and the remaining disk for the tempspace? That could work too. But I like having a hot spare in case something dies.
"Scott Marlowe" <scott.marlowe@gmail.com> writes: > 2008/11/5 Christian Schröder <cs@deriva.de>: >> Tomasz Ostrowski wrote: >>> >>> This is wrong. RAID5 is slower than RAID1. >>> You should go for RAID1+0 for fast and reliable storage. Or RAID0 for >>> even faster but unreliable. >>> >> >> I did not find a clear statement about this. I agree that RAID10 would be >> better than RAID5, but in some situations RAID5 at least seems to be faster >> than RAID1. > > For certain read heavy loads RAID-5 will beat RAID-1 handily. After > all, from a read only perspective, a healthy RAID-5 with n disks is > equal to a healthy RAID-0 with n-1 disks. Uhm, and for a read-heavy load a RAID-1 or RAID 1+0 array with n disks is equal to a healthy RAID-0 with n disks. RAID-5 should never beat any combination of RAID-0 and RAID-1 with the same number of drives at read performance. It's advantage is that you get more capacity. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
On Wed, Nov 5, 2008 at 8:22 PM, Gregory Stark <stark@enterprisedb.com> wrote: > > "Scott Marlowe" <scott.marlowe@gmail.com> writes: > >> 2008/11/5 Christian Schröder <cs@deriva.de>: >>> Tomasz Ostrowski wrote: >>>> >>>> This is wrong. RAID5 is slower than RAID1. >>>> You should go for RAID1+0 for fast and reliable storage. Or RAID0 for >>>> even faster but unreliable. >>>> >>> >>> I did not find a clear statement about this. I agree that RAID10 would be >>> better than RAID5, but in some situations RAID5 at least seems to be faster >>> than RAID1. >> >> For certain read heavy loads RAID-5 will beat RAID-1 handily. After >> all, from a read only perspective, a healthy RAID-5 with n disks is >> equal to a healthy RAID-0 with n-1 disks. > > Uhm, and for a read-heavy load a RAID-1 or RAID 1+0 array with n disks is > equal to a healthy RAID-0 with n disks. Don't know what testing you've done, but very very few RAID-1 / RAID-10 setups can equal a RAID-0 setup of the same number of disks. > RAID-5 should never beat any combination of RAID-0 and RAID-1 with the same > number of drives at read performance. It's advantage is that you get more > capacity. Of course it won't beat a RAID-0 with the same number, but a good controller is just one disk behind a RAID-0 and RAID-1 controllers usually don't aggregate RAID-1 reads, but do allow multiple readers to hit different disks for better concurrent access. But that's not what I was talking about.