Thread: Recommended Initial Settings
I would like to get someone's recommendations on the best initial settings for a dedicated PostgreSQL server. I do realize that there are a lot of factors that influence how one should configure a database. I am just looking for a good starting point. Ideally I would like the database to reside as much as possible in memory with no disk access. My current 7.x version of PostgreSQL generates a 6 Gig file when doing a database dump.
Dedicated PostgreSQL 8.2 Server
Redhat Linux 4.x AS 64 bit version (EM64T)
4 Intel Xeon Processors
20 Gig Memory
Current PostgreSQL database is 6 Gig file when doing a database dump
/etc/sysctl.conf file settings:
# 11 Gig
kernel.shmmax = 11811160064
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
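For reference, after editing /etc/sysctl.conf I reload and sanity-check the kernel values with something like the following (shmmax should come back as the 11 Gig figure above):

sysctl -p                      # re-read /etc/sysctl.conf and apply it
cat /proc/sys/kernel/shmmax    # confirm the shared memory ceiling
ipcs -lm                       # the kernel's view of the shared memory limits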
postgresql.conf file settings (if not listed then I used the defaults):
max_connections = 300
shared_buffers = 10240MB
work_mem = 10MB
effective_cache_size = 512MB
maintenance_work_mem = 100MB
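To double-check what the running server actually picked up, I query the live values from psql along these lines (mydb is just a placeholder for the real database name):

psql -d mydb -c "SHOW shared_buffers;"
psql -d mydb -c "SELECT name, setting, unit FROM pg_settings
                 WHERE name IN ('shared_buffers', 'work_mem',
                                'effective_cache_size',
                                'maintenance_work_mem');"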
Any suggestions would be appreciated!
Thanks,
Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
Campbell, Lance wrote:
> I would like to get someone's recommendations on the best initial
> settings for a dedicated PostgreSQL server. I do realize that there are
> a lot of factors that influence how one should configure a database. I
> am just looking for a good starting point. Ideally I would like the
> database to reside as much as possible in memory with no disk access.
> My current 7.x version of PostgreSQL generates a 6 Gig file when doing
> a database dump.

Your operating system should be doing the caching for you.

> Dedicated PostgreSQL 8.2 Server
> Redhat Linux 4.x AS 64 bit version (EM64T)
> 4 Intel Xeon Processors

If these are older Xeons, check the mailing list archives for "xeon
context switch".

> 20 Gig Memory
> Current PostgreSQL database is 6 Gig file when doing a database dump

OK, so it's plausible the whole thing will fit in RAM (as a rule of
thumb I assume headers, indexes etc. triple or quadruple the size). To
know better, check the actual disk usage of $PGDATA.

> /etc/sysctl.conf file settings:
>
> # 11 Gig
> kernel.shmmax = 11811160064

Hmm - that's a lot of shared RAM. See shared_buffers below.

> kernel.sem = 250 32000 100 128
> net.ipv4.ip_local_port_range = 1024 65000
> net.core.rmem_default = 262144
> net.core.rmem_max = 262144
> net.core.wmem_default = 262144
> net.core.wmem_max = 262144
>
> postgresql.conf file settings (if not listed then I used the defaults):
>
> max_connections = 300

How many connections do you expect typically/peak? It doesn't cost much
to have max_connections set high, but your workload is the most
important thing missing from your question.

> shared_buffers = 10240MB

For 7.x that's probably way too big, but 8.x organises its buffers
better. I'd still be tempted to start at 1 or 2GB and work up - see
where it stops buying you an improvement.

> work_mem = 10MB

If you have large queries doing big sorts, I'd increase this. Don't
forget it's per sort, so if you have 300 connections live at any one
time, that could be 300 * 10MB * N if they're all doing something
complicated. If you only have one connection live, you can increase
this quite substantially.

> effective_cache_size = 512MB

This isn't setting PG's memory usage; it's telling PG how much data
your operating system is caching. Check "free" and see what it says.
For you, I'd expect 10GB+.

> maintenance_work_mem = 100MB

This is for admin-related tasks, so you could probably increase it.

Workload, workload, workload - we need to know what you're doing with
it. One connection summarising the entire database will want larger
numbers than 100 connections running many small queries.

HTH
--
  Richard Huxton
  Archonet Ltd
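In case a concrete check helps, something along these lines (assuming the PGDATA environment variable points at the data directory) gives both of the numbers Richard mentions - the on-disk size of the cluster and how much the kernel is currently caching:

du -sh "$PGDATA"    # actual disk usage of the data directory
free -m             # the "cached" column is roughly what effective_cache_size should reflect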
Richard,
Thanks for your reply.

You said:
"Your operating system should be doing the caching for you."

My understanding is that as long as Linux has memory available it will
cache files. Then from your comment I get the impression that, since
Linux would be caching the data files for the postgres database, it
would be redundant to have a large shared_buffers. Did I understand you
correctly?

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu

-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com]
Sent: Friday, February 23, 2007 10:29 AM
To: Campbell, Lance
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Recommended Initial Settings
Campbell, Lance wrote:
> Richard,
> Thanks for your reply.
>
> You said:
> "Your operating system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it will
> cache files. Then from your comment I get the impression that, since
> Linux would be caching the data files for the postgres database, it
> would be redundant to have a large shared_buffers. Did I understand you
> correctly?

That's right - PG works with the O.S. This means it *might* not be a
big advantage to have a large shared_buffers. On older versions of PG,
the buffer management code wasn't great with large shared_buffers
values either.

--
  Richard Huxton
  Archonet Ltd
If you're doing much updating at all you'll also want to bump up
checkpoint_segments. I like setting checkpoint_warning just a bit under
checkpoint_timeout as a way to monitor how often you're checkpointing
due to running out of segments.

With a large shared_buffers you'll likely need to make the bgwriter
more aggressive as well (increase the max_pages numbers), though how
important that is depends on how much updating you're doing. If you see
periodic spikes in I/O corresponding to checkpoints, that's an
indication the bgwriter isn't doing a good enough job.

If everything ends up in memory, it might be good to decrease
random_page_cost to 1 or something close to it, though the database
should just rely on effective_cache_size to figure out that everything
is in memory.

If you're on pre-8.2 and you're using autovacuum, you'll want to cut
all the autovacuum parameters in half.

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
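As a rough illustration of the settings discussed above (the numbers are assumptions to be adjusted against the actual write load, not tested recommendations), the corresponding postgresql.conf lines might look something like:

checkpoint_segments = 16       # 8.2 default is 3; raise it if updates are heavy
checkpoint_timeout = 300       # seconds; 300 is the 8.2 default
checkpoint_warning = 290       # just under checkpoint_timeout, per the monitoring trick above
bgwriter_lru_maxpages = 100    # the "max_pages" knobs; 8.2 defaults are 5
bgwriter_all_maxpages = 20     # ditto
random_page_cost = 1.5         # default is 4.0; nearer 1 if the data really is all cached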
In response to "Campbell, Lance" <lance@uiuc.edu>:
> Richard,
> Thanks for your reply.
>
> You said:
> "Your operating system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it will
> cache files. Then from your comment I get the impression that, since
> Linux would be caching the data files for the postgres database, it
> would be redundant to have a large shared_buffers. Did I understand you
> correctly?

Keep in mind that keeping the data in the kernel's buffer requires
Postgres to make a syscall to read a file, which the kernel then
realizes is cached in memory. The kernel then has to make that data
available to the Postgres (userland) process. If the data is in
Postgres' buffers, Postgres can fetch it directly, thus avoiding the
overhead of the syscalls and the kernel activity. You still have to
make SysV shared-memory calls, though.

So it depends on which is able to manage the memory better. Is the
kernel so much more efficient that it makes up for the overhead of the
syscalls? My understanding is that in recent versions of Postgres this
is not the case, and a large shared_buffers improves performance. I've
yet to do any in-depth testing on this, though.

--
Bill Moran
Collaborative Fusion Inc.