Thread: Recommended Initial Settings
I would like to get someone's recommendations on the best initial settings for a dedicated PostgreSQL server. I do realize that there are a lot of factors that influence how one should configure a database. I am just looking for a good starting point. Ideally I would like the database to reside as much as possible in memory with no disk access. My current 7.x version of PostgreSQL generates a 6 Gig file when doing a database dump.
Dedicated PostgreSQL 8.2 Server
Redhat Linux 4.x AS 64 bit version (EM64T)
4 Intel Xeon Processors
20 Gig Memory
Current PostgreSQL database is 6 Gig file when doing a database dump
/etc/sysctl.conf file settings:
# 11 Gig
kernel.shmmax = 11811160064
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
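For reference, after editing /etc/sysctl.conf I reload and sanity-check the kernel values with something like the following (shmmax should come back as the 11 Gig figure above):

sysctl -p                      # re-read /etc/sysctl.conf and apply it
cat /proc/sys/kernel/shmmax    # confirm the shared memory ceiling
ipcs -lm                       # the kernel's view of the shared memory limits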
postgresql.conf file settings (if not listed then I used the defaults):
max_connections = 300
shared_buffers = 10240MB
work_mem = 10MB
effective_cache_size = 512MB
maintenance_work_mem = 100MB
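To double-check what the running server actually picked up, I query the live values from psql along these lines (mydb is just a placeholder for the real database name):

psql -d mydb -c "SHOW shared_buffers;"
psql -d mydb -c "SELECT name, setting, unit FROM pg_settings
                 WHERE name IN ('shared_buffers', 'work_mem',
                                'effective_cache_size',
                                'maintenance_work_mem');"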
Any suggestions would be appreciated!
Thanks,
Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
Campbell, Lance wrote:
> I would like to get someone's recommendations on the best initial
> settings for a dedicated PostgreSQL server. I do realize that there are
> a lot of factors that influence how one should configure a database. I
> am just looking for a good starting point. Ideally I would like the
> database to reside as much as possible in memory with no disk access.
> My current 7.x version of PostgreSQL generates a 6 Gig file when doing
> a database dump.

Your operating system should be doing the caching for you.

> Dedicated PostgreSQL 8.2 Server
> Redhat Linux 4.x AS 64 bit version (EM64T)
> 4 Intel Xeon Processors

If these are older Xeons, check the mailing list archives for "xeon
context switch".

> 20 Gig Memory
> Current PostgreSQL database is 6 Gig file when doing a database dump

OK, so it's plausible the whole thing will fit in RAM (as a rule of
thumb I assume headers, indexes etc. triple or quadruple the size). To
know better, check the actual disk usage of $PGDATA.

> /etc/sysctl.conf file settings:
>
> # 11 Gig
> kernel.shmmax = 11811160064

Hmm - that's a lot of shared RAM. See shared_buffers below.

> kernel.sem = 250 32000 100 128
> net.ipv4.ip_local_port_range = 1024 65000
> net.core.rmem_default = 262144
> net.core.rmem_max = 262144
> net.core.wmem_default = 262144
> net.core.wmem_max = 262144
>
> postgresql.conf file settings (if not listed then I used the defaults):
>
> max_connections = 300

How many connections do you expect typically/peak? It doesn't cost much
to have max_connections set high, but your workload is the most
important thing missing from your question.

> shared_buffers = 10240MB

For 7.x that's probably way too big, but 8.x organises its buffers
better. I'd still be tempted to start at 1 or 2GB and work up - see
where it stops buying you an improvement.

> work_mem = 10MB

If you have large queries doing big sorts, I'd increase this. Don't
forget it's per sort, so if you have 300 connections live at any one
time, that could be 300 * 10MB * N if they're all doing something
complicated. If you only have one connection live, you can increase
this quite substantially.

> effective_cache_size = 512MB

This isn't setting PG's memory usage; it's telling PG how much data
your operating system is caching. Check "free" and see what it says.
For you, I'd expect 10GB+.

> maintenance_work_mem = 100MB

This is for admin-related tasks, so you could probably increase it.

Workload, workload, workload - we need to know what you're doing with
it. One connection summarising the entire database will want larger
numbers than 100 connections running many small queries.

HTH
--
  Richard Huxton
  Archonet Ltd
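In case a concrete check helps, something along these lines (assuming the PGDATA environment variable points at the data directory) gives both of the numbers Richard mentions - the on-disk size of the cluster and how much the kernel is currently caching:

du -sh "$PGDATA"    # actual disk usage of the data directory
free -m             # the "cached" column is roughly what effective_cache_size should reflect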
Richard,
Thanks for your reply.

You said:
"Your operating system should be doing the caching for you."

My understanding is that as long as Linux has memory available it will
cache files. Then from your comment I get the impression that, since
Linux would be caching the data files for the postgres database, it
would be redundant to have a large shared_buffers. Did I understand you
correctly?

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu

-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com]
Sent: Friday, February 23, 2007 10:29 AM
To: Campbell, Lance
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Recommended Initial Settings
Campbell, Lance wrote:
> Richard,
> Thanks for your reply.
>
> You said:
> "Your operating system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it will
> cache files. Then from your comment I get the impression that, since
> Linux would be caching the data files for the postgres database, it
> would be redundant to have a large shared_buffers. Did I understand you
> correctly?

That's right - PG works with the O.S. This means it *might* not be a
big advantage to have a large shared_buffers. On older versions of PG,
the buffer management code wasn't great with large shared_buffers
values either.

--
  Richard Huxton
  Archonet Ltd
If you're doing much updating at all you'll also want to bump up
checkpoint_segments. I like setting checkpoint_warning just a bit under
checkpoint_timeout as a way to monitor how often you're checkpointing
due to running out of segments.

With a large shared_buffers you'll likely need to make the bgwriter
more aggressive as well (increase the max_pages numbers), though how
important that is depends on how much updating you're doing. If you see
periodic spikes in I/O corresponding to checkpoints, that's an
indication the bgwriter isn't doing a good enough job.

If everything ends up in memory, it might be good to decrease
random_page_cost to 1 or something close to it, though the database
should just rely on effective_cache_size to figure out that everything
is in memory.

If you're on pre-8.2 and you're using autovacuum, you'll want to cut
all the autovacuum parameters in half.

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
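As a rough illustration of the settings discussed above (the numbers are assumptions to be adjusted against the actual write load, not tested recommendations), the corresponding postgresql.conf lines might look something like:

checkpoint_segments = 16       # 8.2 default is 3; raise it if updates are heavy
checkpoint_timeout = 300       # seconds; 300 is the 8.2 default
checkpoint_warning = 290       # just under checkpoint_timeout, per the monitoring trick above
bgwriter_lru_maxpages = 100    # the "max_pages" knobs; 8.2 defaults are 5
bgwriter_all_maxpages = 20     # ditto
random_page_cost = 1.5         # default is 4.0; nearer 1 if the data really is all cached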
In response to "Campbell, Lance" <lance@uiuc.edu>:
> Richard,
> Thanks for your reply.
>
> You said:
> "Your operating system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it will
> cache files. Then from your comment I get the impression that, since
> Linux would be caching the data files for the postgres database, it
> would be redundant to have a large shared_buffers. Did I understand you
> correctly?

Keep in mind that keeping the data in the kernel's buffer requires
Postgres to make a syscall to read a file, which the kernel then
realizes is cached in memory. The kernel then has to make that data
available to the Postgres (userland) process. If the data is in
Postgres' buffers, Postgres can fetch it directly, thus avoiding the
overhead of the syscalls and the kernel activity. You still have to
make SysV shared-memory calls, though.

So it depends on which is able to manage the memory better. Is the
kernel so much more efficient that it makes up for the overhead of the
syscalls? My understanding is that in recent versions of Postgres this
is not the case, and a large shared_buffers improves performance. I've
yet to do any in-depth testing on this, though.

--
Bill Moran
Collaborative Fusion Inc.