Thread: Re: performance enhancements for PostgreSQL: update
--thanks for the reply all
--this is what i have so far:
--[snip]
> In an effort to enhance / streamline performance, I've done
> the following:
>
> * memory upgrade from 512M to 1G
> * move RAID5 to scsi drives (10K RPM)
> * set up cron script to vacuum database weekly
> * set number of client connects ( i.e., /usr/bin/postmaster -i -B 128 -N 64
> -d 4 )
--in the startup script, i launch / end postgres this way:
--[snip from startup]
su -l postgres -s /bin/sh -c "/usr/bin/pg_ctl -D $PGDATA -o '-i -B 128 -N 64 -d 2' \
-p /usr/bin/postmaster start >/dev/null "
--and--
su -l postgres -s /bin/sh -c "/usr/bin/pg_ctl stop -D $PGDATA -s -m fast" > /dev/null 2>&1
--[/snip]
--i have been trying to use the postgres.conf file but i haven't
--had much success with it. will put that up higher in the priority
--food chain.
--can someone explain what the benefits for that (allocating
--more buffer blocks) will be? rather, can someone post a link
--to the docs that explains why it would benefit me?
You could probably allocate WAY more buffer blocks than that. I run 4000
on most of my medium weight machines, and with a gig of ram you could get
away with quite a bit more, but you'll need to increads shmmax and shmall
to go very high. But I don't think that's your major problem.
--working on it now ... had to take the '-d 4' part of my start up options
--down to a '-d 2' ... the system was thinking / writing for too long just
--to test a simple query ...
More important, don't start the postmaster that way. Edit the
$PGDATA/postgresql.conf file, then use the pg_ctl command to start and
stop it.
> But I'm at the point now that I can't kill some jobs. Yes, I know
> I shouldn't use 'kill' in any forceful way, but just a kill seems to do
> nothing (or, if it is doing something, it's not fast enough
> for the user community and it's stopping production).
--i've seen it first hand why it's not good, but, i got this not
--too long ago:
--[excerpt from old email]
Use kill -9. Do a:
killall -9 postgres
killall -9 postmaster
The tip message
'Don't kill -9 the postmaster'
is old and was added when havoc could be caused by an old
postgres backend process. Consider the scenrio:
1. postmaster started
2. postgres started (Session #1)
3. postmaster killed (-9)
4. postmaster restarted
5. postgres started (Session #2)
Now there isn't any synchronization between Sesison #1 and
Session #2 at all, which would lead to data corruption. This
scenario was fixed a long time ago (7.1?). The whole 'Don't kill
-9 the postmaster' comment was actually a tongue-in-cheek remark
by me regarding a parallel discussion of RedHat init scripts.
The corruption possibility has long-since been fixed. Since I've
seen FUD claiming PostgreSQL doesn't have sufficient
crash-recovery because of the tip, I suggest the tip be changed to:
'Feel free to kill -9 the postmaster'
--[/excerpt]
--is this true? perhaps someone can verify this.
You can kill individual backends pretty safely, it's the postmaster you
can't kill -9 safely. If you see a postgres child running away with all
your memory etc... you can kill -9 that pid pretty safely.
> I got this from a co-worker:
>
> [snip]
>
> the server is blocking on access to the metadata tables. not even
> logins are being processed. I'm not sure what caused the
> problem, but I think a database restart is the best course,
> which I have been trying to do. Do not kill -9, as it will corrupt the
> WAL.
--yup. tried that and nothing changed for about 20 minutes or so ... so
--i had to reboot the server (i *hate* doing that).
If you want to shut down the server and it doesn't seem to respond to
pg_ctl stop, try 'pg_ctl -m fast stop' and see if that works.
--[snip raid errors]
--this is what the /proc/mdstat says
--[snip from /proc/mdstats]
Personalities :
read_ahead not set
unused devices: <none>
--[/snip from /proc/mdstats]
That looks like a dead drive in your RAID array. What does 'cat
/proc/mdstat' say about the drive
--i'm doing a linux software raid
(are you using linux software raid, or a hardware controller?)
> * have can I figure out how access to the metadata tables
> are being stopped? (my guess is the error on the scsi drive, but ... )
Sounds like processes are hanging, and users are just trying to
reconnect over and over and you're running out of connections.
This is a symptom, not the problem, which is your machine is
having issues.
[snip rest of email]
--i'm thinking more and more that while someone was trying to
--update / insert data into a table the scsi disk stopped writing
--(or just gave a lot of errors while writing). the thing is, *some* people
--could do work (albeit very little).
--at any rate, i'll have to investigate more later (probably the
--weekend) and stress test the array.
--also, i hope to get more messages in the log file with a
--better debug level to help track the problem.
--thanks again!
-X
On Mon, 18 Nov 2002, Johnson, Shaunn wrote: > su -l postgres -s /bin/sh -c "/usr/bin/pg_ctl -D $PGDATA -o '-i -B 128 -N 64 > -d 2' \ > -p /usr/bin/postmaster start >/dev/null " Doing it that way, you should be able to leave out the parts from -o on and get it to use the postgresql.conf file settings. > --is this true? perhaps someone can verify this. Generally speaking, while it still considered bad practice (or at least somewhat "rude" :-) to kill -9 the postmaster, it isn't particularly dangerous. I've done it while heavily testing a database (pgbench -c 100 -t 5000) and it never once corrupted my database. > --this is what the /proc/mdstat says > > --[snip from /proc/mdstats] > > Personalities : > read_ahead not set > unused devices: <none> > > --[/snip from /proc/mdstats] > > That looks like a dead drive in your RAID array. What does 'cat > /proc/mdstat' say about the drive > > --i'm doing a linux software raid Is there anymore to the /proc/mdstat entry? normally you should have a line that has a bit that names the drive partitions in the software raid and has a bit showing which drives are online that looks something like this: [UUUU] for all drives being UP. And like this: [UU_U] where the underscore shows a missing drive. If you don't have a line like that then Linux isn't doing the software RAID array. Could it be that you aren't actually running RAID but think you are? Just wondering.
Actually pg_ctl -D $PGDATA is redundant.... -D datadir Specifies the file system location of the database files. If this is omitted, the environment variable PGDATA is used. In fact if $PGDATA is for some reasong not loaded, you'll paint yourself to a bad spot where -D -o will be regarded as -o is the datadir location. scott.marlowe wrote: >On Mon, 18 Nov 2002, Johnson, Shaunn wrote: > > > >>su -l postgres -s /bin/sh -c "/usr/bin/pg_ctl -D $PGDATA -o '-i -B 128 -N 64 >>-d 2' \ >>-p /usr/bin/postmaster start >/dev/null " >> >> > >Doing it that way, you should be able to leave out the parts from -o on >and get it to use the postgresql.conf file settings. > > > >>--is this true? perhaps someone can verify this. >> >> > >Generally speaking, while it still considered bad practice (or at least >somewhat "rude" :-) to kill -9 the postmaster, it isn't particularly >dangerous. I've done it while heavily testing a database (pgbench -c 100 >-t 5000) and it never once corrupted my database. > > > >>--this is what the /proc/mdstat says >> >>--[snip from /proc/mdstats] >> >>Personalities : >>read_ahead not set >>unused devices: <none> >> >>--[/snip from /proc/mdstats] >> >>That looks like a dead drive in your RAID array. What does 'cat >>/proc/mdstat' say about the drive >> >>--i'm doing a linux software raid >> >> > >Is there anymore to the /proc/mdstat entry? normally you should have a >line that has a bit that names the drive partitions in the software raid >and has a bit showing which drives are online that looks something like >this: [UUUU] for all drives being UP. And like this: [UU_U] where the >underscore shows a missing drive. If you don't have a line like that >then Linux isn't doing the software RAID array. Could it be that you >aren't actually running RAID but think you are? Just wondering. > > >---------------------------(end of broadcast)--------------------------- >TIP 4: Don't 'kill -9' the postmaster > >