Thread: Slow Query / Check Point Segments
Hi
I am experiencing slow queries when I run some functions. I noticed the following entries in my server log.
From this, can anyone tell me if I need to change some config parameters?
System has 18GB Memory
shared_buffers = 4GB # min 128kB
temp_buffers = 32MB # min 800kB
max_prepared_transactions = 100 # zero disables the feature
work_mem = 256MB # min 64kB
maintenance_work_mem = 1GB # min 1MB
wal_buffers = 1024kB # min 32kB
checkpoint_segments = 32 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 30min # range 30s-1h
2010-01-22 12:18:11 JSTLOG: checkpoint complete: wrote 52037 buffers (9.9%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=67.441 s, sync=0.000 s, total=67.453 s
2010-01-22 12:21:48 JSTLOG: checkpoint complete: wrote 83874 buffers (16.0%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=138.040 s, sync=0.000 s, total=138.063 s
2010-01-22 12:23:32 JSTLOG: checkpoint complete: wrote 82856 buffers (15.8%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=18.740 s, sync=0.000 s, total=18.783 s
2010-01-22 12:24:26 JSTLOG: checkpoint complete: wrote 75145 buffers (14.3%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=12.129 s, sync=0.000 s, total=12.132 s
2010-01-22 12:25:30 JSTLOG: checkpoint complete: wrote 82108 buffers (15.7%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=10.619 s, sync=0.000 s, total=10.621 s
2010-01-22 12:28:03 JSTLOG: checkpoint complete: wrote 87349 buffers (16.7%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=82.190 s, sync=0.000 s, total=82.192 s
2010-01-22 12:30:02 JSTLOG: checkpoint complete: wrote 80797 buffers (15.4%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=78.198 s, sync=0.000 s, total=78.201 s
2010-01-22 12:32:03 JSTLOG: checkpoint complete: wrote 81365 buffers (15.5%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=75.968 s, sync=0.000 s, total=75.971 s
2010-01-22 12:33:32 JSTLOG: checkpoint complete: wrote 98258 buffers (18.7%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=52.156 s, sync=0.000 s, total=52.159 s
2010-01-22 12:34:51 JSTLOG: checkpoint complete: wrote 80089 buffers (15.3%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=53.385 s, sync=0.000 s, total=53.387 s
2010-01-22 12:36:08 JSTLOG: checkpoint complete: wrote 80819 buffers (15.4%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=51.476 s, sync=0.000 s, total=51.478 s
2010-01-22 13:01:54 JSTLOG: checkpoint complete: wrote 4892 buffers (0.9%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=494.868 s, sync=0.000 s, total=494.982 s
Thanks for any advice
Alex
On Thu, Jan 21, 2010 at 9:13 PM, Alex - <aintokyo@hotmail.com> wrote:
> Hi
> I am experiencing slow queries when I run some functions. I noticed the
> following entries in my server log.
> From this, can anyone tell me if I need to change some config parameters?
> System has 18GB Memory
> shared_buffers = 4GB # min 128kB
> temp_buffers = 32MB # min 800kB
> max_prepared_transactions = 100 # zero disables the feature
> work_mem = 256MB # min 64kB

That's really big work_mem. What's max_connections?

> maintenance_work_mem = 1GB # min 1MB

That's pretty big too.

> wal_buffers = 1024kB # min 32kB
> checkpoint_segments = 32 # in logfile segments, min 1, 16MB each
> checkpoint_timeout = 30min # range 30s-1h
>
> [checkpoint log entries from 12:18:11 through 12:36:08 quoted above]

Up to now it looks pretty normal. Then:

> 2010-01-22 13:01:54 JSTLOG: checkpoint complete: wrote 4892 buffers (0.9%);
> 0 transaction log file(s) added, 0 removed, 32 recycled; write=494.868 s,
> sync=0.000 s, total=494.982 s

Wow, that is a really long time for a checkpoint of only 0.9% to take. I'm gonna guess you went from a lot of inserts up above, then started running lots of queries that used lots of memory and/or vacuum chewed up a lot, and the OS swapped out your shared_buffers and they were getting swapped back in during the checkpoint. But that's just a guess; it could be other stuff. When this happens, what does the output of vmstat 60 look like (not just the first line, run it for a few minutes)?
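The reason the question matters: work_mem can be allocated per sort or hash operation in each backend, so the realistic worst case scales with the connection count. A quick, generic way to pull the values Scott is asking about from a running server (plain psql, nothing specific to this system):

-- Read the live values straight from pg_settings (any recent PostgreSQL).
select name, setting, unit
from pg_settings
where name in ('max_connections', 'work_mem',
               'maintenance_work_mem', 'shared_buffers');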
Oh yeah, what's your swappiness setting (assuming you're running some flavor of Linux)? sysctl -a | grep swapp should tell you. I set it to something small like 5 or so on db servers. The default of 60 is fine for an interactive desktop but usually too high for a server.
Alex - wrote:
> checkpoint_segments = 32 # in logfile segments, min 1, 16MB each
> checkpoint_timeout = 30min # range 30s-1h
These parameters are not so interesting on their own. The important thing to check is how often checkpoints are happening, and how much work each of them does. Here's a useful section from your logs to comment on:
> 2010-01-22 12:21:48 JSTLOG: checkpoint complete: wrote 83874 buffers (16.0%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=138.040 s, sync=0.000 s, total=138.063 s
> 2010-01-22 12:23:32 JSTLOG: checkpoint complete: wrote 82856 buffers (15.8%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=18.740 s, sync=0.000 s, total=18.783 s
> 2010-01-22 12:24:26 JSTLOG: checkpoint complete: wrote 75145 buffers (14.3%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=12.129 s, sync=0.000 s, total=12.132 s
> 2010-01-22 12:25:30 JSTLOG: checkpoint complete: wrote 82108 buffers (15.7%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=10.619 s, sync=0.000 s, total=10.621 s
> 2010-01-22 12:28:03 JSTLOG: checkpoint complete: wrote 87349 buffers (16.7%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=82.190 s, sync=0.000 s, total=82.192 s
Here you're getting a checkpoint every minute or three, and each of them is writing out ~80000 buffers = 625MB. That is crazy, and no wonder your queries are slow--the system is spending all of its time doing constant, extremely expensive checkpoints.
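(For anyone wondering where the 625MB figure comes from: PostgreSQL shared buffers are 8kB each by default, so it's simple arithmetic. Illustrative only, nothing you need to run:)

select 80000 * 8 / 1024.0 as mb_per_checkpoint;  -- 625 MB of dirty data per checkpoint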
You should re-tune this system until the checkpoints show up no more often than every 5 minutes, and preferably closer to 10. In your case, that means greatly increasing checkpoint_segments. And you might as well decrease checkpoint_timeout, because right now the timeout is rarely ever being reached before the system runs out of working segments and executes a checkpoint for that reason.
I would suggest changing the parameters to these:
checkpoint_segments = 96 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 10min # range 30s-1h
And seeing how long the interval between checkpoints becomes under load afterwards. That should put it in the just-over-5-minute range. I'd expect that the number of buffers will only increase a bit, so instead of dumping out 625MB every minute or three you should see maybe 800MB every 5 minutes--big improvement.
Also: just after making the change, save a copy of:
select now(),* from pg_stat_bgwriter
And then run that same query again a few hours after the change (and maybe the next day too). Subtract the two values to see how much they changed, and then you'll have some useful info to compute the checkpoint interval without even looking at the logs. There's a lot of theory and comments about this area at http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm
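One way to do that bookkeeping is to keep the snapshots in a small table and let SQL do the subtraction. This is only a sketch: the snapshot table name is made up, and it assumes the 8.3/8.4 pg_stat_bgwriter columns (checkpoints_timed, checkpoints_req).

-- Empty copy of the view's structure plus a timestamp column.
create table bgwriter_snapshot as
    select now() as snapped_at, * from pg_stat_bgwriter limit 0;

-- Take one snapshot now and another a few hours (or a day) later.
insert into bgwriter_snapshot select now(), * from pg_stat_bgwriter;

-- Checkpoints driven by the timeout vs. forced by running out of segments,
-- and the minutes elapsed between the oldest and newest snapshots.
select max(checkpoints_timed) - min(checkpoints_timed) as timed,
       max(checkpoints_req)   - min(checkpoints_req)   as requested,
       extract(epoch from max(snapped_at) - min(snapped_at)) / 60 as minutes_elapsed
from bgwriter_snapshot;

minutes_elapsed divided by the total number of checkpoints gives the average interval; if most of them land in the requested column, checkpoint_segments is still the limiting factor.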
> 2010-01-22 12:36:08 JSTLOG: checkpoint complete: wrote 80819 buffers (15.4%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=51.476 s, sync=0.000 s, total=51.478 s
> 2010-01-22 13:01:54 JSTLOG: checkpoint complete: wrote 4892 buffers (0.9%); 0 transaction log file(s) added, 0 removed, 32 recycled; write=494.868 s, sync=0.000 s, total=494.982 s
See that last one? That's a checkpoint that's being driven by time expiring (30 minutes) rather than running out of segments. That just suggests that the system was mostly idle during that period. Long write times are perfectly fine here, the checkpoints are spread out over a long time if possible in order to reduce average I/O. You're just not seeing that the rest of the time because checkpoints are happening so often.
--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support   greg@2ndQuadrant.com   www.2ndQuadrant.com
Greg,
Thanks for the suggestions, will try and test.
You wrote:

> That just suggests that the system was mostly idle during that period.

That's actually not the case; that was just when the function finished.
Alex
Greg Smith wrote:
>> [checkpoint log entries from 12:21 through 12:28 quoted]
>
> Here you're getting a checkpoint every minute or three, and each of
> them is writing out ~80000 buffers = 625MB. That is crazy, and no
> wonder your queries are slow--the system is spending all of its time
> doing constant, extremely expensive checkpoints.

We're having a similar problem with a very update-intensive database that is part of a 24/7 manufacturing operation (no breathing time unless there's an unusual line-down situation). A snip of the log...

2010-01-23 01:02:19 MYTLOG: checkpoint complete: wrote 764 buffers (0.3%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=31.207 s, sync=0.147 s, total=31.394 s
2010-01-23 01:08:13 MYTLOG: checkpoint complete: wrote 7535 buffers (2.9%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=86.510 s, sync=0.260 s, total=86.791 s
2010-01-23 01:13:22 MYTLOG: checkpoint complete: wrote 12786 buffers (4.9%); 0 transaction log file(s) added, 0 removed, 17 recycled; write=95.028 s, sync=0.135 s, total=95.182 s
2010-01-23 01:16:22 MYTLOG: checkpoint complete: wrote 11720 buffers (4.5%); 0 transaction log file(s) added, 0 removed, 19 recycled; write=41.854 s, sync=0.640 s, total=42.518 s
2010-01-23 01:19:11 MYTLOG: checkpoint complete: wrote 10436 buffers (4.0%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=50.330 s, sync=0.510 s, total=50.861 s
2010-01-23 01:23:19 MYTLOG: checkpoint complete: wrote 6446 buffers (2.5%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=57.195 s, sync=0.583 s, total=57.790 s
2010-01-23 01:27:54 MYTLOG: checkpoint complete: wrote 4454 buffers (1.7%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=44.805 s, sync=0.118 s, total=44.943 s
2010-01-23 01:32:36 MYTLOG: checkpoint complete: wrote 4645 buffers (1.8%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=50.792 s, sync=0.689 s, total=51.502 s
2010-01-23 01:37:13 MYTLOG: checkpoint complete: wrote 5722 buffers (2.2%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=58.077 s, sync=0.606 s, total=58.707 s
2010-01-23 01:41:20 MYTLOG: checkpoint complete: wrote 5006 buffers (1.9%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=36.064 s, sync=0.898 s, total=36.986 s
2010-01-23 01:46:14 MYTLOG: checkpoint complete: wrote 3644 buffers (1.4%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=54.873 s, sync=0.659 s, total=55.554 s
2010-01-23 01:51:15 MYTLOG: checkpoint complete: wrote 5534 buffers (2.1%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=63.152 s, sync=0.374 s, total=63.546 s

The admin tells me the current parameters are...

checkpoint_segments = 20 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min # range 30s-1h

and I'm suggesting 60 and 15min; does that seem reasonable?

This is running on a 16-core UltraSPARC IV server, Solaris 10, ZFS, tons of disks on a SAN with multiple tablespaces on different ZFS filesystems, with the block sizes tuned appropriately.
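For what it's worth, an easy way to double-check what the running server actually has for these settings (this just reads pg_settings; checkpoint_completion_target exists on 8.3 and later):

select name, setting, unit
from pg_settings
where name in ('checkpoint_segments', 'checkpoint_timeout',
               'checkpoint_completion_target');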
John R Pierce wrote:
> We're having a similar problem with a very update-intensive database
> that is part of a 24/7 manufacturing operation (no breathing time
> unless there's an unusual line-down situation).

Your problem is actually a bit different.

> 2010-01-23 01:13:22 MYTLOG: checkpoint complete: wrote 12786 buffers (4.9%); 0 transaction log file(s) added, 0 removed, 17 recycled; write=95.028 s, sync=0.135 s, total=95.182 s
> 2010-01-23 01:16:22 MYTLOG: checkpoint complete: wrote 11720 buffers (4.5%); 0 transaction log file(s) added, 0 removed, 19 recycled; write=41.854 s, sync=0.640 s, total=42.518 s

Your interval between checkpoints isn't too bad--3 minutes on the worst of these. And the amount per checkpoint isn't that high either: 12786 buffers = 100MB. Your problem is that the sync on that 100MB write is taking an awfully long time. Spacing your checkpoints out further would help, sure, but I wouldn't be surprised to find that doesn't do much for you. Worth a shot, unlikely to hurt anything, just not optimistic about it being the right thing the way I was on the other message.

> This is running on a 16-core UltraSPARC IV server, Solaris 10, ZFS,
> tons of disks on a SAN with multiple tablespaces on different ZFS
> filesystems, with the block sizes tuned appropriately.

My guess is that there's something wrong with your config such that writes followed by fsync are taking longer than they should. When I see "sync=0.640 s" into a SAN where that sync operation should be near instant, I'd be looking for issues in the ZFS intent log setup, how much data you've set up ZFS to write cache, and the write cache policy on the SAN hardware. There's something weird going on here; that sync should be near instant on your hardware with such a small write volume.

Unfortunately, you may have to take the server down to find out exactly what's going on, which doesn't sound practical for your environment. (And people wonder why I have near-religious zeal about testing disk hardware before systems go into production.)

--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support   greg@2ndQuadrant.com   www.2ndQuadrant.com
Greg Smith wrote:
> My guess is that there's something wrong with your config such that
> writes followed by fsync are taking longer than they should. [...]
> Unfortunately, you may have to take the server down to find out
> exactly what's going on, which doesn't sound practical for your
> environment.

There's a further complication: this system is deep in Southeast Asia on a rather slow and saturated network connection, such that trying to figure out stuff about it is challenging at best.

I know the database has a lot of write volume overall, and it's only one of several databases running in different zones on the server. I know nothing about the SAN; I suspect it's an EMC Symmetrix of some sort, probably a generation or two behind latest. The operations people are used to running large Oracle databases.
John R Pierce wrote:
> I know the database has a lot of write volume overall, and it's only
> one of several databases running in different zones on the server. I
> know nothing about the SAN; I suspect it's an EMC Symmetrix of some
> sort, probably a generation or two behind latest. The operations
> people are used to running large Oracle databases.

One thing you might try is making the PostgreSQL install act more like an Oracle one in terms of how it does writes. By default, PostgreSQL does its WAL writes by writing and then calling a sync method. On Solaris, you should be able to safely change this in the postgresql.conf file to be:

wal_sync_method = open_datasync

Which I don't think is the default (you can confirm with "show wal_sync_method;" via psql on your database). That will use O_DSYNC writes, which are more like how Oracle approaches this and therefore potentially a better-tuned path for your install. More on this subject, including ideas for further tweaking:

http://www.postgresql.org/docs/8.4/static/runtime-config-wal.html
http://blogs.sun.com/jkshah/entry/postgresql_wal_sync_method_and
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm

From what you've shown and described, I'm not sure what other PostgreSQL tuning you might do to improve the specific symptoms you're seeing. The particular issue you've got I'd normally start attacking on the filesystem and hardware side of things. There's a reason why many people avoid SANs in this context: there are a lot of ways you can screw up this particular aspect of their performance relative to what you get with direct attached storage, and it's hard to tell which you've run into when a problem does pop up.

--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support   greg@2ndQuadrant.com   www.2ndQuadrant.com
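As a small aside, the same check can be done through pg_settings, which also shows where the current value came from and (on 8.4 and later, via enumvals) which options the build accepts. A sketch, nothing specific to this installation:

show wal_sync_method;

-- enumvals is 8.4+; drop that column on 8.3.
select name, setting, source, enumvals
from pg_settings
where name in ('wal_sync_method', 'fsync');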
Greg Smith wrote:
> One thing you might try is making the PostgreSQL install act more like
> an Oracle one in terms of how it does writes. By default, PostgreSQL
> does its WAL writes by writing and then calling a sync method. On
> Solaris, you should be able to safely change this in the
> postgresql.conf file to be:

iostat -x 5 or 15 shows a bunch of LUNs are as much as 10% busy, but the service time stays under 50ms... this is one of the busiest of the 15-second samples over a 10-minute period. I'm just showing a representative sampling of the busiest LUNs out of 40; the physical storage is all raid10's on a DMX4.

                 extended device statistics
device    r/s    w/s   kr/s    kw/s  wait  actv  svc_t  %w  %b
sd1       0.0    0.0    0.0     0.0   0.0   0.0    0.0   0   0
...
ssd10     0.0   33.2    0.0   546.6   0.0   1.0   28.9   0   7
ssd11     0.0   27.7    0.0   573.9   0.0   0.9   34.0   0   6
ssd12     0.0   56.2    0.0   576.6   0.0   1.5   26.5   0  10
ssd13     0.0   30.8    0.0   505.5   0.0   0.9   28.7   0   7
ssd14     0.0   42.9    0.0   498.4   0.0   1.4   32.9   0   9
ssd15     0.0   38.3    0.0   557.6   0.0   1.7   44.0   0   8
ssd16     0.0   41.1    0.0   520.5   0.0   1.3   32.0   0   9
ssd17     0.0   52.4    0.0   528.6   0.0   1.3   24.8   0  10
ssd18     0.0   29.0    0.0   503.9   0.0   1.2   41.4   0   7
...

Most show near 0% busy and a tenth of that volume of writes or reads.

zpool iostat 15 also shows this I/O peak for about a minute or two every few minutes, but it's nowhere near hardware capacity:

$ zpool iostat 15
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G    109    386  1.16M  6.98M
data-p2      340M  33.2G      0      2    491  36.8K
----------  -----  -----  -----  -----  -----  -----
..............
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     40      6   527K   164K
data-p2      341M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     36  1.35K   306K  19.2M
data-p2      341M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     45    585   445K  2.01M
data-p2      341M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     47     38   410K  1.05M
data-p2      341M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     68    145   745K  4.11M
data-p2      341M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     38    168   311K  4.60M
data-p2      340M  33.2G      0      9      0   294K
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     55     65   504K  1.18M
data-p2      340M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G      5  1.07K  43.1K  9.71M
data-p2      340M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
data-p1      268G   101G     46      7   549K   179K
data-p2      340M  33.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
^C

> wal_sync_method = open_datasync
>
> Which I don't think is the default (you can confirm with "show
> wal_sync_method;" via psql on your database). That will use O_DSYNC
> writes, which are more like how Oracle approaches this and therefore
> potentially a better-tuned path for your install.

That's what it's set to now.

> More on this subject, including ideas for further tweaking:
>
> http://www.postgresql.org/docs/8.4/static/runtime-config-wal.html
> http://blogs.sun.com/jkshah/entry/postgresql_wal_sync_method_and
> http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm
>
> From what you've shown and described, I'm not sure what other
> PostgreSQL tuning you might do to improve the specific symptoms you're
> seeing. The particular issue you've got I'd normally start attacking
> on the filesystem and hardware side of things. There's a reason why
> many people avoid SANs in this context: there are a lot of ways you can
> screw up this particular aspect of their performance relative to what
> you get with direct attached storage, and it's hard to tell which
> you've run into when a problem does pop up.

The SAN is a DMX4, and it seems to be damn fast by all low-level tests I can run. This database is really getting hammered on. A couple of the tables take 8-12 hours to vacuum. These are tables with many millions of small rows that are updated randomly at high speed, using a rather large primary key (the PK is 6 or 8 short text fields, the 'data' is a couple of counters). The performance got a lot better when we set these tables to a fill factor of 50, but the table size doubled (duh), which apparently produced its own problems, so they are trying fill factor 70. There's probably 100 connections.

CPU and VM usage for a couple of minutes...

$ vmstat 15
 kthr      memory            page            disk          faults      cpu
 r b w   swap    free   re  mf  pi po fr de sr s1 sd sd sd   in    sy    cs us sy id
 2 0 0 30980016 1555616 20 260  40  0  0  0  0 -0  4  4  5 2231 21989 11168 25  4 72
 0 0 0 30765664 1227536 57 368 185  0  0  0  0  0  8 13  5 1366 64055 48764 34  7 59
 2 0 0 30757576 1218784  0  12   0  0  0  0  0  0  8  8  9 7137 82432 58548 53 12 36
 3 0 0 30750528 1211736  0  15   0  0  0  0  0  0 39 44 35 2959 85558 64763 40 10 51
 0 0 0 30745624 1206832  0   9   0  0  0  0  0  0  0  0  1 1253 65697 51775 32  7 62
 0 0 0 30739896 1201104 21 127  97  0  0  0  0  0  3  2  5 1271 67217 52826 34  7 59
 1 0 0 30729832 1191592 32 296   0  0  0  0  0  0  0  1  2 5030 82152 53824 45 12 43
 3 0 0 30722528 1184328 28 239   0  0  0  0  0  0 11 10 15 2201 87922 66675 38 10 52
 1 0 0 30720800 1181984  0  12   0  0  0  0  0  0 36 34 60 1970 68696 52730 38  8 54
 0 0 0 30716152 1177320 34 248 259  0  0  0  0  0  3  3 10 1139 71591 55853 31  8 61
 3 0 0 30708216 1169312  0  13   0  0  0  0  0  0  6  6  6 9114 90349 60455 56 15 29
 5 0 0 30700448 1161544  0  15   0  0  0  0  0  0 29 29 36 3196 61634 34290 47 11 41
 0 0 0 30694528 1155624  0  10   0  0  0  0  0  0 35 35 32 1343 62273 48045 33  7 60
 0 0 0 30686688 1148592 36 282 171  0  0  0  0  0  2  2  5 1068 75579 59492 30  8 62

I'm thinking it's ZFS tuning we need to do, not physical disk I/O, and that's a topic for another list, I guess. I know quite a lot has been done: using different ZFS block sizes for different tablespaces, putting index and data in different tablespaces, and so forth.
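A side note on the fill factor experiments John mentions: changing fillfactor on an existing table only affects pages written after the change, so the table doesn't see the benefit until it is rewritten. A sketch with made-up table and index names:

-- Hypothetical table; fillfactor 70 leaves ~30% free space per heap page
-- so updates can land on the same page (HOT) instead of bloating the index.
alter table line_counters set (fillfactor = 70);
-- Existing pages keep their old packing until the table is rewritten,
-- e.g. by CLUSTER or VACUUM FULL (both take heavy locks).
cluster line_counters using line_counters_pkey;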
Hi,
Please have a look at the functions also - what are they doing? In Oracle I have come across a situation where a lot of redo log was generated because one SQL was updating an entire table, instead of a few selected records. Since the new data was the same as the old data (for records which shouldn't have been updated in the first place), the functionality of the application was fine. But it was generating a lot of redo log (similar to WAL in PostgreSQL) and causing serious performance issues because of the redo generation, buffer getting filled up quickly and so on.
Jayadevan
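Jayadevan's point applies directly to PostgreSQL as well: an UPDATE that sets a column to the value it already has still creates a new row version and WAL. A minimal sketch of the usual guard against such no-op updates (table and column names here are made up):

-- IS DISTINCT FROM is NULL-safe, so rows whose value would not actually
-- change are skipped entirely and generate no dead tuple and no WAL.
update order_status
   set status = 'SHIPPED'
 where order_id = 12345
   and status is distinct from 'SHIPPED';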