weird performances problem - Mailing list pgsql-performance
From | Guillaume Smet |
---|---|
Subject | weird performances problem |
Date | |
Msg-id | 437CC21D.8070402@openwide.fr Whole thread Raw |
Responses |
Re: weird performances problem
|
List | pgsql-performance |
Hi all, We are operating a 1.5GB postgresql database for a year and we have problems for nearly a month. Usually everything is OK with the database, queries are executed fast even if they are complicated but sometimes and for half an hour, we have a general slow down. The server is a dedicated quad xeon with 4GB and a RAID1 array for the system and a RAID10 one for postgresql data. We have very few updates/inserts/deletes during the day. Postgresql version is 7.4.8. - the database is vacuumed, analyzed regularly (but we are not using autovacuum) and usually performs pretty well ; - IOs are OK, the database is entirely in RAM (see top.txt and iostat.txt attached) ; - CPUs are usually 25% idle, load is never really growing and its maximum is below 5 ; - I attached two plans for a simple query, the one is what we have when the server is fast, the other when we have a slow down: it's exactly the same plan but, as you can see it, the time to fetch the first row from indexes is quite high for the slow query ; - during this slow down, queries that usually take 500ms can take up to 60 seconds (and sometimes even more) ; - we have up to 130 permanent connections from our apache servers during this slow down as we have a lot of apache children waiting for a response. I attached a vmstat output. Context switches are quite high but I'm not sure it can be called a context switch storm and that this is the cause of our problem. Thanks for any advice or idea to help us understanding this problem and hopefully solve it. Regards, -- Guillaume [root@bd root]# iostat 10 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 7.20 0.00 92.00 0 920 sda1 0.00 0.00 0.00 0 0 sda2 6.40 0.00 78.40 0 784 sda3 0.00 0.00 0.00 0 0 sda4 0.00 0.00 0.00 0 0 sda5 0.00 0.00 0.00 0 0 sda6 0.80 0.00 13.60 0 136 sdb 5.00 0.00 165.60 0 1656 sdb1 5.00 0.00 165.60 0 1656 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 1.30 0.00 20.80 0 208 sda1 0.00 0.00 0.00 0 0 sda2 0.70 0.00 9.60 0 96 sda3 0.00 0.00 0.00 0 0 sda4 0.00 0.00 0.00 0 0 sda5 0.00 0.00 0.00 0 0 sda6 0.60 0.00 11.20 0 112 sdb 5.40 0.00 173.60 0 1736 sdb1 5.40 0.00 173.60 0 1736 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 2.20 0.00 28.00 0 280 sda1 0.00 0.00 0.00 0 0 sda2 2.20 0.00 28.00 0 280 sda3 0.00 0.00 0.00 0 0 sda4 0.00 0.00 0.00 0 0 sda5 0.00 0.00 0.00 0 0 sda6 0.00 0.00 0.00 0 0 sdb 5.20 0.00 171.20 0 1712 sdb1 5.20 0.00 171.20 0 1712 QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..13.52 rows=2 width=1119) (actual time=0.154..0.167 rows=1 loops=1) -> Index Scan using pk_newslang on newslang nl (cost=0.00..3.87 rows=1 width=1004) (actual time=0.053..0.055 rows=1loops=1) Index Cond: (((codelang)::text = 'FRA'::text) AND (3498704 = numnews)) -> Nested Loop Left Join (cost=0.00..9.61 rows=2 width=119) (actual time=0.050..0.059 rows=1 loops=1) -> Index Scan using pk_news on news n (cost=0.00..3.31 rows=2 width=98) (actual time=0.021..0.023 rows=1 loops=1) Index Cond: (numnews = 3498704) -> Index Scan using pk_photo on photo p (cost=0.00..3.14 rows=1 width=25) (actual time=0.021..0.025 rows=1 loops=1) Index Cond: (p.numphoto = "outer".numphoto) Total runtime: 0.320 ms (9 rows) QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Nested Loop (cost=0.00..13.52 rows=2 width=1119) (actual time=155.286..155.305 rows=1 loops=1) -> Index Scan using pk_newslang on newslang nl (cost=0.00..3.87 rows=1 width=1004) (actual time=44.575..44.579 rows=1loops=1) Index Cond: (((codelang)::text = 'FRA'::text) AND (3498704 = numnews)) -> Nested Loop Left Join (cost=0.00..9.61 rows=2 width=119) (actual time=110.648..110.660 rows=1 loops=1) -> Index Scan using pk_news on news n (cost=0.00..3.31 rows=2 width=98) (actual time=0.169..0.174 rows=1 loops=1) Index Cond: (numnews = 3498704) -> Index Scan using pk_photo on photo p (cost=0.00..3.14 rows=1 width=25) (actual time=110.451..110.454 rows=1loops=1) Index Cond: (p.numphoto = "outer".numphoto) Total runtime: 155.514 ms (9 rows) 17:08:41 up 19 days, 15:16, 1 user, load average: 4.03, 4.26, 4.36 288 processes: 285 sleeping, 3 running, 0 zombie, 0 stopped CPU states: cpu user nice system irq softirq iowait idle total 59.0% 0.0% 8.8% 0.2% 0.0% 0.0% 31.9% cpu00 52.3% 0.0% 13.3% 0.9% 0.0% 0.0% 33.3% cpu01 65.7% 0.0% 7.6% 0.0% 0.0% 0.0% 26.6% cpu02 58.0% 0.0% 7.6% 0.0% 0.0% 0.0% 34.2% cpu03 60.0% 0.0% 6.6% 0.0% 0.0% 0.0% 33.3% Mem: 3857224k av, 3495880k used, 361344k free, 0k shrd, 92160k buff 2374048k actv, 463576k in_d, 37708k in_c Swap: 4281272k av, 25412k used, 4255860k free 2173392k cached [root@bd root]# vmstat 10 procs memory swap io system cpu r b swpd free buff cache si so bi bo in cs us sy id wa 4 0 25412 250160 91736 2170944 0 1 3 3 4 3 5 3 3 0 7 0 25412 247160 91736 2171024 0 0 0 116 588 4483 65 4 31 0 3 0 25412 235456 91752 2171132 0 0 0 129 491 3670 70 4 26 0 5 0 25412 233696 91760 2171216 0 0 0 152 530 4768 61 4 34 0 5 0 25412 233248 91768 2171232 0 0 0 183 624 5379 59 5 36 0 9 0 25412 195332 91788 2171304 0 0 0 127 541 4811 58 5 37 0
pgsql-performance by date: