StrategyGetBuffer optimization, take 2 - Mailing list pgsql-hackers

From: Merlin Moncure
Subject: StrategyGetBuffer optimization, take 2
Date:
Msg-id: CAHyXU0wEhbTk6eGcw4fEiR4ZdaRSKp7CBSRWp_6qJE1QV1yPcg@mail.gmail.com
Responses: Re: StrategyGetBuffer optimization, take 2 (4 replies)
List: pgsql-hackers
My $company recently acquired another Postgres-based $company and migrated all their server operations into our datacenter. Upon completing the move, the newly migrated database server started experiencing huge load spikes.

*) Environment description:
Postgres 9.2.4
RHEL 6
32 cores, virtualized (ESX) but with a dedicated host
256GB ram
shared_buffers: 32G
96 application servers configured to max 5 connections each
very fast i/o
database size: ~200GB
HS/SR: 3 slaves

*) Problem description:
The server normally hums along nicely with load < 1.0 and no iowait -- in fact the server is massively over-provisioned. However, on a semi-random basis (once every 1-2 days) load absolutely goes through the roof to 600+, with no iowait and 90-100% cpu (70%+ sys). It hangs around like that for 5-20 minutes, then resolves as suddenly as it started. There is nothing interesting going on application side (except that the application servers are all piling on), but pg_locks is recording lots of contention on relation extension locks. One interesting point is that the slaves are also affected, but the high load hits them some seconds after it hits the master.

*) Initial steps taken:
RhodiumToad (aka Andrew G) has seen this in the wild several times and suggested that dropping shared_buffers significantly might resolve the situation short term. That was done on Friday night, and so far the problem has not re-occurred.

*) What I think is happening:
I think we are again getting burned by getting de-scheduled while holding the free list lock. I've been chasing this problem for a long time now (for example, see: http://postgresql.1045698.n5.nabble.com/High-SYS-CPU-need-advise-td5732045.html) but now I've got a reproducible case. What is happening is this (a standalone sketch of the path follows the list):

1. in RelationGetBufferForTuple (hio.c): fire LockRelationForExtension
2. call ReadBufferBI. this goes down the chain until StrategyGetBuffer()
3. lock the free list, go into the clock sweep loop
4. while still holding the free list lock, hit a 'hot' buffer, spin on its header lock
5. get de-scheduled
6. now enter the 'hot buffer spin lock lottery'
7. more and more backends pile on, the linux scheduler goes berserk, reducing the chances of winning #6
8. finally win the lottery. lock released. everything back to normal.
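Here is a minimal standalone model of that path -- an illustration under my own assumptions, not the actual bufmgr.c/freelist.c code. The names BufFreelistLock, BufferDesc, usage_count and refcount mirror the real structures, but the locking primitives are plain pthreads stand-ins and NBUFFERS is arbitrary. The key hazard is that the header spinlock in step 4 is acquired while the global lock from step 3 is still held:

/* model of the unpatched sweep; compile with: gcc model.c -lpthread */
#include <pthread.h>
#include <stdio.h>

#define NBUFFERS 1024

typedef struct BufferDesc
{
    pthread_spinlock_t hdr_lock;    /* stands in for the buffer header spinlock */
    int                usage_count;
    int                refcount;
} BufferDesc;

static BufferDesc      buffers[NBUFFERS];
static pthread_mutex_t BufFreelistLock = PTHREAD_MUTEX_INITIALIZER;
static int             clock_hand;

static BufferDesc *
get_victim(void)
{
    /* step 3: one global lock serializes every buffer allocation */
    pthread_mutex_lock(&BufFreelistLock);

    for (;;)
    {
        BufferDesc *buf = &buffers[clock_hand];

        clock_hand = (clock_hand + 1) % NBUFFERS;

        /*
         * step 4: wait on a possibly 'hot' header while still holding
         * BufFreelistLock.  If this backend is de-scheduled here (step 5),
         * every backend that needs a buffer queues up behind the global
         * lock (steps 6-7).
         */
        pthread_spin_lock(&buf->hdr_lock);

        if (buf->refcount == 0 && buf->usage_count == 0)
        {
            pthread_mutex_unlock(&BufFreelistLock);
            return buf;             /* victim found, header lock still held */
        }
        if (buf->usage_count > 0)
            buf->usage_count--;
        pthread_spin_unlock(&buf->hdr_lock);
    }
}

int
main(void)
{
    for (int i = 0; i < NBUFFERS; i++)
        pthread_spin_init(&buffers[i].hdr_lock, PTHREAD_PROCESS_PRIVATE);

    printf("victim = buffer %ld\n", (long) (get_victim() - buffers));
    return 0;
}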
*) What I would like to do to fix it:
See attached patch. It builds on the work of Jeff Janes to remove the free list lock and adds some extra optimizations in the clock sweep loop (a sketch combining them follows this list):

optimization 1: usage count is advisory. It is not updated behind the buffer header lock, so in the event of a long sequence of buffers with usage_count > 0 this avoids spamming the cache-line lock; you decrement and hope for the best.

optimization 2: refcount is examined during buffer allocation without a lock. If it's > 0, the buffer is assumed pinned (even though it may not in fact be) and the sweep continues.

optimization 3: the sweep does not wait on the buf header lock. Instead, it does a 'try lock' and bails if the buffer is determined pinned. I believe this to be one of the two critical optimizations.

optimization 4: remove the free list lock (via Jeff Janes). This is the other critical optimization: one backend will no longer be able to shut down buffer allocation.
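Roughly, the reworked inner loop then looks like the following standalone sketch -- again my illustration of the idea, not the patch itself. The atomic clock hand stands in for Jeff Janes' free list lock removal, and the unlocked reads are hints that get rechecked under the header lock:

/* model of the optimized sweep; C11 atomics; gcc model2.c -lpthread */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NBUFFERS 1024

typedef struct BufferDesc
{
    pthread_spinlock_t hdr_lock;
    atomic_int         usage_count;
    atomic_int         refcount;
} BufferDesc;

static BufferDesc  buffers[NBUFFERS];
static atomic_uint clock_hand;      /* opt 4: atomic hand, no free list lock */

static BufferDesc *
get_victim(void)
{
    for (;;)
    {
        BufferDesc *buf =
            &buffers[atomic_fetch_add(&clock_hand, 1) % NBUFFERS];

        /* opt 2: unlocked peek at refcount; >0 is assumed pinned */
        if (atomic_load(&buf->refcount) > 0)
            continue;

        /*
         * opt 1: advisory decrement outside the header lock.  Races can
         * over-decrement; that is tolerated, hence 'advisory'.
         */
        if (atomic_load(&buf->usage_count) > 0)
        {
            atomic_fetch_sub(&buf->usage_count, 1);
            continue;
        }

        /* opt 3: never wait on a header lock; bail and keep sweeping */
        if (pthread_spin_trylock(&buf->hdr_lock) != 0)
            continue;

        /* recheck under the lock; the unlocked reads were only hints */
        if (atomic_load(&buf->refcount) == 0 &&
            atomic_load(&buf->usage_count) <= 0)
            return buf;             /* victim found, header lock held */

        pthread_spin_unlock(&buf->hdr_lock);
    }
}

int
main(void)
{
    for (int i = 0; i < NBUFFERS; i++)
        pthread_spin_init(&buffers[i].hdr_lock, PTHREAD_PROCESS_PRIVATE);

    printf("victim = buffer %ld\n", (long) (get_victim() - buffers));
    return 0;
}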
*) What I'm asking for:
Are the analysis and the patch to fix the perceived problem plausible, without breaking other stuff? If so, I'm inclined to go further with this. This is not the only solution on the table for high buffer contention, but IMNSHO it should get a lot of points for being very localized. Maybe a reduced version could be tried, retaining the free list lock but keeping the 'try lock' on the buf header (roughly the variant sketched below).
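For the reduced variant only the wait in the sweep changes. This hypothetical get_victim_reduced() is a drop-in replacement for get_victim() in the first sketch above (same types and globals), keeping optimization 3 while leaving the free list lock in place:

static BufferDesc *
get_victim_reduced(void)
{
    pthread_mutex_lock(&BufFreelistLock);   /* free list lock retained */

    for (;;)
    {
        BufferDesc *buf = &buffers[clock_hand];

        clock_hand = (clock_hand + 1) % NBUFFERS;

        /* the one change: never block on a 'hot' buffer header */
        if (pthread_spin_trylock(&buf->hdr_lock) != 0)
            continue;               /* assume pinned, keep sweeping */

        if (buf->refcount == 0 && buf->usage_count == 0)
        {
            pthread_mutex_unlock(&BufFreelistLock);
            return buf;             /* victim found, header lock held */
        }
        if (buf->usage_count > 0)
            buf->usage_count--;
        pthread_spin_unlock(&buf->hdr_lock);
    }
}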
*) Further reading:
http://postgresql.1045698.n5.nabble.com/High-SYS-CPU-need-advise-td5732045.html
http://www.postgresql.org/message-id/CAHyXU0x47D4n6EdPyNyadShXQQXKoheLV2cbRgr_2NGrC8KRRQ@mail.gmail.com
http://postgresql.1045698.n5.nabble.com/Page-replacement-algorithm-in-buffer-cache-td5749236.html

merlin

Attachment