Re: StrategyGetBuffer questions - Mailing list pgsql-hackers
From: Merlin Moncure
Subject: Re: StrategyGetBuffer questions
Msg-id: CAHyXU0wcd36SAJ7ChaKZym7_vPxN1z6qxBAzR7QP-WhW8VeM4A@mail.gmail.com
In response to: Re: StrategyGetBuffer questions (Jeff Janes <jeff.janes@gmail.com>)
Responses: Re: StrategyGetBuffer questions
List: pgsql-hackers
On Tue, Nov 20, 2012 at 4:50 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Nov 20, 2012 at 1:26 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> In this sprawling thread on scaling issues [1], the topic meandered
>> into StrategyGetBuffer() -- in particular the clock sweep loop. I'm
>> wondering:
>>
>> *) If there shouldn't be a bound in terms of how many candidate
>> buffers you're allowed to skip for having a non-zero usage count.
>> Whenever an unpinned usage_count>0 buffer is found, trycounter is
>> reset (!) so that the code operates from the point of view of having
>> just entered the loop. There is an implicit assumption that this is
>> rare, but how rare is it?
>
> How often is it that the trycounter would hit zero if it were not reset?
> I've instrumented something like that in the past, and could only get
> it to fire under pathologically small shared_buffers and workloads
> that caused most of them to be pinned simultaneously.

Well, it's basically impossible -- and that's what I find odd.

>> *) Shouldn't StrategyGetBuffer() bias down usage_count if it finds
>> itself examining too many unpinned buffers per sweep?
>
> It is a self-correcting problem. If it is examining a lot of unpinned
> buffers, it is also decrementing a lot of them.

Sure, but it's entirely plausible that some backends are marking up
usage_count rapidly without allocating buffers while others are doing
a lot of allocations. Point being: all it takes is one backend to get
scheduled out while holding the freelist lock to effectively freeze
the database for many operations. It's been documented [1] that
particular buffers can become spinlock contention hot spots due to
reference counting of the pins. If a lot of allocation is happening
concurrently, it's only a matter of time before the clock sweep rolls
around to one of them, hits the spinlock, and (in the worst case)
schedules out.
This could in turn shut down the clock sweep for some time, while
non-allocating backends beat on established buffers and pump up usage
counts. The reference counting problem might be alleviated in some
fashion, for example via Robert's idea to disable reference counting
under contention [2]. But even if you do that, you're still in for a
world of hurt if you get scheduled out during a buffer allocation.
Your patch fixes that, AFAICT. The buffer pin check is outside the
wider lock, making my suggestion to be less rigorous about usage_count
a lot less useful (but perhaps not completely useless!).

Another innovation might be to implement a 'trylock' variant of
LockBufHdr that does a TAS but doesn't spin -- if someone else has the
header locked, why bother waiting for it? Just skip to the next buffer
and move on.

merlin

[1] http://archives.postgresql.org/pgsql-hackers/2012-05/msg01557.php
[2] http://archives.postgresql.org/pgsql-hackers/2012-05/msg01571.php