Re: Move PinBuffer and UnpinBuffer to atomics - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Move PinBuffer and UnpinBuffer to atomics |
Date | |
Msg-id | 20150911161421.GB4996@alap3.anarazel.de Whole thread Raw |
In response to | Move PinBuffer and UnpinBuffer to atomics (YUriy Zhuravlev <u.zhuravlev@postgrespro.ru>) |
Responses |
Re: Move PinBuffer and UnpinBuffer to atomics
|
List | pgsql-hackers |
Hi, On 2015-09-11 13:23:24 +0300, YUriy Zhuravlev wrote: > Continuing the theme: http://www.postgresql.org/message-id/3368228.mTSz6V0Jsq@dinodell Please don't just start new threads for a new version of the patch. > This time, we fairly rewrote 'refcount' and 'usage_count' to atomic in > PinBuffer and UnpinBuffer (but save lock for buffer flags in Unpin). Hm. > In the same time it doesn't affect to correctness of buffer manager > because that variables already have LWLock on top of them (for partition of > hashtable). Note that there's a pending patch that removes the buffer mapping locks entirely. > If someone pinned buffer after the call StrategyGetBuffer we just try > again (in BufferAlloc). Also in the code there is one more check > before deleting the old buffer, where changes can be rolled back. The > other functions where it is checked 'refcount' and 'usage_count' put > exclusive locks. I don't think this is correct. This way we can leave the for (;;) loop in BufferAlloc() thinking that the buffer is unused (and can't be further pinned because of the held spinlock!) while it actually has been pinned since by PinBuffer(). Additionally oldFlags can get out of sync there. I don't think the approach of making some of the fields atomics but not really caring about the rest is going to work. My suggestion is to add a single 'state' 32bit atomic. This 32bit state is subdivided into: 10bit for flags, 3bit for usage_count, 16bit for refcount then turn each operation that currently uses one of these fields into corresponding accesses (just different values for flags, bit-shiftery & mask for reading usage count, bit mask for reading refcount). The trick then is to add a *new* flag value BM_LOCKED. This can then act as a sort of a 'one bit' spinlock. That should roughly look like (more or less pseudocode): void LockBufHdr(BufferDesc *desc) { int state = pg_atomic_read_u32(&desc->state); for (;;) { /* wait till lock is free */ while (unlikely(state & BM_LOCKED)) { pg_spin_delay(); state = pg_atomic_read_u32(&desc->state); /* add exponential backoff? Should seldomly be contended tho. */ } /* and try to get lock */ if (pg_atomic_compare_exchange_u32(&desc->state, &state, state | BM_LOCKED)) break; } } static bool PinBuffer(volatile BufferDesc *buf, BufferAccessStrategy strategy) { ... if (ref == NULL) { ReservePrivateRefCountEntry(); ref = NewPrivateRefCountEntry(b); ... int state = pg_atomic_read_u32(&desc->state); int oldstate = state; while (true) { /* spin-wait till lock is free */ while (unlikely(state & BM_LOCKED)) { pg_spin_delay(); state = pg_atomic_read_u32(&desc->state); } /* increase refcount */ state += 1; /* increase usagecount unless already max */ if ((state & USAGE_COUNT_MASK) != BM_MAX_USAGE_COUNT) state += BM_USAGE_COUNT_ONE; result = (state & BM_VALID) != 0; if (pg_atomic_compare_exchange_u32(&desc->state, &oldstate, state)) break; /* get ready for next loop, oldstate has been updated by cas */ state = oldstate; } ... } other callsites can either just plainly continue to use LockBufHdr/UnlockBufHdr or converted similarly to PinBuffer(). Greetings, Andres Freund
pgsql-hackers by date: