Re: Wait free LW_SHARED acquisition - v0.2 - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Wait free LW_SHARED acquisition - v0.2 |
Date | |
Msg-id | 20140617102658.GR6763@awork2.anarazel.de Whole thread Raw |
In response to | Re: Wait free LW_SHARED acquisition - v0.2 (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Wait free LW_SHARED acquisition - v0.2
|
List | pgsql-hackers |
On 2014-06-17 12:41:26 +0530, Amit Kapila wrote: > On Fri, May 23, 2014 at 10:01 PM, Amit Kapila <amit.kapila16@gmail.com> > wrote: > > On Fri, Jan 31, 2014 at 3:24 PM, Andres Freund <andres@2ndquadrant.com> > wrote: > > > I've pushed a rebased version of the patchset to > > > http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git > > > branch rwlock contention. > > > 220b34331f77effdb46798ddd7cca0cffc1b2858 actually was the small problem, > > > ea9df812d8502fff74e7bc37d61bdc7d66d77a7f was the major PITA. > > > > As per discussion in developer meeting, I wanted to test shared > > buffer scaling patch with this branch. I am getting merge > > conflicts as per HEAD. Could you please get it resolved, so that > > I can get the data. > > I have started looking into this patch and have few questions/ > findings which are shared below: > > 1. I think stats for lwstats->ex_acquire_count will be counted twice, > first it is incremented in LWLockAcquireCommon() and then in > LWLockAttemptLock() Hrmpf. Will fix. > 2. > Handling of potentialy_spurious case seems to be pending > in LWLock functions like LWLockAcquireCommon(). > > LWLockAcquireCommon() > { > .. > /* add to the queue */ > LWLockQueueSelf(l, mode); > > /* we're now guaranteed to be woken up if necessary */ > mustwait = LWLockAttemptLock(l, mode, false, &potentially_spurious); > > } > > I think it can lead to some problems, example: > > Session -1 > 1. Acquire Exclusive LWlock > > Session -2 > 1. Acquire Shared LWlock > > 1a. Unconditionally incrementing shared count by session-2 > > Session -1 > 2. Release Exclusive lock > > Session -3 > 1. Acquire Exclusive LWlock > It will put itself to wait queue by seeing the lock count incremented > by Session-2 > > Session-2 > 1b. Decrement the shared count and add it to wait queue. > > Session-4 > 1. Acquire Exclusive lock > This session will get the exclusive lock, because even > though other lockers are waiting, lockcount is zero. > > Session-2 > 2. Try second time to take shared lock, it won't get > as session-4 already has an exclusive lock, so it will > start waiting > > Session-4 > 2. Release Exclusive lock > it will not wake the waiters because waiters have been added > before acquiring this lock. I don't understand this step here? When releasing the lock it'll notice that the waiters is <> 0 and acquire the spinlock which should protect against badness here? > 3. > LWLockAcquireCommon() > { > for (;;) > { > PGSemaphoreLock(&proc->sem, false); > if (!proc->lwWaiting) > .. > } > proc->lwWaiting is checked, updated without spinklock where > as previously it was done under spinlock, won't it be unsafe? It was previously checked/unset without a spinlock as well: /* * Awaken any waiters I removed from the queue. */ while (head != NULL) { LOG_LWDEBUG("LWLockRelease", T_NAME(l), T_ID(l), "releasewaiter"); proc = head; head = proc->lwWaitLink; proc->lwWaitLink = NULL; proc->lwWaiting = false; PGSemaphoreUnlock(&proc->sem); } I don't think there's dangers here, lwWaiting will only ever be manipulated by the the PGPROC's owner. As discussed elsewhere there needs to be a write barrier before the proc->lwWaiting = false, even in upstream code. > 4. > LWLockAcquireCommon() > { > .. > for (;;) > { > /* "false" means cannot accept cancel/die interrupt here. */ > PGSemaphoreLock(&proc->sem, false); > if (!proc->lwWaiting) > break; > extraWaits++; > } > lock->releaseOK = true; > } > > lock->releaseOK is updated/checked without spinklock where > as previously it was done under spinlock, won't it be unsafe? Hm. That's probably buggy. Good catch. Especially if you have a compiler that does byte manipulation by reading e.g. 4 bytes from a struct and then write the wider variable back... So the releaseOk bit needs to move into LWLockDequeueSelf(). Thanks for looking! Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: