Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: group locking: incomplete patch, just for discussion |
Date | |
Msg-id | CA+TgmoYGbeMmqEEVCEQi3fLk_1V089=jeKVEcQkkh5xuT2gGKw@mail.gmail.com Whole thread Raw |
In response to | Re: group locking: incomplete patch, just for discussion (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: group locking: incomplete patch, just for discussion
|
List | pgsql-hackers |
On Fri, Oct 31, 2014 at 9:07 AM, Andres Freund <andres@2ndquadrant.com> wrote: > Maybe we can, as a first step, make those edges in the lock graph > visible to the deadlock detector? It's pretty clear that undetected > deadlocks aren't ok, but detectable deadlocks in a couple corner cases > might be acceptable. An interesting point came to mind this morning on this topic, while discussing this issue with Amit. Suppose process A1 holds a lock on some object and process A2, a member of the same locking group, waits for a lock on that same object. It follows that there must exist a process B which either *holds* or *awaits* a lock which conflicts with the one requested by A2; otherwise, A2 would have been granted the lock at once. If B *holds* the conflicting lock, it may eventually release it; but if B *awaits* the conflicting lock, we have a soft deadlock, because A2 waits for B waits for A1, which we presume to wait for A1.[1] The logical consequence of this is that, if a process seeks to acquire a lock which is already held (in any mode) by a fellow group-member, it had better jump over the entire wait queue and acquire the lock immediately if no conflicting locks have already been granted. If it fails to do this, it immediately creates a soft deadlock. What if there *are* conflicting locks already granted? In that case, things are more complicated. We could, for example, have this situation: A1 holds AccessShareLock, B holds ExclusiveLock, C1 awaits ShareLock, and then (following it in the wait queue) A2 awaits RowExclusiveLock. This is not a deadlock, because B can finish, then C can get the lock, then C1 can finish, then A2 can get the lock. But if C2 now waits for some member of group A, then we have a soft deadlock between group A and group C, which can be resolved by moving A2's request ahead of C1. (This example is relatively realistic, too: a lock upgrade from AccessShareLock to RowExclusiveLock is probably the most common type of lock upgrade, for reasons that are hopefully obvious.) In practice, I suspect it's best to take a more aggressive approach and have A2 jump straight to the head of the line right out of the chute. Letting some parallel workers make progress while holding up others out of a notion of locking fairness is probably stupid, because they're probably dependent on each other in some way that makes it useless for one to make progress while the others don't. For the same reason, I'd argue that we ought to wait until all pending lock requests from a given group can be satisfied before granting any of them. In fact, I suspect it's best in general for the locking machinery and the deadlock detector to view several simultaneous, pending lock requests as if they were a single request conflicting with everything that would conflict with any of the requested modes. Consider the situation where there are 10 lock waiters, 5 in each of two parallel groups. The deadlock detector could spend a good deal of time and energy trying various reorderings of the lock queue, but in practice it seems highly unlikely that it's sensible to do anything other than put all the locks from one group first and all of the locks from the other group afterwards, regardless of whether the processes want the same mode or different modes. Most of the other orderings will either be equivalent to one of those orderings (because commuting two non-conflicting locks in the wait queue has no effect) or soft deadlocks (because the interleave a conflicting lock request between two requests form the same group). And even if you can find some corner case where neither of those things is the case, it's probably still not useful to let half the lock group run and leave the other half blocked. What will probably happen is that some internal queue will pretty quickly either fill up or drain completely and the process filling it, or draining it, will wait. Now we're no better off than if we'd waited to grant the lock in the first place, and we're now holding more locks that can block other things or cause deadlocks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] Here I'm assuming, as in previous discussion, that we regard all members of a parallel group as mutually waiting for each other, on the theory that the parallel computation won't complete until all workers complete and, therefore, the members of the group are either already waiting for each other or eventually will. While this might not be true in every case, I believe it's a good (and generally valid) simplifying assumption.
pgsql-hackers by date: