Re: Global Deadlock Information - Mailing list pgsql-cluster-hackers
From | Markus Wanner |
---|---|
Subject | Re: Global Deadlock Information |
Date | |
Msg-id | 4B6DBFC6.6010507@bluegap.ch |
In response to | Re: Global Deadlock Information (Satoshi Nagayasu <satoshi.nagayasu@gmail.com>) |
Responses | Re: Global Deadlock Information |
List | pgsql-cluster-hackers |
Hi,

I'm glad you are joining this discussion, thank you.

Satoshi Nagayasu wrote:
> Some developers (including me!) proposed the lock_timeout
> GUC option.
>
> http://archives.postgresql.org/pgsql-hackers/2004-06/msg00935.php
> http://archives.postgresql.org/pgsql-hackers/2010-01/msg01167.php

Thanks for these pointers.

> I still believe the "lock timeout" feature could help
> resolving a global deadlock in the cluster environment.

Well, you'd always need to find a compromise between waiting long enough to not kill transactions just because of high contention, and still reacting promptly enough to resolve real deadlocks. I'd like to avoid such nifty configuration and tuning settings.

> (2) Use the global wait-for graph to detect a global deadlock.

Can you please elaborate on the replication solution that needs such a global wait-for graph? Why do you need a global graph, if you replicate all of your transactions anyway? Does that global graph imply a global abort decision as well?

IMO a local wait-for graph is absolutely sufficient. The problem is just that different nodes might reach different decisions on how to resolve the deadlock. But if you replicate to all nodes, they will all be able to "see" the deadlock, no?

> http://en.wikipedia.org/wiki/Deadlock#Distributed_deadlock

That very article states: "In a Commitment ordering based distributed environment (including the Strong strict two-phase locking (SS2PL, or rigorous) special case) distributed deadlocks are resolved automatically by the atomic commitment protocol (e.g. two-phase commit (2PC)), and no global wait-for graph or other resolution mechanism are needed."

And the issue with "phantom deadlocks" doesn't really excite me either, so I'd rather not have to deal with such things.

> I don't think the callback function is needed to replace
> the current deadlock resolution feature,

Obviously this wish list item needs more discussion. It seems we want two rather different things, then.
How does your replication solution cope with the current deadlock resolver? How do you prevent it aborting

> but I agree we need a consensus how we could avoid
> the global deadlock situation in the cluster.

How do you get to the situation where you have a global deadlock, but not a local one? That seems to imply that you are not replicating locks to all nodes.

How do you think Postgres core could help with determining such global deadlocks? That seems more like a solution-specific thing to me.

Are we even talking about the same level of locking, namely regular, heavy-weight locks (as per the storage/lmgr/README)?

Kind Regards

Markus Wanner
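
As a rough illustration of the local wait-for graph point made in the message: if every node sees the same lock waits, each can find the cycle locally, and the remaining problem is agreeing on which transaction to abort. The Python sketch below is not taken from Postgres or from any replication solution in this thread. The names `find_cycle` and `choose_victim`, the dictionary representation of the graph, and the "abort the highest transaction id" rule are all assumptions made for the example; it also assumes transaction ids are comparable across nodes.

```python
from typing import Dict, List, Optional, Set

# Hypothetical representation: waits_for[t] is the set of transaction ids
# that transaction t is currently blocked on.
WaitForGraph = Dict[int, Set[int]]


def find_cycle(waits_for: WaitForGraph) -> Optional[List[int]]:
    """Return one cycle of the wait-for graph as a list of ids, or None."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    color: Dict[int, int] = {}
    path: List[int] = []

    def visit(txn: int) -> Optional[List[int]]:
        color[txn] = GREY
        path.append(txn)
        # Sorted iteration keeps the traversal order identical on every node.
        for blocker in sorted(waits_for.get(txn, ())):
            state = color.get(blocker, WHITE)
            if state == GREY:
                # blocker is already on the current path: a cycle exists.
                return path[path.index(blocker):]
            if state == WHITE:
                cycle = visit(blocker)
                if cycle is not None:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in sorted(waits_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle is not None:
                return cycle
    return None


def choose_victim(cycle: List[int]) -> int:
    """Assumed rule: abort the transaction with the highest id in the cycle.

    Any rule works as long as it depends only on the cycle's members, so
    that every node evaluating the same cycle picks the same victim.
    """
    return max(cycle)


if __name__ == "__main__":
    graph = {1: {2}, 2: {3}, 3: {1}, 4: {1}}
    cycle = find_cycle(graph)
    if cycle is not None:
        print("deadlock among", cycle, "-> abort", choose_victim(cycle))
```

Because the traversal is sorted and the victim rule looks only at the members of the cycle, two nodes holding the same graph reach the same abort decision without exchanging a global graph. Whether real nodes actually hold the same graph at the same moment is exactly the open question raised in the message.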