Thread: deadlock
We've got an app, I don't know all the details of the schema offhand, but its using date partitioned tables, its heavily multithreaded and processing continuous events... Under load, production (overseas) is getting a SQL deadlock... Process 20333: DROP table data_details_20100718 Process 20333 waits for AccessExclusiveLock on relation 29609 of database 16384; blocked by process 20307. Process 20307: select * from data_daily where f1 =$1 and f2=$2 and f3=$3 and f4=$4 and ... Process 20307 waits for AccessShareLock on relation 28523 of database 16384; blocked by process 20333. I'm -assuming- that the table being dropped is a partition of the other table. I've asked the developers (my coworkers) to confirm, and for any details of how they are doing the partitions. does anyone have any suggestions for what to look for, or what sort of common partition management mistakes in the application could lead to this sort of deadlock?
On Tue, Aug 10, 2010 at 11:35:48PM -0700, John R Pierce wrote: > We've got an app, I don't know all the details of the schema > offhand, but its using date partitioned tables, its heavily > multithreaded and processing continuous events... Under load, > production (overseas) is getting a SQL deadlock... > > Process 20333: DROP table data_details_20100718 > Process 20333 waits for AccessExclusiveLock on relation 29609 of > database 16384; blocked by process 20307. > > Process 20307: select * from data_daily where f1 =$1 and f2=$2 and > f3=$3 and f4=$4 and ... > Process 20307 waits for AccessShareLock on relation 28523 of > database 16384; blocked by process 20333. > > I'm -assuming- that the table being dropped is a partition of the > other table. I've asked the developers (my coworkers) to confirm, > and for any details of how they are doing the partitions. That seems super likely, given its name and the fact that it's being dropped. > does anyone have any suggestions for what to look for, or what sort > of common partition management mistakes in the application could > lead to this sort of deadlock? DDL is a "don't do it at peak load" event. More realistically, it's more like a "down time" event. Maybe when we have "real" partitioning... Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 08/11/10 6:32 PM, David Fetter wrote: >> does anyone have any suggestions for what to look for, or what sort >> of common partition management mistakes in the application could >> lead to this sort of deadlock? > DDL is a "don't do it at peak load" event. More realistically, it's > more like a "down time" event. Maybe when we have "real" > partitioning... Thats an *ouch* for this application, as the load is a 24/7 thing... the database has to handle a steady stream of events and requests at a fairly high rate coming from a factory operation, and we need to prune weekly partitions while all the rest is going on. The developers (and operations) are coming from an Oracle Enterprise environment, where this all just works. I've been pushing for years to give postgres a try, this is the first substantial database deployment.