Thread: deadlock

deadlock

From

John R Pierce

Date:

11 August 2010, 03:36:12

  We've got an app, I don't know all the details of the schema offhand,
but its using date partitioned tables, its heavily multithreaded and
processing continuous events...   Under load, production (overseas) is
getting a SQL deadlock...

    Process 20333: DROP table data_details_20100718
    Process 20333 waits for AccessExclusiveLock on relation 29609 of
    database 16384; blocked by process 20307.

    Process 20307: select * from data_daily where f1 =$1 and f2=$2 and
    f3=$3 and f4=$4 and ...
    Process 20307 waits for AccessShareLock on relation 28523 of
    database 16384; blocked by process 20333.



I'm -assuming- that the table being dropped is a partition of the other
table.   I've asked the developers (my coworkers) to confirm, and for
any details of how they are doing the partitions.


does anyone have any suggestions for what to look for, or what sort of
common partition management mistakes in the application could lead to
this sort of deadlock?

Re: deadlock

From

David Fetter

Date:

11 August 2010, 22:32:21

On Tue, Aug 10, 2010 at 11:35:48PM -0700, John R Pierce wrote:
>  We've got an app, I don't know all the details of the schema
> offhand, but its using date partitioned tables, its heavily
> multithreaded and processing continuous events...   Under load,
> production (overseas) is getting a SQL deadlock...
>
>    Process 20333: DROP table data_details_20100718
>    Process 20333 waits for AccessExclusiveLock on relation 29609 of
>    database 16384; blocked by process 20307.
>
>    Process 20307: select * from data_daily where f1 =$1 and f2=$2 and
>    f3=$3 and f4=$4 and ...
>    Process 20307 waits for AccessShareLock on relation 28523 of
>    database 16384; blocked by process 20333.
>
> I'm -assuming- that the table being dropped is a partition of the
> other table.   I've asked the developers (my coworkers) to confirm,
> and for any details of how they are doing the partitions.

That seems super likely, given its name and the fact that it's being
dropped.

> does anyone have any suggestions for what to look for, or what sort
> of common partition management mistakes in the application could
> lead to this sort of deadlock?

DDL is a "don't do it at peak load" event.  More realistically, it's
more like a "down time" event.  Maybe when we have "real"
partitioning...

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: deadlock

From

John R Pierce

Date:

12 August 2010, 02:39:24

  On 08/11/10 6:32 PM, David Fetter wrote:
>> does anyone have any suggestions for what to look for, or what sort
>> of common partition management mistakes in the application could
>> lead to this sort of deadlock?
> DDL is a "don't do it at peak load" event.  More realistically, it's
> more like a "down time" event.  Maybe when we have "real"
> partitioning...

Thats an *ouch* for this application, as the load is a 24/7 thing... the
database has to handle a steady stream of events and requests at a
fairly high rate coming from a factory operation, and we need to prune
weekly partitions while all the rest is going on.

The developers (and operations) are coming from an Oracle Enterprise
environment, where this all just works.   I've been pushing for years to
give postgres a try, this is the first substantial database deployment.