upgrades in row-level locks can deadlock - Mailing list pgsql-hackers
From: Oleksii Kliukin
Subject: upgrades in row-level locks can deadlock
Msg-id: B9C9D7CD-EB94-4635-91B6-E558ACEC0EC3@hintbits.com
Responses: Re: upgrades in row-level locks can deadlock
List: pgsql-hackers
Hello,

I have recently observed a deadlock on one of our production servers related to locking only a single row in a job table. There were two functions involved in the deadlock. The first one acquires a "for key share" lock on the row that represents the job it works on and subsequently updates it with the job's end time (we need multiple jobs to operate on a single row concurrently, which is why the lock is "for key share"). The other function starts by acquiring the "for update" lock on the job row and then performs actions that should not run in parallel with other jobs.

The deadlock can be easily reproduced with the following statements. The queries run against a table job (id integer primary key, name text) with a single row of (1, 'a'):

X1: select id from job where name = 'a' for key share;
Y:  select id from job where name = 'a' for update; -- starts waiting for X1
X2: select id from job where name = 'a' for key share;
X1: update job set name = 'b' where id = 1;
X2: update job set name = 'c' where id = 1; -- starts waiting for X1
X1: rollback;

At this point, Y is terminated by the deadlock detector:

deadlock detected
Process 53937 waits for ShareLock on transaction 488; blocked by process 53953.
Process 53953 waits for ExclusiveLock on tuple (0,1) of relation 16386 of database 12931; blocked by process 53937.
Process 53937: select id from job where name = 'a' for update;
Process 53953: update job set name = 'c' where id = 1;

The deadlock is between X2 and Y. Y waits for X2 to finish, as X2 holds a "key share" lock, which is incompatible with the "for update" lock Y attempts to acquire. On the other hand, X2 needs to acquire the row lock to perform its update, and that is a two-phase process: first get the tuple lock, then wait for conflicting transactions to finish, releasing the tuple lock afterward. X2 tries to acquire the tuple lock, but it is owned by Y. PostgreSQL detects the deadlock and terminates Y.

Such a deadlock only occurs when three or more sessions locking the same row are present and the lock is upgraded in at least one of them. With only two sessions the upgrade does not go through the lock manager, as there are no conflicts with the locks stored in the tuple.

That gave me an idea of how to change PostgreSQL to avoid deadlocking under the condition above. When we detect from the multixact that the lock being acquired is an upgrade of one we already hold, we can avoid acquiring the tuple lock; however, we should still wait for the multixact members that hold conflicting locks, to avoid acquiring incompatible ones. The patch is attached. I had to tweak heap_update and heap_delete alongside heap_lock_tuple, as they acquire row locks as well. I am not very happy with overloading DoesMultiXactIdConflict with the additional task of checking whether the current transaction id is among the multixact members; perhaps it is worth having a separate function for this. We can figure that out if we agree that this is a problem worth solving, and on the solution.

The other possible objection is related to the statement in README.tuplock that we need to go through the lock manager to avoid starving waiting exclusive-lockers. Since this patch omits the tuple lock only when a lock upgrade happens, it limits the starvation condition to the cases where a lock compatible with the one the waiting process asks for is acquired first and then upgraded to an incompatible one. Since under the present behavior the same operation would likely deadlock and cancel the exclusive waiter altogether, I don't see this as a strong objection.
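For reference, here is a minimal setup for the reproduction above, plus a sketch of the two-session case for contrast (this is my own illustration of why two sessions alone do not deadlock; as in the listing above, each session is assumed to run inside an open transaction):

create table job (id integer primary key, name text);
insert into job values (1, 'a');

X1: select id from job where name = 'a' for key share;
Y:  select id from job where name = 'a' for update; -- waits for X1 while holding the tuple lock
X1: update job set name = 'b' where id = 1; -- xmax is X1's own xid, so the upgrade skips the lock manager and nothing deadlocks
X1: rollback; -- Y acquires its "for update" lock and proceeds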
Cheers, Oleksii