Thread: [GENERAL] Concurrency and UPDATE [...] FROM

[GENERAL] Concurrency and UPDATE [...] FROM

From
Seamus Abshere
Date:
The purpose is to concat new data onto existing values of c:

UPDATE tbl
SET c = c || new_data.c
FROM ( [...] ) AS new_data
WHERE
  tbl.id = new_data.id

It appears to have a race condition:

t0: Query A starts subquery
t1: Query A starts self-join
t2. Query A starts UPDATE with data from self-join and subquery
t3. Query B starts subquery
t4. Query B starts self-join (note: data from t1!)
[...]
tN. Query A finishes UPDATE
tN+1. Query B finishes UPDATE, missing any new_data from Query A

My assumption is that t1 and t4 (the self-joins) use SELECT but not
SELECT FOR UPDATE. If they did, I think the race condition would go
away.

Did I analyze that right?

Thanks!

--
Seamus Abshere, SCEA
https://github.com/seamusabshere
https://linkedin.com/in/seamusabshere
https://www.faraday.io


Re: [GENERAL] Concurrency and UPDATE [...] FROM

From
Thomas Munro
Date:
On Tue, Jul 11, 2017 at 6:23 AM, Seamus Abshere <seamus@abshere.net> wrote:
> The purpose is to concat new data onto existing values of c:
>
> UPDATE tbl
> SET c = c || new_data.c
> FROM ( [...] ) AS new_data
> WHERE
>   tbl.id = new_data.id
>
> It appears to have a race condition:
>
> t0: Query A starts subquery
> t1: Query A starts self-join
> t2. Query A starts UPDATE with data from self-join and subquery
> t3. Query B starts subquery
> t4. Query B starts self-join (note: data from t1!)
> [...]
> tN. Query A finishes UPDATE
> tN+1. Query B finishes UPDATE, missing any new_data from Query A
>
> My assumption is that t1 and t4 (the self-joins) use SELECT but not
> SELECT FOR UPDATE. If they did, I think the race condition would go
> away.
>
> Did I analyze that right?

Yeah, I think so.  There is no EvalPlanQual[1] behaviour on "new_data"
meaning that you can finish up self-joining versions of "tbl" from two
different times.  SELECT FOR UPDATE in "new_data" (as a subselect or
CTE etc) would activate that, or you could use SERIALIZABLE isolation
to abort transactions where the race would change the outcome, or some
other serialisation scheme like table or advisory locks.

[1] https://github.com/postgres/postgres/blob/master/src/backend/executor/README#L297

--
Thomas Munro
http://www.enterprisedb.com