Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key |
Date | |
Msg-id | 20180305232353.gpue7jldnm4bjf4i@alap3.anarazel.de Whole thread Raw |
In response to | Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key (amul sul <sulamul@gmail.com>) |
Responses |
Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key
Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key |
List | pgsql-hackers |
Hi, On 2018-02-13 12:41:26 +0530, amul sul wrote: > From 08c8c7ece7d9411e70a780dbeed89d81419db6b6 Mon Sep 17 00:00:00 2001 > From: Amul Sul <sulamul@gmail.com> > Date: Tue, 13 Feb 2018 12:37:52 +0530 > Subject: [PATCH 1/2] Invalidate ip_blkid v5 > > v5: > - Added code in heap_mask to skip wal_consistency_checking[7] > - Fixed previous todos. > > v5-wip2: > - Minor changes in RelationFindReplTupleByIndex() and > RelationFindReplTupleSeq() > > - TODO; > Same as the privious > > v5-wip: Update w.r.t Amit Kapila's comments[6]. > - Reverted error message in nodeModifyTable.c from 'tuple to be locked' > to 'tuple to be updated'. > > - TODO: > 1. Yet to made a decision of having LOG/ELOG/ASSERT in the > RelationFindReplTupleByIndex() and RelationFindReplTupleSeq(). > > v4: Rebased on "UPDATE of partition key v35" patch[5]. > > v3: Update w.r.t Amit Kapila's[3] & Alvaro Herrera[4] comments > - typo in all error message and comment : "to an another" -> "to another" > - error message change : "tuple to be updated" -> "tuple to be locked" > - In ExecOnConflictUpdate(), error report converted into assert & > comments added. > > v2: Updated w.r.t Robert review comments[2] > - Updated couple of comment of heap_delete argument and ItemPointerData > - Added same concurrent update error logic in ExecOnConflictUpdate, > RelationFindReplTupleByIndex and RelationFindReplTupleSeq > > v1: Initial version -- as per Amit Kapila's suggestions[1] > - When tuple is being moved to another partition then ip_blkid in the > tuple header mark to InvalidBlockNumber. Very nice and instructive to keep this in a submission's commit message. > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c > index 8a846e7dba..f4560ee9cb 100644 > --- a/src/backend/access/heap/heapam.c > +++ b/src/backend/access/heap/heapam.c > @@ -3037,6 +3037,8 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask) > * crosscheck - if not InvalidSnapshot, also check tuple against this > * wait - true if should wait for any conflicting update to commit/abort > * hufd - output parameter, filled in failure cases (see below) > + * row_moved - true iff the tuple is being moved to another partition > + * table due to an update of partition key. Otherwise, false. > * I don't think 'row_moved' is a good variable name for this. Moving a row in our heap format can mean a lot of things. Maybe 'to_other_part' or 'changing_part'? > + /* > + * Sets a block identifier to the InvalidBlockNumber to indicate such an > + * update being moved tuple to another partition. > + */ > + if (row_moved) > + BlockIdSet(&((tp.t_data->t_ctid).ip_blkid), InvalidBlockNumber); The parens here are set in a bit werid way. I assume that's from copying it out of ItemPointerSet()? Why aren't you just using ItemPointerSetBlockNumber()? I think it'd be better if we followed the example of specultive inserts and created an equivalent of HeapTupleHeaderSetSpeculativeToken. That'd be a heck of a lot easier to grep for... > @@ -9314,6 +9323,18 @@ heap_mask(char *pagedata, BlockNumber blkno) > */ > if (HeapTupleHeaderIsSpeculative(page_htup)) > ItemPointerSet(&page_htup->t_ctid, blkno, off); > + > + /* > + * For a deleted tuple, a block identifier is set to the I think this 'the' is superflous. > + * InvalidBlockNumber to indicate that the tuple has been moved to > + * another partition due to an update of partition key. But I think it should be 'the partition key'. > + * Like speculative tuple, to ignore any inconsistency set block > + * identifier to current block number. This doesn't quite parse. > + */ > + if (!BlockNumberIsValid( > + BlockIdGetBlockNumber((&((page_htup->t_ctid).ip_blkid))))) > + BlockIdSet(&((page_htup->t_ctid).ip_blkid), blkno); > } That formatting looks wrong. I think it should be replaced by a macro like mentioned above. > /* > diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c > index 160d941c00..a770531e14 100644 > --- a/src/backend/commands/trigger.c > +++ b/src/backend/commands/trigger.c > @@ -3071,6 +3071,11 @@ ltrmark:; > ereport(ERROR, > (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), > errmsg("could not serialize access due to concurrent update"))); > + if (!BlockNumberIsValid(BlockIdGetBlockNumber(&((hufd.ctid).ip_blkid)))) > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("tuple to be locked was already moved to another partition due to concurrent update"))); Yes, given that we repeat this in multiple places, I *definitely* want to see this wrapped in a macro with a descriptive name. > diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c > index 7961b4be6a..b07b7092de 100644 > --- a/src/backend/executor/nodeLockRows.c > +++ b/src/backend/executor/nodeLockRows.c > @@ -218,6 +218,11 @@ lnext: > ereport(ERROR, > (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), > errmsg("could not serialize access due to concurrent update"))); > + if (!BlockNumberIsValid(BlockIdGetBlockNumber(&((hufd.ctid).ip_blkid)))) > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("tuple to be locked was already moved to another partition due to concurrent update"))); > + Why are we using ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE rather than ERRCODE_T_R_SERIALIZATION_FAILURE? A lot of frameworks have builtin logic to retry serialization failures, and this kind of thing is going to resolved by retrying, no? > diff --git a/src/test/isolation/expected/partition-key-update-1.out b/src/test/isolation/expected/partition-key-update-1.out > new file mode 100644 I'd like to see tests that show various interactions with ON CONFLICT. Greetings, Andres Freund
pgsql-hackers by date: