XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code) - Mailing list pgsql-hackers
From | Jeremy Schneider |
---|---|
Subject | XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code) |
Date | |
Msg-id | 3476708e-7919-20cb-ca45-6603470565f7@amazon.com Whole thread Raw |
Responses |
Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)
|
List | pgsql-hackers |
While working with Nathan Bossart on an extension, we came across an interesting quirk and possible inconsistency in the PostgreSQL code around infomask flags. I'd like to know if there's something I'm misunderstanding here or if this really is a correctness/robustness gap in the code. At the root of it is the relationship between the XMAX_LOCK_ONLY and XMAX_COMMITTED infomask bits. One of the things in that all-important foreign key patch from 2013 (0ac5ad51) was to tweak the UpdateXmaxHintBits() function to always set the INVALID bit if the transaction was a locker only (even if the locking transaction committed). https://github.com/postgres/postgres/blob/9168793d7275b4b318c153d607fba55d14098c19/src/backend/access/heap/heapam.c#L1748 However it seems pretty clear from pretty much all of the visibility code that while it may not be the usual case, it is considered a valid state to have the XMAX_LOCK_ONLY and XMAX_COMMITTED bits set at the same time. This combination is handled correctly throughout heapam_visibility.c: https://github.com/postgres/postgres/blob/7559d8ebfa11d98728e816f6b655582ce41150f3/src/backend/access/heap/heapam_visibility.c#L273 https://github.com/postgres/postgres/blob/7559d8ebfa11d98728e816f6b655582ce41150f3/src/backend/access/heap/heapam_visibility.c#L606 https://github.com/postgres/postgres/blob/7559d8ebfa11d98728e816f6b655582ce41150f3/src/backend/access/heap/heapam_visibility.c#L871 https://github.com/postgres/postgres/blob/7559d8ebfa11d98728e816f6b655582ce41150f3/src/backend/access/heap/heapam_visibility.c#L1271 https://github.com/postgres/postgres/blob/7559d8ebfa11d98728e816f6b655582ce41150f3/src/backend/access/heap/heapam_visibility.c#L1447 But then there's one place in heapam.c where an assumption appears that this state will never happen - the compute_new_xmax_infomask() function: https://github.com/postgres/postgres/blob/9168793d7275b4b318c153d607fba55d14098c19/src/backend/access/heap/heapam.c#L4800 else if (old_infomask & HEAP_XMAX_COMMITTED) { ... if (old_infomask2 & HEAP_KEYS_UPDATED) status = MultiXactStatusUpdate; else status = MultiXactStatusNoKeyUpdate; new_status = get_mxact_status_for_lock(mode, is_update); ... new_xmax = MultiXactIdCreate(xmax, status, add_to_xmax, new_status); When that code sees XMAX_COMMITTED, it assumes the xmax can't possibly be LOCK_ONLY and it sets the status to something sufficiently high enough to guarantee that ISUPDATE_from_mxstatus() returns true. That means that when you try to update this tuple and compute_new_xmax_infomask() is called with an "update" status, two "update" statuses are sent to MultiXactIdCreate() which then fails (thank goodness) with the error "new multixact has more than one updating member". https://github.com/postgres/postgres/blob/cd5e82256de5895595cdd99ecb03aea15b346f71/src/backend/access/transam/multixact.c#L784 The UpdateXmaxHintBits() code to always set the INVALID bit wasn't in any patches on the mailing list but it was committed and it seems to have worked very well. As a result it seems nearly impossible to get into the state where you have both XMAX_LOCK_ONLY and XMAX_COMMITTED bits set; thus it seems we've avoided problems in compute_new_xmax_infomask(). But nonetheless it seems worth making the code more robust by having the compute_new_xmax_infomask() function handle this state correctly just as the visibility code does. It should only require a simple patch like this (credit to Nathan Bossart): diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index d881f4cd46..371e3e4f61 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -4695,7 +4695,9 @@ compute_new_xmax_infomask(TransactionId xmax, uint16 old_infomask, l5: new_infomask = 0; new_infomask2 = 0; - if (old_infomask & HEAP_XMAX_INVALID) + if (old_infomask & HEAP_XMAX_INVALID || + (old_infomask & HEAP_XMAX_COMMITTED && + HEAP_XMAX_IS_LOCKED_ONLY(old_infomask))) { /* * No previous locker; we just insert our own TransactionId. Alternatively, if we don't want to take this approach, then I'd argue that we should update README.tuplock to explicitly state that XMAX_LOCK_ONLY and XMAX_COMMITTED are incompatible (just as it already states for HEAP_XMAX_IS_MULTI and HEAP_XMAX_COMMITTED) and clean up the code in heapam_visibility.c for consistency. Might be worth adding a note to README.tuplock either way about valid/invalid combinations of infomask flags. Might help avoid some confusion as people approach the code base. What do others think about this? Thanks, Jeremy -- Jeremy Schneider Database Engineer Amazon Web Services
pgsql-hackers by date: