Re: out-of-order XID insertion in KnownAssignedXids - Mailing list pgsql-hackers
From | Konstantin Knizhnik |
---|---|
Subject | Re: out-of-order XID insertion in KnownAssignedXids |
Date | |
Msg-id | fc51532c-dbf5-dce0-b31c-82c7a0b837ed@postgrespro.ru |
In response to | Re: out-of-order XID insertion in KnownAssignedXids (Andres Freund <andres@anarazel.de>) |
Responses |
Re: out-of-order XID insertion in KnownAssignedXids
Re: out-of-order XID insertion in KnownAssignedXids |
List | pgsql-hackers |
On 08.10.2018 18:24, Andres Freund wrote:
>
> On October 8, 2018 2:04:28 AM PDT, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>>
>> On 05.10.2018 11:04, Michael Paquier wrote:
>>> On Fri, Oct 05, 2018 at 10:06:45AM +0300, Konstantin Knizhnik wrote:
>>>> As you can notice, XID 2004495308 is encountered twice, which causes
>>>> an error in KnownAssignedXidsAdd:
>>>>
>>>>     if (head > tail &&
>>>>         TransactionIdFollowsOrEquals(KnownAssignedXids[head - 1], from_xid))
>>>>     {
>>>>         KnownAssignedXidsDisplay(LOG);
>>>>         elog(ERROR, "out-of-order XID insertion in KnownAssignedXids");
>>>>     }
>>>>
>>>> The probability of this error is very small, but it can quite easily
>>>> be reproduced: you should just set a breakpoint in the debugger after
>>>> calling MarkAsPrepared in twophase.c and then try to prepare any
>>>> transaction. MarkAsPrepared will add GXACT to proc array and at this
>>>> moment there will be two entries in procarray with the same XID:
>>>>
>>>> [snip]
>>>>
>>>> Now the generated RUNNING_XACTS record contains duplicated XIDs.
>>> So, I have been doing exactly that, and if you trigger a manual
>>> checkpoint then things happen quite correctly if you let the first
>>> session finish:
>>> rmgr: Standby len (rec/tot): 58/ 58, tx: 0, lsn: 0/016150F8, prev 0/01615088, desc: RUNNING_XACTS nextXid 608 latestCompletedXid 605 oldestRunningXid 606; 2 xacts: 607 606
>>>
>>> If you still maintain the debugger after calling MarkAsPrepared, then
>>> the manual checkpoint would block. Now if you actually keep the
>>> debugger, and wait for a checkpoint timeout to happen, then I can see
>>> the incorrect record. It is impressive that your customer has been able
>>> to see that first, and then that you have been able to get into that
>>> state with simple steps.
>>>
>>>> I want to ask the opinion of the community about the best way of
>>>> fixing this problem. Should we avoid storing duplicated XIDs in
>>>> procarray (by invalidating the XID in the original pgxact) or
>>>> eliminate/change the check for duplicates in KnownAssignedXidsAdd
>>>> (for example, just ignore duplicates)?
>>> Hmmmmm... Please let me think through that first. It seems to me that
>>> the record should not be generated to begin with. At least I am able to
>>> confirm what you see.
>> The simplest way to fix the problem is to ignore duplicates before
>> adding them to KnownAssignedXids.
>> We in any case perform a sort in this place...
> I vehemently object to that as the proper course.

And what about adding qsort to GetRunningTransactionData or
LogCurrentRunningXacts and excluding duplicates there?

> Andres

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
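
To make the qsort-and-exclude-duplicates idea concrete, here is a minimal standalone sketch. It is not PostgreSQL code and not a proposed patch: the `TransactionId` typedef is a stand-in, `xid_cmp` and `sort_and_dedup_xids` are hypothetical names, and the comparator uses plain unsigned ordering, which ignores XID wraparound. It only shows the sort-then-compact step that could be applied to the XID array before a RUNNING_XACTS record is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for PostgreSQL's TransactionId (assumed 32-bit unsigned). */
typedef uint32_t TransactionId;

/*
 * qsort comparator using plain unsigned ordering.
 * Simplification: XID wraparound is not handled here.
 */
static int
xid_cmp(const void *a, const void *b)
{
    TransactionId x = *(const TransactionId *) a;
    TransactionId y = *(const TransactionId *) b;

    if (x < y)
        return -1;
    if (x > y)
        return 1;
    return 0;
}

/*
 * Hypothetical helper: sort xids[0..n) in place, drop adjacent duplicates,
 * and return the number of distinct entries kept at the front of the array.
 */
static size_t
sort_and_dedup_xids(TransactionId *xids, size_t n)
{
    size_t kept;
    size_t i;

    assert(xids != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(xids, n, sizeof(TransactionId), xid_cmp);

    /* After sorting, equal XIDs are adjacent, so one pass removes them. */
    kept = 1;
    for (i = 1; i < n; i++)
    {
        if (xids[i] != xids[kept - 1])
            xids[kept++] = xids[i];
    }
    return kept;
}
```

With an input such as {2004495308, 2004495307, 2004495308}, the helper leaves two distinct XIDs in ascending order at the front of the array, so a strictly-increasing check like the one in KnownAssignedXidsAdd would no longer see the same XID twice.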