Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data - Mailing list pgsql-bugs
From | Noah Misch |
---|---|
Subject | Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data |
Date | |
Msg-id | 20210920044143.GA3414844@rfd.leadboat.com Whole thread Raw |
In response to | Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data (Noah Misch <noah@leadboat.com>) |
Responses |
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
|
List | pgsql-bugs |
On Mon, Aug 23, 2021 at 10:38:00PM +0500, Andrey Borodin wrote: > --- a/src/backend/access/transam/twophase.c > +++ b/src/backend/access/transam/twophase.c > @@ -459,14 +459,15 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid, > proc->pgprocno = gxact->pgprocno; > SHMQueueElemInit(&(proc->links)); > proc->waitStatus = PROC_WAIT_STATUS_OK; > - /* We set up the gxact's VXID as InvalidBackendId/XID */ > - proc->lxid = (LocalTransactionId) xid; > + /* We set up the gxact's VXID as real for CIC purposes */ > + proc->lxid = MyProc->lxid; This breaks the case where the server restarted after PREPARE TRANSACTION. MyProc->lxid is 0 in the startup process, and LocalTransactionIdIsValid(0) is false. I'm attaching a test case addition. Can you repair this? > proc->xid = xid; > Assert(proc->xmin == InvalidTransactionId); > proc->delayChkpt = false; > proc->statusFlags = 0; > proc->pid = 0; > - proc->backendId = InvalidBackendId; > + /* May be backendId of startup process */ > + proc->backendId = MyBackendId; Incidentally, MyBackendId of startup process depends on other facts. When using hot standby, InitRecoveryTransactionEnvironment() sets MyBackendId=1. Otherwise, including clean startup of a non-standby node, MyBackendId is InvalidBackendId. This may be harmless. I didn't know about it. On Tue, Sep 07, 2021 at 11:45:15PM -0700, Noah Misch wrote: > On Sun, Aug 29, 2021 at 11:38:03PM +0500, Andrey Borodin wrote: > > > 29 авг. 2021 г., в 23:09, Andres Freund <andres@anarazel.de> написал(а): > > >>> It seems like it's going to add a substantial amount of work even when > > >>> no 2PC xacts are involved? > > >> Only if 2PCs are enabled. > > > > > > I don't think that's good enough. Plenty of systems have 2PC enabled but very > > > few if any transactions end up as 2PC ones. > > > Best optimisation I can imagine here is to gather all vxids with unknown xids and search for them in one call to TwoPhaseGetXidByVXid()with one LWLockAcquire(TwoPhaseStateLock, LW_SHARED). > > > > Does it worth the complexity? > > https://www.postgresql.org/search/?m=1&q=TwoPhaseStateLock&l=&d=-1&s=r > suggests this is the first postgresql.org discussion of TwoPhaseStateLock as a > bottleneck. Nonetheless, if Andres Freund finds it's worth the complexity, > then I'm content with it. I'd certainly expect some performance benefit. > Andres, what do you think? A few more benefits (beyond lock contention) come to mind: - Looking at the three VirtualXactLock() callers, waiting for final disposition of prepared transactions is necessary for WaitForLockersMultiple(), disadvantageous for WaitForOlderSnapshots(), and dead code for ResolveRecoveryConflictWithVirtualXIDs(). In WaitForOlderSnapshots(), PREPARE is as good as COMMIT/ABORT, because a prepared transaction won't do further database reads. Waiting on the prepared transaction there could give CIC an arbitrarily-long, needless delay. ResolveRecoveryConflictWithVirtualXIDs() will never wait on a prepared transaction, because prepared transactions hold no locks during recovery. (If a prepared transaction originally acquired AccessExclusiveLock, the startup process holds that lock on its behalf.) Coordinating the XID search at a higher layer would let us change WaitForLockersMultiple() without changing the others. - v13 WaitPreparedXact() experiences starvation when a steady stream of prepared transactions have the same VXID. Since VXID reuse entails reconnecting, starvation will be unnoticeable in systems that follow best practices around connection lifespan. The 2021-08-23 patch version didn't have that hazard. None of those benefits clearly justify the complexity, but they're relevant to the decision.
Attachment
pgsql-bugs by date: