Home > mailing lists

Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality) - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
Date	May 6, 2014 11:45:52
Msg-id	20140506114414.GN17909@awork2.anarazel.de Whole thread Raw
In response to	Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality) (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses	Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
List	pgsql-hackers

Tree view

Hi,

On 2014-05-05 13:41:00 +0300, Heikki Linnakangas wrote:
> I came up with the attached fix for this. Currently, the entry is implicitly
> considered dead or unlocked if the locking_xid transaction is no longer
> active, but this patch essentially turns locking_xid into a simple boolean,
> and makes it the backend's responsibility to clear it on abort. (it's not
> actually a boolean, it's a BackendId, but that's just for debugging purposes
> to track who's keeping the entry locked). This requires a process exit hook,
> and an abort hook, to make sure the entry is always released, but that's not
> too difficult. It allows the backend to release the entry at exactly the
> right time, instead of having it implicitly released by

> I considered Andres' idea of using a new heavy-weight lock, but didn't like
> it much. It would be a larger patch, which is not nice for back-patching.
> One issue would be that if you run out of lock memory, you could not roll
> back any prepared transactions, which is not nice because it could be a
> prepared transaction that's hoarding the lock memory.

I am not convinced by the latter reasoning but you're right that any
such change would hardly be backpatchable.

> +/*
> + * Exit hook to unlock the global transaction entry we're working on.
> + */
> +static void
> +AtProcExit_Twophase(int code, Datum arg)
> +{
> +    /* same logic as abort */
> +    AtAbort_Twophase();
> +}
> +
> +/*
> + * Abort hook to unlock the global transaction entry we're working on.
> + */
> +void
> +AtAbort_Twophase(void)
> +{
> +    if (MyLockedGxact == NULL)
> +        return;
> +
> +    /*
> +     * If we were in process of preparing the transaction, but haven't
> +     * written the WAL record yet, remove the global transaction entry.
> +     * Same if we are in the process of finishing an already-prepared
> +     * transaction, and fail after having already written the WAL 2nd
> +     * phase commit or rollback record.
> +     *
> +     * After that it's too late to abort, so just unlock the GlobalTransaction
> +     * entry.  We might not have transfered all locks and other state to the
> +     * prepared transaction yet, so this is a bit bogus, but it's the best we
> +     * can do.
> +     */
> +    if (!MyLockedGxact->valid)
> +    {
> +        RemoveGXact(MyLockedGxact);
> +    }
> +    else
> +    {
> +        LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
> +
> +        MyLockedGxact->locking_backend = InvalidBackendId;
> +
> +        LWLockRelease(TwoPhaseStateLock);
> +    }
> +    MyLockedGxact = NULL;
> +}

Is it guaranteed that all paths have called LWLockReleaseAll()
before calling the proc exit hooks? Otherwise we might end up waiting
for ourselves...

>  /*
>   * MarkAsPreparing
> @@ -261,29 +329,15 @@ MarkAsPreparing(TransactionId xid, const char *gid,
>                   errmsg("prepared transactions are disabled"),
>                errhint("Set max_prepared_transactions to a nonzero value.")));
>  
> -    LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
> -
> -    /*
> -     * First, find and recycle any gxacts that failed during prepare. We do
> -     * this partly to ensure we don't mistakenly say their GIDs are still
> -     * reserved, and partly so we don't fail on out-of-slots unnecessarily.
> -     */
> -    for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
> +    /* on first call, register the exit hook */
> +    if (!twophaseExitRegistered)
>      {
> -        gxact = TwoPhaseState->prepXacts[i];
> -        if (!gxact->valid && !TransactionIdIsActive(gxact->locking_xid))
> -        {
> -            /* It's dead Jim ... remove from the active array */
> -            TwoPhaseState->numPrepXacts--;
> -            TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
> -            /* and put it back in the freelist */
> -            gxact->next = TwoPhaseState->freeGXacts;
> -            TwoPhaseState->freeGXacts = gxact;
> -            /* Back up index count too, so we don't miss scanning one */
> -            i--;
> -        }
> +        before_shmem_exit(AtProcExit_Twophase, 0);
> +        twophaseExitRegistered = true;
>      }

It's not particularly nice to register shmem exit hooks in the middle of
normal processing because it makes it impossible to use
cancel_before_shmem_exit() previously registered hooks. I think this
should be registered at startup, if max_prepared_xacts > 0.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

pgsql-hackers by date:

From: Michael Meskes
Date: 06 May 2014, 11:31:51
Subject: Re: using array of char pointers gives wrong results

From: Simon Riggs
Date: 06 May 2014, 11:45:54
Subject: Re: pg_shmem_allocations view

Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality) - Mailing list pgsql-hackers

Previous

Next