Thread: ERROR after writing PREPARE WAL record
Hello
Cancel/terminate requests are held off during "PREPARE TRANSACTION" processing in function PrepareTransaction(). However, a subroutine invoked by PrepareTransaction() may perform elog(ERROR) or elog(FATAL).
If that happens after the PREPARE WAL record is written but before the transaction state is cleaned up, normal abort processing, i.e. AbortTransaction(), is triggered. It is not correct to run the abort workflow against a transaction that is already marked as prepared. A prepared transaction should only be finished with a "COMMIT PREPARED" or "ROLLBACK PREPARED" operation.
I tried injecting an elog(ERROR) at the end of EndPrepare() and that resulted in a PANIC at some point.
Before delving into more details, I want to ascertain that this is a valid problem to solve. Is the above problem worth worrying about?
Asim
Asim R P <apraveen@pivotal.io> writes:
> Cancel/terminate requests are held off during "PREPARE TRANSACTION"
> processing in function PrepareTransaction(). However, a subroutine invoked
> by PrepareTransaction() may perform elog(ERROR) or elog(FATAL).

Doing anything that's likely to fail in the post-commit code path is a
Bad Idea (TM). There's no good recovery avenue, so the fact that you
generally end up at a PANIC is expected/intentional.

The correct response, if you notice code doing that, is to fix it so
it doesn't do that. Typically the right answer is to move the
failure-prone operation to pre-commit processing.

regards, tom lane
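The advice to move failure-prone work into pre-commit processing can be illustrated with a small standalone sketch. This is a toy model, not PostgreSQL code: ToyXact and the toy_* functions are hypothetical names invented for illustration, and the "commit record" is just a flag. The idea it shows is that every step that can fail (here, a memory allocation) runs before the point of no return, so the post-commit step has no failure path left.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy transaction; none of these names exist in PostgreSQL. */
typedef struct ToyXact
{
    char   *cleanup_buf;    /* memory that post-commit cleanup will need */
    size_t  cleanup_len;
    bool    committed;      /* true once the point of no return is passed */
} ToyXact;

/*
 * Pre-commit phase: perform everything that can fail.
 * Returns false on failure; the caller can still abort safely at this point.
 */
bool
toy_precommit(ToyXact *xact, size_t cleanup_len)
{
    xact->committed = false;
    xact->cleanup_len = 0;
    xact->cleanup_buf = malloc(cleanup_len);
    if (xact->cleanup_buf == NULL)
        return false;
    xact->cleanup_len = cleanup_len;
    return true;
}

/* Point of no return: stands in for writing the commit or PREPARE record. */
void
toy_commit_record(ToyXact *xact)
{
    xact->committed = true;
}

/*
 * Post-commit phase: touches only resources reserved in toy_precommit(),
 * so it contains no step that can report an error.
 */
void
toy_postcommit(ToyXact *xact)
{
    memset(xact->cleanup_buf, 0, xact->cleanup_len);
    free(xact->cleanup_buf);
    xact->cleanup_buf = NULL;
    xact->cleanup_len = 0;
}
```

A caller would invoke toy_precommit(), abort if it returns false, and otherwise call toy_commit_record() followed by toy_postcommit(). The real backend has many more moving parts, so this only conveys the ordering principle.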
On Wed, Jul 17, 2019 at 7:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Asim R P <apraveen@pivotal.io> writes:
> > Cancel/terminate requests are held off during "PREPARE TRANSACTION"
> > processing in function PrepareTransaction(). However, a subroutine invoked
> > by PrepareTransaction() may perform elog(ERROR) or elog(FATAL).
>
> The correct response, if you notice code doing that, is to fix it so
> it doesn't do that. Typically the right answer is to move the
> failure-prone operation to pre-commit processing.
Thank you for the response. There is nothing particularly alarming. There is one case in LWLockAcquire that may error out if (num_held_lwlocks >= MAX_SIMUL_LWLOCKS). This problem also exists in CommitTransaction() and AbortTransaction() code paths. Then there is arbitrary add-on code registered as Xact_callbacks.
SyncRepWaitForLSN() directly checks ProcDiePending and QueryCancelPending without going through CHECK_FOR_INTERRUPTS and that is for good reason. Moreover, it only emits a WARNING, so no problem there.
Asim
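The LWLockAcquire case mentioned above can be pictured with a simplified standalone model. This is only a sketch under stated assumptions: ToyLockTable, the toy_* functions, and the limit of 4 are hypothetical and chosen for illustration; PostgreSQL defines its own MAX_SIMUL_LWLOCKS and reports the overflow with elog(ERROR) rather than a return value.

```c
#include <stdbool.h>

/* Illustrative limit only; not the value PostgreSQL uses. */
#define TOY_MAX_SIMUL_LOCKS 4

/* Hypothetical per-backend record of currently held locks. */
typedef struct ToyLockTable
{
    int held_ids[TOY_MAX_SIMUL_LOCKS];
    int num_held;
} ToyLockTable;

void
toy_lock_table_init(ToyLockTable *t)
{
    t->num_held = 0;
}

/*
 * Record a newly acquired lock. Returns false in the situation where the
 * real code would raise an error: the held-locks array is already full.
 */
bool
toy_lock_acquire(ToyLockTable *t, int lock_id)
{
    if (t->num_held >= TOY_MAX_SIMUL_LOCKS)
        return false;
    t->held_ids[t->num_held++] = lock_id;
    return true;
}

/* Release the most recently acquired lock; false if none is held. */
bool
toy_lock_release_last(ToyLockTable *t)
{
    if (t->num_held == 0)
        return false;
    t->num_held--;
    return true;
}
```

In this model, a fifth toy_lock_acquire() without an intervening release fails. That is the analogue of the error that could, in principle, surface after the PREPARE record has already been written.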