On Wed, Sep 10, 2025 at 11:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Sep 6, 2025 at 3:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the updated patch that incorporated all comments I got so far.
> >
>
> *
> + /*
> + * While all processes are using the new status, there could be some
> + * transactions that might have started with the old status. So wait
> + * for the running transactions to complete so that logical decoding
> + * doesn't include transactions that wrote WAL with insufficient
> + * information.
> + */
> + running = GetRunningTransactionData();
> + LWLockRelease(ProcArrayLock);
> + LWLockRelease(XidGenLock);
> +
> + elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> +
> + for (int i = 0; i < running->xcnt; i++)
> + {
> + TransactionId xid = running->xids[i];
> +
> + if (TransactionIdIsCurrentTransactionId(xid))
> + continue;
> +
> + XactLockTableWait(xid, NULL, NULL, XLTW_None);
> + }
>
> When building a snapshot during the start of logical decoding, we
> anyway wait for running transactions to finish via the snapbuild
> machinery. So, why do we need it here? And if it is needed, can we
> update the comments to explain why it is required in spite of
> snapbuild machinery doing similar thing?
Fair point. I don't see any reason we need to wait here. Will remove this step.
> * Is it a good idea to enable/disable decoding for temporary logical
> slots? The temporary slots are released during ERROR or at session
> end, is that a good time to do the disable processing that even
> requires WAL writing.
I think the same is true for slots with RS_EPEMERAL state. Since it
could confuse users if automatic effective_wal_level change is
supported only for non-temporary slots, I personally would like not to
push aside temporary slots. I agree that it might not be a good time
to disable processing during process shutdown time; in addition to
requiring WAL record, it also requires waits for concurrent state
change processings while it holds all interrupts, which could easily
involve dead-locks. It might be worth considering doing the disable
process in a lazy way. For example, other processes (like
checkpointer) periodically checks the logical decoding status and
disables it if necessary.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com