On Friday, November 7, 2025 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, Nov 6, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >
> > On Thu, Nov 6, 2025 at 12:03 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Thursday, October 30, 2025 7:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > >
> > > > Also, I think it's worth considering the idea Robert shared before[1]:
> > > >
> > > > ---
> > > > But what about just surgically preventing that?
> > > > ProcArraySetReplicationSlotXmin() could refuse to retreat the values,
> > > > perhaps? If it computes an older value than what's there, it just does
> > > > nothing?
> > > > ---
> > > >
> > > > We did a similar fix for confirmed_flush LSN by commit ad5eaf390c582,
> > > > and it sounds reasonable to me that ProcArraySetReplicationSlotXmin()
> > > > refuses to retreat the values.
> > >
> > > I reviewed the thread and think that we could not straightforwardly apply a
> > > similar strategy to prevent the retreat of xmin/catalog_xmin here. This is
> > > because we maintain a central value
> > > (replication_slot_xmin/replication_slot_catalog_xmin) in
> > > ProcArraySetReplicationSlotXmin, where the value is expected to decrease
> > > when certain slots are dropped or invalidated.
> > >
> >
> > Good point. This can happen when the last slot is invalidated or dropped.
>
> After the last slot is invalidated or dropped, both slot_xmin and
> slot_catalog_xmin values are set InvalidTransactionId. Then in this
> case, these values are ignored when computing the oldest safe decoding
> XID in GetOldestSafeDecodingTransactionId(), no? Or do you mean that
> there is a case where slot_xmin and slot_catalog_xmin retreat to a
> valid XID?

I think that when replication_slot_xmin is invalid,
GetOldestSafeDecodingTransactionId() would return nextXid, which can be greater
than the original snap.xmin if some transaction IDs have been assigned in the
meantime. After reviewing the report [1], the bug also appears to be
reproducible when replication_slot_xmin is set to InvalidTransactionId
(reproduction steps are detailed in [2]). Therefore, if we adopt the approach
of preventing these values from retreating, we would also need to somehow avoid
resetting replication_slot_xmin, but that seems to conflict with the existing
behavior of resetting replication_slot_xmin when the last slot is dropped.
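
To make the concern concrete, here is a rough standalone sketch of the
computation as I understand it. It is not the actual procarray.c code: the
Sketch* names are hypothetical, XID wraparound is ignored (plain "<" instead
of TransactionIdPrecedes), and running transactions and catalog_xmin are left
out. The XID values 100 and 150 are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId / InvalidTransactionId. */
typedef uint32_t SketchXid;
#define SKETCH_INVALID_XID ((SketchXid) 0)

/*
 * Simplified model of choosing the oldest safe decoding XID.
 * Assumptions: no wraparound, no running transactions, no catalog_xmin.
 */
SketchXid
sketch_oldest_safe_decoding_xid(SketchXid next_xid, SketchXid slot_xmin)
{
	/* Start from the next XID to be assigned. */
	SketchXid	oldest_safe = next_xid;

	/* A valid slot xmin can only pull the result backwards. */
	if (slot_xmin != SKETCH_INVALID_XID && slot_xmin < oldest_safe)
		oldest_safe = slot_xmin;

	return oldest_safe;
}

/* Walk through the scenario described above, using made-up XIDs. */
void
sketch_check_scenario(void)
{
	SketchXid	snap_xmin = 100;	/* xmin the snapshot was built with */
	SketchXid	next_xid = 150;		/* XIDs were assigned since then */

	/* While a slot still holds xmin = 100, the result stays at 100. */
	assert(sketch_oldest_safe_decoding_xid(next_xid, 100) == snap_xmin);

	/* After the last slot is dropped, slot xmin is reset to invalid ... */
	/* ... and the result jumps to next_xid, past the original snap_xmin. */
	assert(sketch_oldest_safe_decoding_xid(next_xid, SKETCH_INVALID_XID)
		   > snap_xmin);
}
```

In this model, refusing to retreat a valid value does not help once the value
has been reset to invalid, because the invalid value is simply skipped and the
result falls back to nextXid.
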

[1] https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com

Best Regards,
Hou zj