Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAA4eK1LQXzrp3LSe3rw+6MFz=SgQcJ9F6NpYvFTBEF4TyD1cGw@mail.gmail.com
In response to RE: Conflict detection for update_deleted in logical replication  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Mon, Aug 18, 2025 at 5:05 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, August 18, 2025 2:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Aug 18, 2025 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > >
> > > > ---
> > > > Even if an apply worker disables retaining dead tuples due to
> > > > max_conflict_retention_duration, retention is enabled again after
> > > > the server restarts.
> > > >
> > >
> > > I also find this behaviour questionable because this also means that
> > > it is possible that before restart one would deduce that the
> > > update_deleted conflict won't be reliably detected for a particular
> > > subscription, but after restart one could reach the opposite
> > > conclusion. But note that to make the behaviour consistent across
> > > restarts, we need to store this value persistently in
> > > pg_subscription, unless you have better ideas for this.
> > > Theoretically, there are two places where we can persist this
> > > information: one is pg_subscription, and the other is the origin. I
> > > find pg_subscription the closer fit.
> >
> > I think it makes sense to store this in pg_subscription to preserve the decision
> > across restart.
>
> Thanks for sharing the opinion!
>
> Regarding this, I'd like to clarify some implementation details for persisting the
> retention status in pg_subscription.
>
> Since the logical launcher does not connect to a specific database, it cannot
> update the catalog, as this would trigger a FATAL error (e.g.,
> CatalogTupleUpdate -> ... -> ScanPgRelation -> FATAL: cannot read pg_class
> without having selected a database). Therefore, the apply worker should take
> responsibility for updating the catalog.
>
> To achieve that, ideally, the apply worker should update pg_subscription in a
> separate transaction, rather than using the transaction started during the
> application of changes. This implies that we must wait for the current
> transaction to complete before proceeding with the catalog update. So I think
> we could add an additional phase, RDT_MARK_RETENTION_INACTIVE, to manage the
> catalog update once the existing transaction finishes.
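
To make that concrete, here is a minimal sketch of what such a catalog
update could look like, modeled on DisableSubscription() in
src/backend/catalog/pg_subscription.c. The column subretentionactive and
its Anum_* constant are assumptions for illustration; no such attribute
exists in pg_subscription today:

    static void
    MarkSubscriptionRetentionInactive(Oid subid)
    {
        Relation    rel;
        HeapTuple   tup;
        Datum       values[Natts_pg_subscription];
        bool        nulls[Natts_pg_subscription];
        bool        replaces[Natts_pg_subscription];

        /* Must run in its own transaction, after the apply txn finishes. */
        StartTransactionCommand();

        rel = table_open(SubscriptionRelationId, RowExclusiveLock);

        tup = SearchSysCacheCopy1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
        if (!HeapTupleIsValid(tup))
            elog(ERROR, "cache lookup failed for subscription %u", subid);

        memset(values, 0, sizeof(values));
        memset(nulls, false, sizeof(nulls));
        memset(replaces, false, sizeof(replaces));

        /* Hypothetical column recording whether retention is still active. */
        values[Anum_pg_subscription_subretentionactive - 1] =
            BoolGetDatum(false);
        replaces[Anum_pg_subscription_subretentionactive - 1] = true;

        tup = heap_modify_tuple(tup, RelationGetDescr(rel),
                                values, nulls, replaces);
        CatalogTupleUpdate(rel, &tup->t_self, tup);
        heap_freetuple(tup);

        table_close(rel, RowExclusiveLock);

        CommitTransactionCommand();
    }
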
>
> If we proceed in this manner, it suggests that the apply worker could set the
> shared memory flag first and then the catalog flag. So, if the apply worker
> encounters an error after setting the shared memory flag but before updating
> the catalog, it may lead to issues similar to the one mentioned by Sawada-San,
> e.g., the apply worker restarts but retains the dead tuples again because the
> status had not been persisted.

In this approach, why do we need to set the shared memory flag in the
first place; can't we rely on the catalog values? I understand there is
some delay between the point where we decide to stop retention and the
point where we actually update the catalog, but it shouldn't be big
enough to matter for small transactions, because we will update the
catalog at the next transaction boundary. For large transactions, we can
always update it at the next stream_stop message.
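
A rough sketch of that idea (all names are assumptions: rdt_stop_pending
as a worker-local flag, plus the MarkSubscriptionRetentionInactive()
sketch above), invoked after applying a transaction and on stream_stop:

    /*
     * Hypothetical helper, called at transaction boundaries and on
     * stream_stop: if the retention duration was exceeded while applying
     * changes, persist the decision now instead of flipping shared memory.
     */
    static void
    maybe_stop_conflict_retention(void)
    {
        if (!rdt_stop_pending)      /* assumed worker-local flag */
            return;

        /* Safe here: no apply transaction is open at these call sites. */
        Assert(!IsTransactionState());

        MarkSubscriptionRetentionInactive(MySubscription->oid);
        rdt_stop_pending = false;
    }

With that, no shared memory flag is needed at all, and anything
interested in the retention status reads the catalog value instead, which
also avoids the restart inconsistency discussed above.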

--
With Regards,
Amit Kapila.


