Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAA4eK1KPDsqkP+0WKwYZU=Cq8g86+Vq-yeqS0AS04k=GqEduVQ@mail.gmail.com
Whole thread Raw
In response to Re: Conflict detection for update_deleted in logical replication  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Thu, Jul 3, 2025 at 10:57 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jul 3, 2025 at 10:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jul 3, 2025 at 10:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > >
> > > You changes related to write barrier LGTM, however I have question
> > > regarding below change, IIUC, in logical replication
> > > MyReplicationSlot->effective_xmin should be the xmin value which has
> > > been flushed to the disk, but here we are just setting "data.xmin =
> > > new_xmin;" and marking the slot dirty so I believe its not been yet
> > > flushed to the disk right?
> > >
> >
> > Yes, because this is a physical slot and we need to follow
> > PhysicalConfirmReceivedLocation()/PhysicalReplicationSlotNewXmin().
> > The patch has kept a comment in advance_conflict_slot_xmin() as to why
> > it is okay not to flush the slot immediately.
>
> Oh right, I forgot its physical slot.  I think we are good, thanks.
>

BTW, I wanted to clarify one more point related to this part of the
patch. One difference between PhysicalReplicationSlotNewXmin() and
advance_conflict_slot_xmin() is that the former updates both
catalog_xmin and xmin for the slot, but later updates only the slot's
xmin. Can we see any reason to update both in our case? For example,
there is one case where the caller of
ProcArrayGetReplicationSlotXmin() expects that the catalog_xmin must
be set when required, but as far as I can see, it is required only
when logical slots are present, so we should be okay with that case.

The other case to consider is vacuum calling
GetOldestNonRemovableTransactionId() to get the cutoff xid to remove
deleted rows. This returns the xmin horizon based on the type of table
(user table, catalog table, etc.). Now, in this case,
ComputeXidHorizons() will first set data xmin for
catalog_oldest_nonremovable xid, and then if slot_catalog_xmin is
smaller, it uses that value. So, for this computation as well, setting
just slot's xmin in advance_conflict_slot_xmin() should be sufficient,
as we will anyway set both xmin and catalog_xmin to the same values.

By this theory, it doesn't matter whether we set catalog_xmin for
physical slots or not till we are setting the slot's xmin. IIUC,
catalog_xmin is required to be set for logical slots because during
logical decoding, we access only catalog tables, so we need to protect
those, and the catalog_xmin value is used for that.

Thoughts?

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Richard Guo
Date:
Subject: Re: Pathify RHS unique-ification for semijoin planning
Next
From: Nazir Bilal Yavuz
Date:
Subject: Re: Explicitly enable meson features in CI