Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CALj2ACUADAcX3ptvBhbaS8bp=7ZZLvq27+v84w_RKCesc6X03w@mail.gmail.com
Whole thread Raw
In response to RE: Synchronizing slots from primary to standby  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Tue, Apr 2, 2024 at 7:25 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> > 1. Can we just remove pg_logical_replication_slot_advance and use
> > LogicalSlotAdvanceAndCheckSnapState instead? If worried about the
> > function naming, LogicalSlotAdvanceAndCheckSnapState can be renamed to
> > pg_logical_replication_slot_advance?
> >
> > + * Advance our logical replication slot forward. See
> > + * LogicalSlotAdvanceAndCheckSnapState for details.
> >   */
> >  static XLogRecPtr
> >  pg_logical_replication_slot_advance(XLogRecPtr moveto)
> >  {
>
> It was commented[1] that it's not appropriate for the
> pg_logical_replication_slot_advance to have an out parameter
> 'ready_for_decoding' which looks bit awkward as the functionality doesn't match
> the name, and is also not consistent with the style of
> pg_physical_replication_slot_advance(). So, we added a new function.

I disagree here. A new function just for a parameter is not that great
IMHO. I'd just rename LogicalSlotAdvanceAndCheckSnapState(XLogRecPtr
moveto, bool *found_consistent_snapshot) to
pg_logical_replication_slot_advance(XLogRecPtr moveto, bool
*found_consistent_snapshot) and use it. If others don't like this, I'd
at least turn pg_logical_replication_slot_advance(XLogRecPtr moveto) a
static inline function.

> > 5. As far as the test case for this issue is concerned, I'm fine with
> > adding one using an INJECTION point because we seem to be having no
> > consistent way to control postgres writing current snapshot to WAL.
>
> Since me and my colleagues can reproduce the issue consistently after applying
> 0002 and it could be rare for concurrent xl_running_xacts to happen, we discussed[2] to
> consider adding the INJECTION point after pushing the main fix.

Right.

> > 7.
> > +    /*
> > +     * We need to access the system tables during decoding to build the
> > +     * logical changes unless we are in fast-forward mode where no changes
> > are
> > +     * generated.
> > +     */
> > +    if (slot->data.database != MyDatabaseId && !fast_forward)
> >
> > May I know if we need this change for this fix?
>
> The slotsync worker needs to advance the slots from different databases in
> fast_forward. So, we need to skip this check in fast_forward mode. The analysis can
> be found in [3].
-    if (slot->data.database != MyDatabaseId)
+    /*
+     * We need to access the system tables during decoding to build the
+     * logical changes unless we are in fast_forward mode where no changes are
+     * generated.
+     */
+    if (slot->data.database != MyDatabaseId && !fast_forward)
         ereport(ERROR,

It's not clear from the comment that we need it for a slotsync worker
to advance the slots from different databases. Can this be put into
the comment? Also, specify in the comment, why this is safe?

Also, if this change is needed for only slotsync workers, why not
protect it with IsSyncingReplicationSlots()? Otherwise, it might
impact non-slotsync callers, no?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Re: On disable_cost
Next
From: Robert Haas
Date:
Subject: Re: psql not responding to SIGINT upon db reconnection