Re: Clear logical slot's 'synced' flag on promotion of standby - Mailing list pgsql-hackers

From shveta malik
Subject Re: Clear logical slot's 'synced' flag on promotion of standby
Date
Msg-id CAJpy0uCqDM_AX3mL38PotB4M2ahoPYCfYeH3pT0kbYXsQ9ga4w@mail.gmail.com
In response to Re: Clear logical slot's 'synced' flag on promotion of standby  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses Re: Clear logical slot's 'synced' flag on promotion of standby
Re: Clear logical slot's 'synced' flag on promotion of standby
List pgsql-hackers
On Tue, Sep 9, 2025 at 2:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi,
>
>
> + * required resources. Clear any leftover 'synced' flags on replication
> + * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
> + * state check ensures that this code is only reached when a standby
> + * server crashes during promotion.
>   */
>   StartupReplicationSlots();
> + if (ControlFile->state == DB_IN_CRASH_RECOVERY)
>
> I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
> state. For example, if the primary is already in crash recovery and
> crashes again while in crash recovery, it will restart in the
> DB_IN_CRASH_RECOVERY state, no?
>

Yes, good point. I think we can differentiate the two cases based on
the timeline change. A regular primary won't have a timeline change,
whereas a promoted standby that failed during promotion will show a
timeline change immediately upon restart. Thoughts?
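
To make the timeline idea concrete, here is a rough standalone sketch of the check I have in mind. It is not patch code: the enum, the function name and both timeline parameters are hypothetical stand-ins, and how the "before" and "now" timelines would actually be obtained at startup is left open. It only shows the decision: reset the flags when we are in crash recovery and the timeline has moved since the last known one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for illustration; these are NOT the PostgreSQL definitions. */
typedef uint32_t SketchTimelineId;

typedef enum
{
    SKETCH_DB_IN_PRODUCTION,
    SKETCH_DB_IN_ARCHIVE_RECOVERY,
    SKETCH_DB_IN_CRASH_RECOVERY
} SketchDBState;

/*
 * Hypothetical helper: decide whether leftover 'synced' flags should be
 * cleared at startup.
 *
 * tli_before: timeline the server was on before the crash (assumed input)
 * tli_now:    timeline the server is on at restart (assumed input)
 */
bool
sketch_should_reset_synced(SketchDBState state,
                           SketchTimelineId tli_before,
                           SketchTimelineId tli_now)
{
    /* A promotion always moves to a higher timeline, never a lower one. */
    assert(tli_now >= tli_before);

    /* Only the crash-recovery state is of interest here. */
    if (state != SKETCH_DB_IN_CRASH_RECOVERY)
        return false;

    /*
     * A regular primary that crashed stays on the same timeline; a standby
     * that crashed during promotion shows a timeline change.
     */
    return tli_now != tli_before;
}
```

With this shape, a regular primary crashing repeatedly during crash recovery (same timeline both times) would skip the reset, which is the case Ashutosh pointed out.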

In the worst-case scenario, even if we end up running the Reset
function during a regular primary's crash recovery, it shouldn't cause
any harm. (That said, I'm not suggesting we shouldn't fix it).  What
concerns me more is the possibility of running it on a regular
standby, as it could disrupt slot synchronization. I attempted to
simulate a scenario where a regular standby ends up in
DB_IN_CRASH_RECOVERY after a crash, but I couldn't reproduce it. Do
you know of any situation where this could happen? The absence of
comments for these states makes it challenging to follow the flow.

> --
>
> With this change are we saying that on primary the synced flag must be
> always false. Because the postgres doc on pg_replication_slots says:
>
> "The value of this column has no meaning on the primary server; the
> column value on the primary is default false for all slots but may (if
> leftover from a promoted standby) also be true."
>

The doc needs to be changed.

thanks
Shveta


