Re: Logical Replication slot disappeared after promote Standby - Mailing list pgsql-hackers

From Perumal Raj
Subject Re: Logical Replication slot disappeared after promote Standby
Date
Msg-id CALvqh4rBzZncrtZ+wHX7VnYhSTc-O=pjqriHQhPDmDjc=Z9R8g@mail.gmail.com
In response to Re: Logical Replication slot disappeared after promote Standby  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
Yes Shveta!

I can see the following message repeated in the new replica's log.

2025-06-13 06:20:30.146 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:20:30.146 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:21:00.176 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:21:00.176 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:21:30.207 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:21:30.207 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:22:00.238 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:22:00.238 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:22:30.268 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:22:30.268 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:23:00.299 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:23:00.299 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:23:30.329 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:23:30.329 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:24:00.360 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:24:00.360 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:24:30.391 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:24:30.391 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:25:00.421 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:25:00.421 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:25:30.452 UTC [277861] LOG:  could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:25:30.452 UTC [277861] DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.

It appears that my Debezium connectors have stopped consuming data, resulting in an outdated restart_lsn of "0/6D0000B8".

In contrast, the New_replica has a restart_lsn that matches the primary server's most recent confirmed_flush_lsn, indicating it is up to date.
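
To make the gap in the DETAIL lines above concrete, here is a minimal sketch that parses one such line and measures how far the remote (primary) slot is behind the local (standby) slot. The helper names `lsn_to_int` and `remote_lag` are illustrative and not part of PostgreSQL; the regular expression simply follows the message format shown in my log, and the xmin comparison is a plain integer comparison that ignores transaction ID wraparound.

```python
import re

# Pattern follows the DETAIL text seen in the log above (an assumption:
# other PostgreSQL versions may word this message differently).
DETAIL_RE = re.compile(
    r"remote slot has LSN (?P<remote_lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+) "
    r"and catalog xmin (?P<remote_xmin>\d+), "
    r"but the local slot has LSN (?P<local_lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+) "
    r"and catalog xmin (?P<local_xmin>\d+)"
)


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/6D0000B8' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def remote_lag(detail: str) -> dict:
    """Report how far the remote slot trails the local slot."""
    m = DETAIL_RE.search(detail)
    if m is None:
        raise ValueError("not a slot-sync DETAIL line")
    remote_lsn = lsn_to_int(m["remote_lsn"])
    local_lsn = lsn_to_int(m["local_lsn"])
    remote_xmin = int(m["remote_xmin"])
    local_xmin = int(m["local_xmin"])
    return {
        "lsn_gap_bytes": local_lsn - remote_lsn,
        "xmin_gap": local_xmin - remote_xmin,
        # Simplified check: wraparound of transaction IDs is not handled.
        "remote_precedes_local": remote_lsn < local_lsn or remote_xmin < local_xmin,
    }


line = (
    "DETAIL:  The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, "
    "but the local slot has LSN 0/6F000000 and catalog xmin 1088."
)
print(remote_lag(line))
```

For the values in my log, the LSN gap comes out to 33554248 bytes (just under 32 MiB) and the catalog xmin gap to 3, so the remote slot is behind the local one on both counts.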

As soon as I recreated that replication slot, it synced to New_Replica (temporary=false).

2025-06-13 06:26:00.484 UTC [277861] LOG:  dropped replication slot "kafka_logical_slot" of database with OID 16384
2025-06-13 06:26:30.520 UTC [277861] LOG:  starting logical decoding for slot "kafka_logical_slot"
2025-06-13 06:26:30.520 UTC [277861] DETAIL:  Streaming transactions committing after 0/0, reading WAL from 0/76003140.
2025-06-13 06:26:30.520 UTC [277861] LOG:  logical decoding found consistent point at 0/76003140
2025-06-13 06:26:30.520 UTC [277861] DETAIL:  There are no running transactions.
2025-06-13 06:26:30.526 UTC [277861] LOG:  newly created replication slot "kafka_logical_slot" is sync-ready now
2025-06-13 06:35:39.212 UTC [277857] LOG:  restartpoint starting: time
2025-06-13 06:35:42.022 UTC [277857] LOG:  restartpoint complete: wrote 29 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.805 s, sync=0.002 s, total=2.810 s; sync files=26, longest=0.002 s, average=0.001 s; distance=16496 kB, estimate=16496 kB; lsn=0/7701F480, redo lsn=0/7701F428
2025-06-13 06:35:42.022 UTC [277857] LOG:  recovery restart point at 0/7701F428
2025-06-13 06:35:42.022 UTC [277857] DETAIL:  Last completed transaction was at log time 2025-06-13 06:33:31.675341+00.

Until the synchronization is complete, the slot type is marked as temporary=true, as you mentioned.

Is there any manual way to advance the restart_lsn of a logical replication slot, so that slot synchronization can complete?
 
Thanks,

On Thu, Jun 12, 2025 at 8:49 PM shveta malik <shveta.malik@gmail.com> wrote:
On Fri, Jun 13, 2025 at 6:23 AM Perumal Raj <perucinci@gmail.com> wrote:
>
> Hi Hou zj
>
> I have found a strange issue, but I am not sure if I am doing anything wrong.
>
> I am able to see logical slot at STANDBY even after promote. 👏

Good to know.

>
> Importantly, the logical replication slot is persistent on STANDBYs that had already established a connection with the Primary before the logical replication slot was created.
>
> But if I create a new replica (direct to Primary) after the logical replication slot is created, then it is not persistent (temporary=true).
>
>
> New Replica :
>    node   |     slot_name      | slot_type | temporary | active |  plugin  |   database    | failover | synced | restart_lsn | confirmed_flush_lsn |        inactive_since
> ----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+------------------------------
>  stand-by | kafka_logical_slot | logical   | t         | t      | pgoutput | replica_test | t        | t      | 0/6C000000  |                     | 2025-06-13 00:43:15.61492+00
>
> Old Replica :
>    node   |     slot_name      | slot_type | temporary | active |  plugin  |   database    | failover | synced | restart_lsn | confirmed_flush_lsn |        inactive_since
> ----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+-------------------------------
>  stand-by | kafka_logical_slot | logical   | f         | f      | pgoutput | replica_test | t        | t      | 0/6D000060  | 0/6D000098          | 2025-06-13 00:45:11.547671+00
>
>
> Not sure if any prerequisite is missing in my test environment, or if this is a limitation.

It is possible that the slot is not sync-ready yet (and thus
not persisted) on new-replica because the primary has older values of
xmin and lsn.  We do not allow persisting a synced slot if the
required WAL or catalog rows for this slot have been removed or are at
risk of removal on standby. The slot will be persisted in the next
few cycles of automatic slot-synchronization, once it is ensured that
the source slot's values are safe to be synced to the standby. But to
confirm my diagnosis, please provide this information:

1)
Output of this query on both primary and new-replica (where slot is temporary)
select slot_name, failover, synced, temporary, catalog_xmin,
restart_lsn, confirmed_flush_lsn from pg_replication_slots;

2)
Please check logs on new-replica to see the presence of log:
LOG: could not synchronize replication slot "kafka_logical_slot".

If found, please provide us with both the LOG and DETAIL messages
dumped in the log file.

thanks
Shveta
