Re: Logical Replication slot disappeared after promote Standby - Mailing list pgsql-hackers

From: Perumal Raj
Subject: Re: Logical Replication slot disappeared after promote Standby
Msg-id: CALvqh4rBzZncrtZ+wHX7VnYhSTc-O=pjqriHQhPDmDjc=Z9R8g@mail.gmail.com
In response to: Re: Logical Replication slot disappeared after promote Standby (shveta malik <shveta.malik@gmail.com>)
List: pgsql-hackers
2025-06-13 06:20:30.146 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:20:30.146 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
[the same LOG/DETAIL pair repeats every 30 seconds from 06:20:30 through 06:25:30]
2025-06-13 06:25:30.452 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:25:30.452 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
It appears that my Debezium connectors had stopped consuming data, leaving the source slot with a stale restart_lsn of 0/6D0000B8.
In contrast, the new replica's slot has a restart_lsn matching the primary's most recent confirmed_flush_lsn, indicating it is up to date.
As soon as I recreated the replication slot on the primary, it synchronized to the new replica as a persistent slot (temporary = false).
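For reference, the recreation step can be sketched roughly as follows (a hypothetical reconstruction, not the exact commands from the thread; run on the primary, using the slot and plugin names shown above, and assuming PostgreSQL 17, where pg_create_logical_replication_slot takes twophase and failover as its fourth and fifth arguments):

```sql
-- On the primary: drop the stale slot and recreate it with failover = true,
-- so automatic slot synchronization picks it up again on the standby.
SELECT pg_drop_replication_slot('kafka_logical_slot');
SELECT pg_create_logical_replication_slot(
    'kafka_logical_slot',  -- slot_name
    'pgoutput',            -- output plugin
    false,                 -- temporary
    false,                 -- twophase
    true                   -- failover (PostgreSQL 17+)
);
```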
2025-06-13 06:26:00.484 UTC [277861] LOG: dropped replication slot "kafka_logical_slot" of database with OID 16384
2025-06-13 06:26:30.520 UTC [277861] LOG: starting logical decoding for slot "kafka_logical_slot"
2025-06-13 06:26:30.520 UTC [277861] DETAIL: Streaming transactions committing after 0/0, reading WAL from 0/76003140.
2025-06-13 06:26:30.520 UTC [277861] LOG: logical decoding found consistent point at 0/76003140
2025-06-13 06:26:30.520 UTC [277861] DETAIL: There are no running transactions.
2025-06-13 06:26:30.526 UTC [277861] LOG: newly created replication slot "kafka_logical_slot" is sync-ready now
2025-06-13 06:35:39.212 UTC [277857] LOG: restartpoint starting: time
2025-06-13 06:35:42.022 UTC [277857] LOG: restartpoint complete: wrote 29 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.805 s, sync=0.002 s, total=2.810 s; sync files=26, longest=0.002 s, average=0.001 s; distance=16496 kB, estimate=16496 kB; lsn=0/7701F480, redo lsn=0/7701F428
2025-06-13 06:35:42.022 UTC [277857] LOG: recovery restart point at 0/7701F428
2025-06-13 06:35:42.022 UTC [277857] DETAIL: Last completed transaction was at log time 2025-06-13 06:33:31.675341+00.
Until synchronization completes, the slot is marked as temporary = true, as you mentioned.
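A simple way to watch a synced slot flip from temporary to persistent is to poll the pg_replication_slots view (a sketch; run on the standby where slot synchronization is enabled):

```sql
-- On the standby: a synced slot stays temporary until it is sync-ready,
-- after which it becomes persistent (temporary = f).
SELECT slot_name, synced, temporary, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE synced;
```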
On Fri, Jun 13, 2025 at 6:23 AM Perumal Raj <perucinci@gmail.com> wrote:
>
> Hi Hou zj
>
> I have found a strange issue, but I am not sure if I am doing anything wrong.
>
> I am able to see the logical slot on the STANDBY even after promote. 👏
Good to know.
>
> Importantly, the logical replication slot is persistent on STANDBYs that had already established a connection with the primary before the logical replication slot was created.
>
> But if I create a new replica (connected directly to the primary) after the logical replication slot is created, it is not persistent (temporary = true).
>
>
> New Replica :
> node | slot_name | slot_type | temporary | active | plugin | database | failover | synced | restart_lsn | confirmed_flush_lsn | inactive_since
> ----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+------------------------------
> stand-by | kafka_logical_slot | logical | t | t | pgoutput | replica_test | t | t | 0/6C000000 | | 2025-06-13 00:43:15.61492+00
>
> Old Replica ,
> node | slot_name | slot_type | temporary | active | plugin | database | failover | synced | restart_lsn | confirmed_flush_lsn | inactive_since
> ----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+-------------------------------
> stand-by | kafka_logical_slot | logical | f | f | pgoutput | replica_test | t | t | 0/6D000060 | 0/6D000098 | 2025-06-13 00:45:11.547671+00
>
>
> Not sure if any prerequisite is missing in my test environment, or if this is a limitation.
It may be that the slot is not yet sync-ready (and thus not
persisted) on the new replica because the primary has older values of
xmin and lsn. We do not allow persisting a synced slot if the
WAL or catalog rows required by the slot have been removed, or are at
risk of removal, on the standby. The slot will be persisted in the next
few cycles of automatic slot synchronization, once it is ensured that the
source slot's values are safe to sync to the standby. But to
confirm my diagnosis, please provide this information:
1)
Output of this query on both primary and new-replica (where slot is temporary)
select slot_name, failover, synced, temporary, catalog_xmin,
restart_lsn, confirmed_flush_lsn from pg_replication_slots;
2)
Please check the logs on the new replica for the presence of:
LOG: could not synchronize replication slot "kafka_logical_slot".
If found, please provide us with both the LOG and DETAIL messages
dumped in the log file.
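As a side note, rather than waiting for the next automatic synchronization cycle, a sync attempt can also be triggered by hand (a sketch; pg_sync_replication_slots() exists on PostgreSQL 17+ and must be run on a standby with primary_conninfo and hot_standby_feedback configured):

```sql
-- On the standby: attempt to synchronize failover slots from the
-- primary immediately. A slot that is still behind will produce the same
-- "remote slot precedes local slot" message seen in the logs above.
SELECT pg_sync_replication_slots();
```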
thanks
Shveta