Re: synchronized_standby_slots used in logical replication - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: synchronized_standby_slots used in logical replication |
Date | |
Msg-id | CAJpy0uCL3qGPs=UKSazp0Dqre5dd=PNp9euEQLN+jgSGY=BXsA@mail.gmail.com |
In response to | synchronized_standby_slots used in logical replication (Fabrice Chapuis <fabrice636861@gmail.com>) |
Responses | Re: synchronized_standby_slots used in logical replication |
List | pgsql-hackers |
On Wed, Jun 4, 2025 at 4:01 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
>
> Hi,
>
> I'm working with logical replication in a PostgreSQL 17 setup, and I'm exploring the new synchronized_standby_slots parameter to make replication slots failover safe in a highly available environment using physical standby nodes managed by Patroni.
>
> While testing this feature, I encountered a blocking behavior, when a standby is listed in synchronized_standby_slots and that standby goes offline, logical replication on the primary stops progressing. From what I understand, the primary node waits for the standby to acknowledge received wal records, effectively stalling WAL decoding for the logical slot. I noticed that the failover slot on the standby continue to be synced.

Yes, your understanding is correct.

>
> This raises several questions about the tradeoffs and implications of using this feature:
>
> What are the risks or limitations if synchronized_standby_slots is left empty (the default)? Is there a risk of data loss or inconsistency for logical subscribers in such cases?

If the 'synchronized_standby_slots' setting is left unset, logical replication subscribers may progress ahead of the physical standby servers. In the event of a failover under such conditions, the new primary might lack the necessary data to continue supporting logical replication, even if synchronized slots are in place, resulting in unexpected behavior. Therefore, it is strongly recommended to configure 'synchronized_standby_slots' properly to ensure that all configured physical standbys have received and flushed the changes before those changes are made visible to logical replication subscribers.

> Is it expected behavior that any failure of a standby listed in synchronized_standby_slots stalls logical decoding on the primary? If so, are there any ways to avoid blocking WAL decoding while still having slot synchronization?

Yes, this is expected behavior.
It is similar to how 'synchronous_standby_names' works, where a commit on the primary is allowed to proceed only after the configured standby servers acknowledge receipt of the data. The main difference is that 'synchronous_standby_names' provides more configuration options, such as FIRST and ANY, allowing the system to wait for a subset of standbys rather than all of them. However, if none of the configured standbys are available, the primary will still wait, just as in this case, until a standby becomes available or the configuration is changed. In the future, if needed, similar flexibility (e.g., support for ANY, FIRST) could potentially be extended to 'synchronized_standby_slots' as well. For now, the way to move forward is either to update the configuration or to restore the standby to an operational state.

> Patroni is managing FO slots better than native Postgres implementation?

I'm not entirely certain about that. However, PostgreSQL does handle several complex scenarios, such as:

-- Ensuring seamless logical replication on failover by allowing users to configure potential failover candidates via synchronized_standby_slots, making synced slots ready for failover in all situations.

-- Avoiding a direct copy of a slot unless a consistent point can be reached with the new values. Otherwise, after promotion, the slots may not reach a consistent point, potentially resulting in data loss.

-- Supporting two-phase transactions for failover slots, where transactions prepared before two_phase decoding is enabled are handled correctly even if the failover occurs immediately afterward.

You may want to check with the Patroni community for more detailed insights. We're open to considering any gaps or missing functionality in PostgreSQL as well.

thanks
Shveta
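
The blocking behavior discussed in this reply can be illustrated with a small toy model. This is only a sketch under stated assumptions, not PostgreSQL's implementation: LSNs are simplified to plain integers, the function name `sendable_lsn` and the slot names are made up for illustration, and the `quorum` argument models a hypothetical ANY-style rule that 'synchronized_standby_slots' does not support today.

```python
from typing import Dict, Optional


def sendable_lsn(wal_end: int,
                 standby_flush: Dict[str, int],
                 quorum: Optional[int] = None) -> int:
    """Toy model: highest LSN that may be shown to logical subscribers.

    wal_end       -- latest LSN written on the primary (integer for simplicity)
    standby_flush -- last flushed LSN per slot listed in
                     synchronized_standby_slots; an offline standby simply
                     stays at its last confirmed value
    quorum        -- HYPOTHETICAL "ANY k" rule, not an existing PostgreSQL
                     option; None means every listed slot must confirm
    """
    if not standby_flush:
        # Empty setting (the default): nothing holds logical decoding back,
        # so subscribers can get ahead of the physical standbys.
        return wal_end

    flushed = sorted(standby_flush.values(), reverse=True)
    needed = len(flushed) if quorum is None else quorum
    if not 1 <= needed <= len(flushed):
        raise ValueError("quorum must be between 1 and the number of slots")

    # The needed-th highest flush position is confirmed by `needed` standbys.
    return min(wal_end, flushed[needed - 1])


wal_end = 500
# standby2 went offline after flushing up to 300 (hypothetical slot names).
slots = {"standby1_slot": 500, "standby2_slot": 300}

print(sendable_lsn(wal_end, slots))            # 300: stalled by the offline standby
print(sendable_lsn(wal_end, {}))               # 500: no gating when the list is empty
print(sendable_lsn(wal_end, slots, quorum=1))  # 500: hypothetical ANY 1 behavior
```

In this model the first call corresponds to the stall reported above: decoding cannot move past the position the offline standby last confirmed until that standby returns or is removed from the setting.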