Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Synchronizing slots from primary to standby
Msg-id 18dbc929-a281-8552-4f1d-7e4d0e4eedba@enterprisedb.com
In response to Synchronizing slots from primary to standby  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
On 31.10.21 11:08, Peter Eisentraut wrote:
> I want to reactivate $subject.  I took Petr Jelinek's patch from [0], 
> rebased it, added a bit of testing.  It basically works, but as 
> mentioned in [0], there are various issues to work out.
> 
> The idea is that the standby runs a background worker to periodically 
> fetch replication slot information from the primary.  On failover, a 
> logical subscriber would then ideally find up-to-date replication slots 
> on the new publisher and can just continue normally.

> So, again, this isn't anywhere near ready, but there is already a lot 
> here to gather feedback about how it works, how it should work, how to 
> configure it, and how it fits into an overall replication and HA 
> architecture.

Here is an updated patch.  The main change is that I added two 
configuration parameters.  The first, synchronize_slot_names, is set on 
the physical standby to specify which slots to sync from the primary. 
By default, it is empty.  (This also fixes the recovery test failures 
that I had to disable in the previous patch version.)  The second, 
standby_slot_names, is set on the primary.  It holds back logical 
replication until the listed physical standbys have caught up.  That 
way, when failover is necessary, the promoted standby is not behind the 
logical replication consumers.
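
To make the hold-back rule concrete, here is a minimal sketch in Python.  It is not code from the patch.  It assumes that each slot listed in standby_slot_names reports a confirmed position as an LSN string, and that a logical change may be sent only once every listed slot has confirmed at least that position.  The function names and the slot names s1 and s2 are hypothetical.

```python
from typing import Dict, Iterable, Optional


def parse_lsn(text: str) -> int:
    """Turn an LSN written as 'hi/lo' in hex (e.g. '0/3000060') into an int."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def logical_send_limit(standby_slot_names: Iterable[str],
                       confirmed: Dict[str, str]) -> Optional[int]:
    """Return the highest LSN that logical consumers may be sent.

    Assumption: the limit is the minimum position confirmed by the
    listed physical standby slots.  An empty list means no hold-back.
    """
    names = list(standby_slot_names)
    if not names:
        return None
    return min(parse_lsn(confirmed[name]) for name in names)


def may_send(change_lsn: str, limit: Optional[int]) -> bool:
    """True if a change ending at change_lsn is safe to send."""
    return limit is None or parse_lsn(change_lsn) <= limit


# Hypothetical state: s2 lags slightly behind s1.
confirmed = {"s1": "0/3000060", "s2": "0/3000028"}
limit = logical_send_limit(["s1", "s2"], confirmed)
```

With this state, a change ending at 0/3000028 could be sent, while one ending at 0/3000060 would wait until s2 catches up.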

In principle, this works now, I think.  I haven't made much progress in 
creating more test cases for this; that's something that needs more 
attention.

It's worth pondering what the configuration language for 
standby_slot_names should be.  Right now, it's just a list of slots that 
all need to be caught up.  More complicated setups are conceivable. 
Maybe you have standbys S1 and S2 that are potential failover targets 
for logical replication consumers L1 and L2, and also standbys S3 and S4 
that are potential failover targets for logical replication consumers L3 
and L4.  Viewed like that, this could be a per-slot setting rather than 
one global list.  The setting might also have some relationship with 
synchronous_standby_names.  Like, if you have synchronous_standby_names 
set, then that's a pretty good indication that you also want some or all 
of those standbys in standby_slot_names.  (But note that one lists slot 
names and the other lists application names.)  So there is a variety of 
possibilities.
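
The difference between a single list and a per-consumer grouping can be sketched as follows.  This is only an illustration of the S1..S4 / L1..L4 example above, not a proposed syntax.  Positions are plain integers for brevity, and all names and numbers are made up.

```python
from typing import Dict, List


def send_limit(standbys: List[str], confirmed: Dict[str, int]) -> int:
    """Minimum position confirmed by the given standby slots."""
    return min(confirmed[s] for s in standbys)


# Hypothetical confirmed positions; s3 and s4 lag well behind s1 and s2.
confirmed = {"s1": 500, "s2": 480, "s3": 300, "s4": 310}

# Current behavior: one list, and every listed slot must have caught up.
global_list = ["s1", "s2", "s3", "s4"]
global_limit = send_limit(global_list, confirmed)

# Hypothetical alternative: each logical consumer names its own
# failover targets, so it waits only for those standbys.
per_consumer = {
    "l1": ["s1", "s2"],
    "l2": ["s1", "s2"],
    "l3": ["s3", "s4"],
    "l4": ["s3", "s4"],
}
per_consumer_limits = {
    consumer: send_limit(standbys, confirmed)
    for consumer, standbys in per_consumer.items()
}
```

In this example the global list holds every consumer back to 300, whereas with the grouping l1 and l2 could proceed to 480 and only l3 and l4 would wait at 300.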
