Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | 20220218222319.yozkbhren7vkjbi5@alap3.anarazel.de |
In response to | Re: Synchronizing slots from primary to standby (Peter Eisentraut <peter.eisentraut@enterprisedb.com>) |
Responses |
RE: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby Re: Synchronizing slots from primary to standby |
List | pgsql-hackers |
Hi,

On 2022-02-11 15:28:19 +0100, Peter Eisentraut wrote:
> On 05.02.22 20:59, Andres Freund wrote:
> > On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote:
> > > From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001
> > > From: Peter Eisentraut<peter@eisentraut.org>
> > > Date: Mon, 3 Jan 2022 14:43:36 +0100
> > > Subject: [PATCH v3] Synchronize logical replication slots from primary to
> > > standby
> >
> > I've just skimmed the patch and the related threads. As far as I can tell this
> > cannot be safely used without the conflict handling in [1], is that correct?
>
> This or similar questions have been asked a few times about this or similar
> patches, but they always come with some doubt.

I'm certain it's a problem - the only reason I couched it was that there could
have been something clever in the patch preventing problems that I missed
because I just skimmed it.

> If we think so, it would be
> useful perhaps if we could come up with test cases that would demonstrate
> why that other patch/feature is necessary. (I'm not questioning it
> personally, I'm just throwing out ideas here.)

The patch as-is just breaks one of the fundamental guarantees necessary for
logical decoding, that no row versions can be removed that are still required
for logical decoding (signalled via catalog_xmin). So there needs to be an
explicit mechanism upholding that guarantee, but there is not right now from
what I can see.

One piece of the referenced patchset is that it adds information about removed
catalog rows to a few WAL records, and then verifies during replay that no
record can be replayed that removes resources that are still needed. If such a
conflict exists it's dealt with as a recovery conflict.

That itself doesn't provide prevention against removal of required row
versions, but it provides detection. The prevention against removal can then be
done using a physical replication slot with hot standby feedback or some other
mechanism (e.g. slot syncing mechanism could maintain a "placeholder" slot on
the primary for all sync targets or something like that).

Even if that infrastructure existed / was merged, the slot sync stuff would
still need some very careful logic to protect against problems due to
concurrent WAL replay and "synchronized slot" creation. But that's doable.

Greetings,

Andres Freund
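
To make the detection step described above concrete, here is a rough sketch of the kind of check replay could perform. It is a toy model, not the code from the referenced patchset: the names `Slot`, `CleanupRecord`, `latest_removed_xid` and `conflicting_slots` are hypothetical, transaction IDs are treated as plain integers with no wraparound, and the exact comparison (`<=`) is an assumption about how "still needed" would be decided.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    """Hypothetical stand-in for a logical slot on the standby."""
    name: str
    catalog_xmin: int  # oldest xid whose catalog row versions are still needed


@dataclass
class CleanupRecord:
    """Hypothetical stand-in for a WAL record that removes row versions."""
    latest_removed_xid: int  # newest xid whose row versions the record removes
    touches_catalog: bool    # the extra information added to the WAL record


def conflicting_slots(record: CleanupRecord, slots: List[Slot]) -> List[str]:
    """Return names of slots that replaying `record` would invalidate.

    Assumption: a slot conflicts when the record removes catalog row
    versions at or after the slot's catalog_xmin. Real xid comparison
    must handle wraparound; plain integers are used here for clarity.
    """
    if not record.touches_catalog:
        return []
    return [s.name for s in slots if s.catalog_xmin <= record.latest_removed_xid]
```

A non-empty result corresponds to the case that would be handled as a recovery conflict. As noted above, this only detects the problem; keeping the primary from removing those rows in the first place needs a separate mechanism.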