Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: logical decoding and replication of sequences, take 2 |
Date | |
Msg-id | CAA4eK1L4T85q0Yx=CDEtXG=Ti+zWHMnq1KgP7YuGTy+FuGRUyA@mail.gmail.com Whole thread Raw |
In response to | Re: logical decoding and replication of sequences, take 2 (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: logical decoding and replication of sequences, take 2
|
List | pgsql-hackers |
On Mon, Mar 20, 2023 at 5:13 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 3/20/23 12:00, Amit Kapila wrote: > > On Mon, Mar 20, 2023 at 1:49 PM Tomas Vondra > > <tomas.vondra@enterprisedb.com> wrote: > >> > >> > >> I don't understand why we'd need WAL from before the slot is created, > >> which happens before copy_sequence so the sync will see a more recent > >> state (reflecting all changes up to the slot LSN). > >> > > > > Imagine the following sequence of events: > > 1. Operation on a sequence seq-1 which requires WAL. Say, this is done > > at LSN 1000. > > 2. Some other random operations on unrelated objects. This would > > increase LSN to 2000. > > 3. Create a slot that uses current LSN 2000. > > 4. Copy sequence seq-1 where you will get the LSN value as 1000. Then > > you will use LSN 1000 as a starting point to start replication in > > sequence sync worker. > > > > It is quite possible that WAL from LSN 1000 may not be present. Now, > > it may be possible that we use the slot's LSN in this case but > > currently, it may not be possible without some changes in the slot > > machinery. Even, if we somehow solve this, we have the below problem > > where we can miss some concurrent activity. > > > > I think the question is what would be the WAL-requiring operation at LSN > 1000. If it's just regular nextval(), then we *will* see it during > copy_sequence - sequences are not transactional in the MVCC sense. > > If it's an ALTER SEQUENCE, I guess it might create a new relfilenode, > and then we might fail to apply this - that'd be bad. > > I wonder if we'd allow actually discarding the WAL while building the > consistent snapshot, though. > No, as soon as we reserve the WAL location, we update the slot's minLSN (replicationSlotMinLSN) which would prevent the required WAL from being removed. > You're however right we can't just decide > this based on LSN, we'd probably need to compare the relfilenodes too or > something like that ... > > >> I think the only "issue" are the WAL records after the slot LSN, or more > >> precisely deciding which of the decoded changes to apply. > >> > >> > >>> Now, for the second idea which is to directly use > >>> pg_current_wal_insert_lsn(), I think we won't be able to ensure that > >>> the changes covered by in-progress transactions like the one with > >>> Alter Sequence I have given example would be streamed later after the > >>> initial copy. Because the LSN returned by pg_current_wal_insert_lsn() > >>> could be an LSN after the LSN associated with Alter Sequence but > >>> before the corresponding xact's commit. > >> > >> Yeah, I think you're right - the locking itself is not sufficient to > >> prevent this ordering of operations. copy_sequence would have to lock > >> the sequence exclusively, which seems bit disruptive. > >> > > > > Right, that doesn't sound like a good idea. > > > > Although, maybe we could use a less strict lock level? I mean, one that > allows nextval() to continue, but would conflict with ALTER SEQUENCE. > I don't know if that is a good idea but are you imagining a special interface/mechanism just for logical replication because as far as I can see you have used SELECT to fetch the sequence values? -- With Regards, Amit Kapila.
pgsql-hackers by date: