Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: logical decoding and replication of sequences, take 2 |
Date | |
Msg-id | 9386d80a-ca2d-2d10-9617-d83341b7d3c3@enterprisedb.com |
In response to | Re: logical decoding and replication of sequences, take 2 (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: logical decoding and replication of sequences, take 2 |
List | pgsql-hackers |
On 3/20/23 04:42, Amit Kapila wrote:
> On Sat, Mar 18, 2023 at 8:49 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 3/18/23 06:35, Amit Kapila wrote:
>>> On Sat, Mar 18, 2023 at 3:13 AM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>> ...
>>>>
>>>> Clearly, for sequences we can't quite rely on snapshots/slots, we need
>>>> to get the LSN to decide what changes to apply/skip from somewhere else.
>>>> I wonder if we can just ignore the queued changes in tablesync, but I
>>>> guess not - there can be queued increments after reading the sequence
>>>> state, and we need to apply those. But maybe we could use the page LSN
>>>> from the relfilenode - that should be the LSN of the last WAL record.
>>>>
>>>> Or maybe we could simply add pg_current_wal_insert_lsn() into the SQL we
>>>> use to read the sequence state ...
>>>>
>>>
>>> What if some Alter Sequence is performed before the copy starts and
>>> after the copy is finished, the containing transaction rolled back?
>>> Won't it copy something which shouldn't have been copied?
>>>
>>
>> That shouldn't be possible - the alter creates a new relfilenode and
>> it's invisible until commit. So either it gets committed (and then
>> replicated), or it remains invisible to the SELECT during sync.
>>
>
> Okay, however, we need to ensure that such a change will later be
> replicated and also need to ensure that the required WAL doesn't get
> removed.
>
> Say, if we use your first idea of page LSN from the relfilenode, then
> how do we ensure that the corresponding WAL doesn't get removed when
> later the sync worker tries to start replication from that LSN? I am
> imagining here the sync_sequence_slot will be created before
> copy_sequence but even then it is possible that the sequence has not
> been updated for a long time and the LSN location will be in the past
> (as compared to the slot's LSN) which means the corresponding WAL
> could be removed. Now, here we can't directly start using the slot's
> LSN to stream changes because there is no correlation of it with the
> LSN (page LSN of sequence's relfilenode) where we want to start
> streaming.
>

I don't understand why we'd need WAL from before the slot is created.
The slot is created before copy_sequence, so the sync will see a more
recent state (reflecting all changes up to the slot LSN).

I think the only "issue" is the WAL records after the slot LSN, or more
precisely deciding which of the decoded changes to apply.

> Now, for the second idea which is to directly use
> pg_current_wal_insert_lsn(), I think we won't be able to ensure that
> the changes covered by in-progress transactions like the one with
> Alter Sequence I have given example would be streamed later after the
> initial copy. Because the LSN returned by pg_current_wal_insert_lsn()
> could be an LSN after the LSN associated with Alter Sequence but
> before the corresponding xact's commit.

Yeah, I think you're right - the locking itself is not sufficient to
prevent this ordering of operations. copy_sequence would have to lock
the sequence exclusively, which seems a bit disruptive.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
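
For readers following the thread, the ordering problem with pg_current_wal_insert_lsn() can be sketched with a toy model. This is not PostgreSQL code and not the patch's logic: the Change type, the integer LSNs, and both filter functions are hypothetical, chosen only to show an LSN captured during the sync falling between the Alter Sequence record and its commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Toy stand-in for a decoded sequence change (hypothetical type)."""
    lsn: int         # LSN of the WAL record for the change itself
    commit_lsn: int  # LSN of the commit record of its transaction
    desc: str


def apply_by_record_lsn(changes, sync_lsn):
    """Keep changes whose own WAL record lies after the LSN read during sync."""
    return [c.desc for c in changes if c.lsn > sync_lsn]


def apply_by_commit_lsn(changes, sync_lsn):
    """Keep changes whose transaction committed after the LSN read during sync."""
    return [c.desc for c in changes if c.commit_lsn > sync_lsn]


# Made-up LSNs: the ALTER is written at 100, the sync reads the sequence
# and captures LSN 150 (the ALTER is still invisible), the commit is at 200.
SYNC_LSN = 150
changes = [
    Change(lsn=100, commit_lsn=200, desc="ALTER SEQUENCE"),
    Change(lsn=180, commit_lsn=180, desc="nextval increment"),
]

by_record = apply_by_record_lsn(changes, SYNC_LSN)
by_commit = apply_by_commit_lsn(changes, SYNC_LSN)

print("filter on record LSN:", by_record)
print("filter on commit LSN:", by_commit)
```

In this model, filtering on the record LSN drops the Alter Sequence even though the copy never saw it, while filtering on the commit LSN keeps it. Whether the real apply path can filter that way is exactly what the thread is discussing.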