Re: Logical Replication of sequences - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Logical Replication of sequences |
Date | |
Msg-id | CAJpy0uDNQ4-X-hsSXGDwX+s9vbspu0N4t9bKRDjK95A-pcmBCw@mail.gmail.com Whole thread Raw |
In response to | Re: Logical Replication of sequences (vignesh C <vignesh21@gmail.com>) |
List | pgsql-hackers |
On Wed, Jul 9, 2025 at 4:11 PM vignesh C <vignesh21@gmail.com> wrote: > > > 3) > > SyncFetchRelationStates: > > Earlier the name was FetchTableStates. If we really want to use the > > 'Sync' keyword, we can name it FetchRelationSyncStates, as we are > > fetching sync-status only. Thoughts? > > Instead of FetchRelationSyncStates, I preferred FetchRelationStates, > and changed it to FetchRelationStates. > Okay, LGTM. > > 5) > > + if (!MyLogicalRepWorker->sequencesync_failure_time || > > + TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time, > > + now, wal_retrieve_retry_interval)) > > + { > > + MyLogicalRepWorker->sequencesync_failure_time = 0; > > + > > + logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC, > > + MyLogicalRepWorker->dbid, > > + MySubscription->oid, > > + MySubscription->name, > > + MyLogicalRepWorker->userid, > > + InvalidOid, > > + DSM_HANDLE_INVALID); > > + break; > > + } > > > > We set sequencesync_failure_time to 0, but if logicalrep_worker_launch > > is not able to launch the worker due to some reason, next time it will > > not even wait for 'wal_retrieve_retry_interval time' to attempt > > restarting it again. Is that intentional? > > > > In other workflows such as while launching table-sync or apply worker, > > this scenario does not arise. This is because we maintain start_time > > there which can never be 0 instead of failure time and before > > attempting to start the workers, we set start_time to current time. > > The seq-sync failure-time OTOH is only set to non-null in > > logicalrep_seqsyncworker_failure() and it is not necessary that we > > will hit that function as the logicalrep_worker_launch() may fail > > before that itself. Do you think we shall maintain start-time instead > > of failure-time for seq-sync worker as well? Or is there any other way > > to handle it? > > I preferred the suggestion from [1]. Modified it accordingly. Okay, works for me. > The attached v20250709 version patch has the changes for the same. > Thank You for the patches. Please find a few comments: 1) Shall we update pg_publication doc as well to indicate that pubinsert, pubupdate, pubdelete , pubtruncate, pubviaroot are meaningful only when publishing tables. For sequences, these have no meaning. 2) Shall we have walrcv_disconnect() after copy is done in LogicalRepSyncSequences() 3) Do we really need for-loop in ProcessSyncingSequencesForApply? I think this function is inspired from ProcessSyncingTablesForApply() but there we need different tablesync workers for different tables. For sequences, that is not the case and thus for-loop can be omitted. If we do so, we can amend the comments too where it says " Walk over all subscription sequences....." 4) +# Confirm that the warning for parameters differing is logged. +$node_subscriber->wait_for_log( We can drop regress_seq_sub on the publisher now and check for missing warnings as the next step. 5) I am revisiting the test given in [1], I see there is some document change as: + Incremental sequence changes are not replicated. Although the data in + serial or identity columns backed by sequences will be replicated as part + of the table, the sequences themselves do not replicate ongoing changes. + On the subscriber, a sequence will retain the last value it synchronized + from the publisher. If the subscriber is used as a read-only database, + then this should typically not be a problem. If, however, some kind of + switchover or failover to the subscriber database is intended, then the + sequences would need to be updated to the latest values, either by + executing <link linkend="sql-altersubscription-params-refresh-publication-sequences"> + <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES</command></link> + or by copying the current data from the publisher (perhaps using + <command>pg_dump</command>) or by determining a sufficiently high value + from the tables themselves. But this doc specifically mentions a failover case. It does not mention the case presented in [1] i.e. if user is trying to use sequence to populate identity column of a "subscribed" table where the sequence is also synced originally from publisher, then he may end up with corrupted state of IDENTITY column, and thus such cases should be used with caution. Please review once and see if we need to mention this and the example too. [1]: https://www.postgresql.org/message-id/CAJpy0uDK-QOULpd6x%2BisGrzwWyn16HHF0UPWqLGtOXQ-Z5M%3DyQ%40mail.gmail.com thanks Shveta
pgsql-hackers by date: