Home > mailing lists

Re: Logical Replication of sequences - Mailing list pgsql-hackers

From	shveta malik
Subject	Re: Logical Replication of sequences
Date	July 11, 2025 11:56:28
Msg-id	CAJpy0uDNQ4-X-hsSXGDwX+s9vbspu0N4t9bKRDjK95A-pcmBCw@mail.gmail.com Whole thread
In response to	Re: Logical Replication of sequences (vignesh C <vignesh21@gmail.com>)
Responses	Re: Logical Replication of sequences
List	pgsql-hackers

Tree view

On Wed, Jul 9, 2025 at 4:11 PM vignesh C <vignesh21@gmail.com> wrote:

>
> > 3)
> > SyncFetchRelationStates:
> > Earlier the name was FetchTableStates. If we really want to  use the
> > 'Sync' keyword, we can name it FetchRelationSyncStates, as we are
> > fetching sync-status only. Thoughts?
>
> Instead of FetchRelationSyncStates, I preferred FetchRelationStates,
> and changed it to FetchRelationStates.
>

Okay, LGTM.

> > 5)
> > + if (!MyLogicalRepWorker->sequencesync_failure_time ||
> > + TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time,
> > +    now, wal_retrieve_retry_interval))
> > + {
> > + MyLogicalRepWorker->sequencesync_failure_time = 0;
> > +
> > + logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC,
> > + MyLogicalRepWorker->dbid,
> > + MySubscription->oid,
> > + MySubscription->name,
> > + MyLogicalRepWorker->userid,
> > + InvalidOid,
> > + DSM_HANDLE_INVALID);
> > + break;
> > + }
> >
> > We set sequencesync_failure_time to 0, but if logicalrep_worker_launch
> > is not able to launch the worker due to some reason, next time it will
> > not even wait for 'wal_retrieve_retry_interval time' to attempt
> > restarting it again. Is that intentional?
> >
> > In other workflows such as while launching table-sync or apply worker,
> >  this scenario does not arise. This is because we maintain start_time
> > there which can never be 0 instead of failure time and before
> > attempting to start the workers, we set start_time to current time.
> > The seq-sync failure-time OTOH is only set to non-null in
> > logicalrep_seqsyncworker_failure() and it is not necessary that we
> > will hit that function as the  logicalrep_worker_launch() may fail
> > before that itself. Do you think we shall maintain start-time instead
> > of failure-time for seq-sync worker as well? Or is there any other way
> > to handle it?
>
> I preferred the suggestion from [1]. Modified it accordingly.

Okay, works for me.

> The attached v20250709 version patch has the changes for the same.
>

Thank You for the patches. Please find a few comments:

1)
Shall we update pg_publication doc as well to indicate that pubinsert,
pubupdate, pubdelete , pubtruncate, pubviaroot are meaningful only
when publishing tables. For sequences, these have no meaning.

2)
Shall we have walrcv_disconnect() after copy is done in
LogicalRepSyncSequences()

3)
Do we really need for-loop in ProcessSyncingSequencesForApply? I think
this function is inspired from ProcessSyncingTablesForApply() but
there we need different tablesync workers for different tables. For
sequences, that is not the case and thus for-loop can be omitted. If
we do so, we can amend the comments too where it says " Walk over all
subscription sequences....."

4)
+# Confirm that the warning for parameters differing is logged.
+$node_subscriber->wait_for_log(

We can drop regress_seq_sub on the publisher now and check for missing
warnings as the next step.


5)
I am revisiting the test given in [1], I see there is some document change as:

+     Incremental sequence changes are not replicated.  Although the data in
+     serial or identity columns backed by sequences will be replicated as part
+     of the table, the sequences themselves do not replicate ongoing changes.
+     On the subscriber, a sequence will retain the last value it synchronized
+     from the publisher. If the subscriber is used as a read-only database,
+     then this should typically not be a problem.  If, however, some kind of
+     switchover or failover to the subscriber database is intended, then the
+     sequences would need to be updated to the latest values, either by
+     executing <link
linkend="sql-altersubscription-params-refresh-publication-sequences">
+     <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION
SEQUENCES</command></link>
+     or by copying the current data from the publisher (perhaps using
+     <command>pg_dump</command>) or by determining a sufficiently high value
+     from the tables themselves.

But this doc specifically mentions a failover case. It does not
mention the case presented in [1] i.e. if user is trying to use
sequence to populate identity column of a "subscribed" table where the
sequence is also synced originally from publisher, then he may end up
with corrupted state
of IDENTITY column, and thus such cases should be used with caution.
Please review once and see if we need to mention this and the example
too.

[1]: https://www.postgresql.org/message-id/CAJpy0uDK-QOULpd6x%2BisGrzwWyn16HHF0UPWqLGtOXQ-Z5M%3DyQ%40mail.gmail.com

thanks
Shveta

pgsql-hackers by date:

From: Amit Kapila
Date: 11 July 2025, 11:52:39
Subject: Re: Logical replication prefetch

From: "Robin Haberkorn"
Date: 11 July 2025, 12:15:56
Subject: Re: Adding error messages to a few slash commands

Re: Logical Replication of sequences - Mailing list pgsql-hackers

Previous

Next