On Wed, Jul 30, 2025 at 12:00 AM Doruk Yilmaz <doruk@mixrank.com> wrote:
>
> On Mon, Jul 29, 2025 at 8:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > That is true but I still feel there has to be some mechanism where we
> > can catch and give an ERROR to the user, if it doesn't follow the
> > same. For example, pg_replication_origin_advance() always allows going
> > backwards in terms of LSN which means if one doesn't follow commit
> > order, it can lead to breaking the replication as after restart the
> > client can ask to start replication from some prior point.
> If you have any ideas for safeguards or API changes, I'd be happy to
> help implement them or discuss them.
> > Can you tell us the use case? Did you also intend to use it for
> > parallel apply, if so, can you also tell at a high level, how you
> > are planning to manage origin?
> Yes, we use it for parallel apply. We have a custom logical
> replication system that applies changes using multiple worker
> processes, each with their own database connection.
> Our use case requires multiple connections to be able to advance the
> same replication origin.
>
How do you advance the origin? Did you use
pg_replication_origin_advance()? If so, you should be aware that it
should be used only for initial setup; see the comment in that API's code: "Can't
sensibly pass a local commit to be flushed at checkpoint - this xact
hasn't committed yet. This is why this function should be used to set
up the initial replication state, but not for replay." If you are
using pg_replication_origin_advance(), doesn't its current
implementation have the potential to cause a problem for your use case?
I think the problem is that a transaction may be missed after a
restart, because the origin's remote_lsn can be advanced without the
corresponding local transaction (local_lsn) having been flushed on the
subscriber. Ideally, a transaction that was not successfully flushed
should be replayed after restart.
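
To make the failure mode concrete, here is a toy model of the scenario
described above. It is not PostgreSQL code: ToySubscriber and all of
its methods are hypothetical names, and LSNs are plain integers. The
only assumption it encodes is that the origin's progress can become
durable independently of the applied transaction.

```python
from dataclasses import dataclass, field


@dataclass
class ToySubscriber:
    """Hypothetical model of a subscriber; not a PostgreSQL API."""
    flushed: list = field(default_factory=list)    # remote LSNs whose local commit is durable
    unflushed: list = field(default_factory=list)  # applied, but local commit not yet flushed
    origin_remote_lsn: int = 0                     # origin progress that survives a restart

    def apply(self, remote_lsn: int) -> None:
        # The transaction is applied, but its local commit is not flushed yet.
        self.unflushed.append(remote_lsn)

    def advance_origin(self, remote_lsn: int) -> None:
        # Assumption: progress is recorded with no local_lsn to wait for,
        # so it becomes durable regardless of the transaction's own flush.
        self.origin_remote_lsn = remote_lsn

    def crash_and_restart(self) -> int:
        # Unflushed work is lost; the recorded origin progress is not.
        self.unflushed.clear()
        return self.origin_remote_lsn


def simulate() -> tuple:
    sub = ToySubscriber()
    sub.apply(100)            # apply the transaction that ends at remote LSN 100
    sub.advance_origin(100)   # advance the origin before the local flush
    restart_from = sub.crash_and_restart()
    return restart_from, sub.flushed


restart_from, flushed = simulate()
print(restart_from, flushed)  # prints: 100 []
```

After the restart the client asks to resume from 100, yet nothing was
flushed locally, so in this model the transaction is never applied.
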
In general, I was thinking of adding a restriction to
pg_replication_origin_advance() such that it raises an ERROR when a
user tries to move remote_lsn backward, unless explicitly requested.
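
As a rough sketch of the intended behavior only, the check could look
like the following. This models the idea in Python rather than
proposing actual code; ToyOrigin, advance() and the allow_backward flag
are hypothetical names, and the real function has no such parameter
today. parse_lsn() follows the usual pg_lsn text form of two
hexadecimal numbers separated by a slash.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal text form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


class ToyOrigin:
    """Hypothetical stand-in for one replication origin's progress."""

    def __init__(self, remote_lsn: str = "0/0") -> None:
        self.remote_lsn = parse_lsn(remote_lsn)

    def advance(self, new_lsn: str, allow_backward: bool = False) -> None:
        new = parse_lsn(new_lsn)
        # Proposed restriction: refuse to go backward unless the caller
        # asks for it explicitly.
        if new < self.remote_lsn and not allow_backward:
            raise ValueError(
                f"cannot move remote_lsn backward to {new_lsn} "
                "without an explicit request"
            )
        self.remote_lsn = new


origin = ToyOrigin("0/3000000")
origin.advance("0/4000000")            # forward: allowed
try:
    origin.advance("0/2000000")        # backward: rejected by default
except ValueError as err:
    print("ERROR:", err)
origin.advance("0/2000000", allow_backward=True)  # explicit request: allowed
```
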
It would be good to know the opinion of others involved in the
original change of maintaining commit order for parallel apply of
large transactions.
--
With Regards,
Amit Kapila.