Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAFiTN-v8GXcxaQm1H_cdfnx35rRsYTW41Ms8wo4t9XyOg_7EPQ@mail.gmail.com
Whole thread Raw
In response to Re: Conflict detection for update_deleted in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Conflict detection for update_deleted in logical replication
List pgsql-hackers
On Mon, Aug 18, 2025 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

> > Given that max_conflict_retention_duration works only when
> > retain_dead_tuples is enabled, why not merge these two parameters? For
> > example, setting max_conflict_retention_duration=-1 means to disable
> > retaining dead tuples behavior and =0 means that dead tuples are
> > retained until they are no longer needed for detection purposes.
> >
>
> I think it can be a source of confusion for users, if not now, then in
> the future. Consider that in the future, we also have a GUC to set the
> retention duration which will be used for all subscriptions. Now, say,
> we define the behaviour such that if this value is set for
> subscription, then use that, otherwise, use the GUC value. Then, with
> this proposal, if the user sets max_conflict_retention_duration=0, it
> will lead to retaining tuples until they are no longer needed but with
> the behaviour proposed in patch, one could have simply set
> retain_dead_tuples=true and used the GUC value. I understand that it
> is debatable how we will design the GUC behaviour in future but I used
> it as an example how trying to achieve different things with one
> option can be a source of confusion. Even if we decide not to
> introduce GUC or define its behaviour differently, I find having
> different options in this case is easy to understand and use.

I agree that merging these 2 will create confusion and usability issues.

>
> > ---
> > +       /*
> > +        * Only the leader apply worker manages conflict retention (see
> > +        * maybe_advance_nonremovable_xid() for details).
> > +        */
> > +       if (!isParallelApplyWorker(&worker) && !isTablesyncWorker(&worker))
> > +           values[10] = TransactionIdIsValid(worker.oldest_nonremovable_xid);
> > +       else
> > +           nulls[10] = true;
> >
> > I think that adding a new column to the pg_stat_subscription view
> > should be implemented in a separate patch since it needs to bump the
> > catalog version while introducing max_conflict_retention_duration
> > subscription option doesn't require that.
> >
>
> Won't following change in pg_subsciption.h anyway requires us to bump
> catversion?
> @@ -81,6 +81,10 @@
> CATALOG(pg_subscription,6100,SubscriptionRelationId)
> BKI_SHARED_RELATION BKI_ROW
>   bool subretaindeadtuples; /* True if dead tuples useful for
>   * conflict detection are retained */
>
> + int32 maxconflretention; /* The maximum duration (in milliseconds)
> + * for which information useful for
> + * conflict detection can be retained */
> +
>
> > ---
> > Even if an apply worker disables retaining dead tuples due to
> > max_conflict_retention_duration, it enables again after the server
> > restarts.
> >
>
> I also find this behaviour questionable because this also means that
> it is possible that before restart one would deduce that the
> update_deleted conflict won't be reliably detected for a particular
> subscription but after restart it could lead to the opposite
> conclusion. But note that to make it behave similarly we need to store
> this value persistently in pg_subscription unless you have better
> ideas for this. Theoretically, there are two places where we can
> persist this information, one is with pg_subscription, and other in
> origin. I find it is closer to pg_subscription.

I think it makes sense to store this in pg_subscription to preserve
the decision across restart.

--
Regards,
Dilip Kumar
Google



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: memory leak in logical WAL sender with pgoutput's cachectx
Next
From: Chao Li
Date:
Subject: Re: GB18030-2022 Support in PostgreSQL