Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Conflict detection for update_deleted in logical replication |
Date | |
Msg-id | CAFiTN-v8GXcxaQm1H_cdfnx35rRsYTW41Ms8wo4t9XyOg_7EPQ@mail.gmail.com Whole thread Raw |
In response to | Re: Conflict detection for update_deleted in logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
RE: Conflict detection for update_deleted in logical replication
|
List | pgsql-hackers |
On Mon, Aug 18, 2025 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Given that max_conflict_retention_duration works only when > > retain_dead_tuples is enabled, why not merge these two parameters? For > > example, setting max_conflict_retention_duration=-1 means to disable > > retaining dead tuples behavior and =0 means that dead tuples are > > retained until they are no longer needed for detection purposes. > > > > I think it can be a source of confusion for users, if not now, then in > the future. Consider that in the future, we also have a GUC to set the > retention duration which will be used for all subscriptions. Now, say, > we define the behaviour such that if this value is set for > subscription, then use that, otherwise, use the GUC value. Then, with > this proposal, if the user sets max_conflict_retention_duration=0, it > will lead to retaining tuples until they are no longer needed but with > the behaviour proposed in patch, one could have simply set > retain_dead_tuples=true and used the GUC value. I understand that it > is debatable how we will design the GUC behaviour in future but I used > it as an example how trying to achieve different things with one > option can be a source of confusion. Even if we decide not to > introduce GUC or define its behaviour differently, I find having > different options in this case is easy to understand and use. I agree that merging these 2 will create confusion and usability issues. > > > --- > > + /* > > + * Only the leader apply worker manages conflict retention (see > > + * maybe_advance_nonremovable_xid() for details). > > + */ > > + if (!isParallelApplyWorker(&worker) && !isTablesyncWorker(&worker)) > > + values[10] = TransactionIdIsValid(worker.oldest_nonremovable_xid); > > + else > > + nulls[10] = true; > > > > I think that adding a new column to the pg_stat_subscription view > > should be implemented in a separate patch since it needs to bump the > > catalog version while introducing max_conflict_retention_duration > > subscription option doesn't require that. > > > > Won't following change in pg_subsciption.h anyway requires us to bump > catversion? > @@ -81,6 +81,10 @@ > CATALOG(pg_subscription,6100,SubscriptionRelationId) > BKI_SHARED_RELATION BKI_ROW > bool subretaindeadtuples; /* True if dead tuples useful for > * conflict detection are retained */ > > + int32 maxconflretention; /* The maximum duration (in milliseconds) > + * for which information useful for > + * conflict detection can be retained */ > + > > > --- > > Even if an apply worker disables retaining dead tuples due to > > max_conflict_retention_duration, it enables again after the server > > restarts. > > > > I also find this behaviour questionable because this also means that > it is possible that before restart one would deduce that the > update_deleted conflict won't be reliably detected for a particular > subscription but after restart it could lead to the opposite > conclusion. But note that to make it behave similarly we need to store > this value persistently in pg_subscription unless you have better > ideas for this. Theoretically, there are two places where we can > persist this information, one is with pg_subscription, and other in > origin. I find it is closer to pg_subscription. I think it makes sense to store this in pg_subscription to preserve the decision across restart. -- Regards, Dilip Kumar Google
pgsql-hackers by date: