On Friday, September 12, 2025 4:48 PM shveta malik <shveta.malik@gmail.com> wrote:
> On Fri, Sep 12, 2025 at 8:55 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> wrote:
> >
> >
> > I agree. Here is a V73 patch that will restart the worker if the
> > retention resumes. I also addressed other comments posted by Amit[1].
> >
>
> Thanks for the patch. Few comments:
Thanks for the comments!
>
> 1)
> There is a small window where worker can exit while resuming retention and
> launcher can end up accessing stale worker info.
>
> Let's say the launcher is at a stage where it has fetched the worker:
> w = logicalrep_worker_find(sub->oid, InvalidOid, false);
>
> And after this point, before the launcher could do
> compute_min_nonremovable_xid(), the worker has stopped retention and
> resumed as well. Now the worker has exited but in
> compute_min_nonremovable_xid(), launcher will still use the worker-info
> fetched previously.
Thanks for catching this. I have fixed it by computing the xid while holding
LogicalRepWorkerLock, so the worker info cannot go stale between the lookup
and the computation.
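
To illustrate the idea outside the patch, here is a minimal standalone sketch.
It is not the patch code: DemoWorker, demo_worker_lock, demo_worker_find() and
the other demo_* names are hypothetical stand-ins, and a pthread mutex stands
in for the LWLock. The point is only that the lookup and the read of the xid
happen inside one critical section, so the worker cannot exit in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;
#define INVALID_XID ((xid_t) 0)
#define MAX_WORKERS 4

/* Hypothetical, simplified stand-in for a shared-memory worker slot. */
typedef struct DemoWorker
{
    bool     in_use;
    uint32_t subid;
    xid_t    oldest_nonremovable_xid;
} DemoWorker;

static DemoWorker demo_workers[MAX_WORKERS];
static pthread_mutex_t demo_worker_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find the live worker slot for a subscription. Caller must hold the lock. */
static DemoWorker *
demo_worker_find(uint32_t subid)
{
    for (size_t i = 0; i < MAX_WORKERS; i++)
    {
        if (demo_workers[i].in_use && demo_workers[i].subid == subid)
            return &demo_workers[i];
    }
    return NULL;
}

/*
 * Look up the worker and read its xid in a single critical section.
 * Returns false if no live worker exists for the subscription.
 */
static bool
demo_get_worker_xid(uint32_t subid, xid_t *xid)
{
    bool found = false;

    assert(xid != NULL);

    pthread_mutex_lock(&demo_worker_lock);
    DemoWorker *w = demo_worker_find(subid);
    if (w != NULL)
    {
        *xid = w->oldest_nonremovable_xid;
        found = true;
    }
    pthread_mutex_unlock(&demo_worker_lock);

    return found;
}

/* Worker exit: release the slot under the same lock. */
static void
demo_worker_exit(uint32_t subid)
{
    pthread_mutex_lock(&demo_worker_lock);
    DemoWorker *w = demo_worker_find(subid);
    if (w != NULL)
    {
        w->in_use = false;
        w->oldest_nonremovable_xid = INVALID_XID;
    }
    pthread_mutex_unlock(&demo_worker_lock);
}
```

Because the caller never keeps the worker pointer after releasing the lock,
an exit that happens later can only be observed as "no worker found", never
as a read from a slot that has already been released.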
>
> 2)
>
> if (should_stop_conflict_info_retention(rdt_data))
> + {
> + /*
> + * Stop retention if not yet. Otherwise, reset to the initial phase to
> + * retry all phases. This is required to recalculate the current wait
> + * time and resume retention if the time falls within
> + * max_retention_duration.
> + */
> + if (MySubscription->retentionactive)
> + rdt_data->phase = RDT_STOP_CONFLICT_INFO_RETENTION;
> + else
> + reset_retention_data_fields(rdt_data);
> +
> return;
> + }
>
>
>
> Shall we have an Assert( !MyLogicalRepWorker->oldest_nonremovable_xid)
> in 'else' part above?
Added.
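
For readers following along, the branch with the new assertion can be pictured
roughly as below. This is a simplified standalone sketch, not the patch:
DemoState, DemoPhase and demo_handle_timeout() are hypothetical names, and the
plain C assert() stands in for the Assert() macro. The assumption it encodes is
the one discussed above: once retention has been stopped, the worker should no
longer hold a valid oldest_nonremovable_xid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t xid_t;
#define INVALID_XID ((xid_t) 0)

/* Hypothetical phases; only the two relevant to this branch are modelled. */
typedef enum DemoPhase
{
    DEMO_PHASE_INITIAL,
    DEMO_PHASE_STOP_RETENTION
} DemoPhase;

/* Hypothetical, flattened view of the state the branch looks at. */
typedef struct DemoState
{
    bool      retention_active;
    xid_t     oldest_nonremovable_xid;
    DemoPhase phase;
} DemoState;

/* Called when the wait time has exceeded the configured maximum. */
static void
demo_handle_timeout(DemoState *s)
{
    if (s->retention_active)
    {
        /* Retention is still active, so move to the phase that stops it. */
        s->phase = DEMO_PHASE_STOP_RETENTION;
    }
    else
    {
        /* Retention was already stopped, so no xid should be held. */
        assert(s->oldest_nonremovable_xid == INVALID_XID);

        /* Restart from the initial phase to recalculate the wait time. */
        s->phase = DEMO_PHASE_INITIAL;
    }
}
```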
Here is the V74 patch, which addresses all of the comments above.
Best Regards,
Hou zj