Re: [Patch] add new parameter to pg_replication_origin_session_setup - Mailing list pgsql-hackers
| From | shveta malik |
|---|---|
| Subject | Re: [Patch] add new parameter to pg_replication_origin_session_setup |
| Date | |
| Msg-id | CAJpy0uD6D294d=Hq4oROmtKAew5DfKERNxs=DsAwUFBFF2kERg@mail.gmail.com Whole thread Raw |
| In response to | RE: [Patch] add new parameter to pg_replication_origin_session_setup ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
| List | pgsql-hackers |
On Tue, Jan 13, 2026 at 9:39 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Amit, Shveta,
>
> Thanks for suggestions. PSA new patches.
>
> > > Since user is not aware of internal acquired_by logic, the error might
> > > not make much sense to him as to why one session is able to reset
> > > while another is not. Shall we make it:
> > >
> > > ERROR: cannot reset replication origin "origin_name" while it is
> > > still shared by other processes
> > > DETAIL: the current session is the first process for this replication
> > > origin, and other processes are sharing it.
> > > HINT: ensure this replication origin is reset in all other processes first.
> > >
> >
> > How about a slightly tweaked version of these messages:
> > ERROR: cannot reset replication origin "origin_name" because it is
> > still in use by other processes
> > DETAIL: This session is the first process for this replication origin,
> > and other processes are currently sharing it.
> > HINT: Reset the replication origin in all other processes before retrying.
>
> I followed the Amit's idea, but the origin ID is used instead of origin name.
> I read other functions, and the name is directly output when 1) the specified
> origin does not exist or 2) the name is reserved. We seem to use ID as much as
> possible.
>
> >
> > > 2)
> > > When the first session leaves, while the second session is still using
> > > origin, the third session is able to drop the origin which is not
> > > right.
> > > I think replorigin_state_clear() needs a change.
> > > 'if (state->acquired_by != 0)' check should be replaced by 'if
> > > (state->refcount > 0)'
> > >
> > > Patch 001 had correct changes in replorigin_state_clear(), IMO we
> > > still need those
>
> Good finding. I put it in 0002 because it handles some cases related with
> acquired_by = 0.
>
> > >
> > > 3)
> > > When first session leaves, while second session is still using origin,
> > > now correctly third session is not able to join it. It gives error:
> > > postgres=# SELECT pg_replication_origin_session_setup('origin');
> > > ERROR: replication origin with ID 1 is already active for another process
> > >
> > > Error is not very informative provided the fact that now sharing is
> > > allowed. Shall it be:
> > >
> >
> > Yeah, sharing is allowed but only when used in parallel context by
> > passing PID. I think a slightly modified version of the above message
> > such as: "replication origin with ID 1 is already active in another
> > process" should be sufficient.
>
> Fixed but ereport() was used because I thought this is usar-facing. Feel free to
> change to elog() again based on your matter.
>
Thanks for the patch. All scenarios (known to me) seem to work well.
Few trivial comments:
1)
+step s1_reset: SELECT pg_replication_origin_session_reset();
After the above step, please add a step to attempt dropping the
replication origin. The original issue was that once s1 releases the
origin, it becomes eligible for dropping, so the test should
explicitly verify this behavior.
2)
Also before the above step, please add a step where s0 tries to reset
the origin while s1 is still acquiring it. It is needed to cover the
path where s0 should fail to release origin.
3)
+ /*
+ * Reset the PID only if the current backend is the first to set up this
+ * origin. This prevents resetting the PID when other backends are still
+ * using this origin.
+ */
The second sentence seems contradictory to the logic, as we are
resetting the PID here regardless of whether other backends are using
it.
Suggestion:
/*
* Reset the PID only if the current backend is the first to set up this
* origin. This avoids clearing the first process's PID when any other
* session releases the origin.
*/
4)
+ * PID was not noted in the origin. This can happen the process
+ * originally acquired the origin exits without releasing. To make the
+ * staus clean again, other processes cannot acquire the origin till
+ * all using ones release.
+ */
Slight tweak:
/*
* The origin is in use, but PID is not recorded. This can happen if
* the process that originally acquired the origin exited without releasing it.
* To ensure correctness, other processes cannot acquire the origin
* until all processes currently using it have released it.
thanks
Shveta
pgsql-hackers by date: