Thread: Check for existing replication slot in pg_createsubscriber

Check for existing replication slot in pg_createsubscriber

From
Zane Duffield
Date:
Hackers,

I noticed in testing and usage that pg_createsubscriber doesn't check for an existing replication slot before attempting to create one, whereas it *does* check for existing publications.
If it were to check for an existing replication slot, then the --dry-run mode would be able to detect the issue.

If this seems like a good feature, I'm happy to try and put together a patch for it.

Thanks,
Zane

Re: Check for existing replication slot in pg_createsubscriber

From
Amit Kapila
Date:
On Fri, Jun 27, 2025 at 1:13 PM Zane Duffield <duffieldzane@gmail.com> wrote:
>
> Hackers,
>
> I noticed in testing and usage that pg_createsubscriber doesn't check for an existing replication slot before attempting to create one, whereas it *does* check for existing publications.
>

I see the difference you are pointing to. Ideally, the checks should
be the same unless there is a specific reason for them to be
different, which should be mentioned in the comments. BTW, do you see
any problems due to name conflicts while using this tool, or is it a
code-level observation? AFAICS, the names for the objects created by
pg_createsubscriber are either generated names (with an intention that it
doesn't conflict) or user-provided. In both cases, chances should be
less that they conflict with existing objects.

--
With Regards,
Amit Kapila.



Re: Check for existing replication slot in pg_createsubscriber

From
Zane Duffield
Date:
On Mon, Jun 30, 2025 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

> I see the difference you are pointing to. Ideally, the checks should
> be the same unless there is a specific reason for them to be
> different, which should be mentioned in the comments. BTW, do you see
> any problems due to name conflicts while using this tool, or is it a
> code-level observation?

In my case the --subscription and --replication-slot options are used to control the identifiers; the conflict was the user's fault, not the program's.

Re: Check for existing replication slot in pg_createsubscriber

From
Amit Kapila
Date:
On Mon, Jun 30, 2025 at 8:37 AM Zane Duffield <duffieldzane@gmail.com> wrote:
>
> On Mon, Jun 30, 2025 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> I see the difference you are pointing to. Ideally, the checks should
>> be the same unless there is a specific reason for them to be
>> different, which should be mentioned in the comments. BTW, do you see
>> any problems due to name conflicts while using this tool, or is it a
>> code-level observation?
>
>
> In my case the --subscription and --replication-slot options are used to control the identifiers; the conflict was the user's fault, not the program's.
>

Okay, I find your case a good reason to add such a check, apart from
making the code consistent in terms of these checks. One thing I was
thinking is whether it makes sense to add these checks only in
--dry-run mode because we normally don't expect such conflicts.
Otherwise, each such check adds an additional network round-trip.

--
With Regards,
Amit Kapila.



Re: Check for existing replication slot in pg_createsubscriber

From
Zane Duffield
Date:
On Mon, Jun 30, 2025 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> One thing I was
> thinking is whether it makes sense to add these checks only in
> --dry-run mode because we normally don't expect such conflicts.
> Otherwise, each such check adds an additional network round-trip.

I did wonder why it bothered checking for conflicts before running the command that would fail in case of a conflict.
It makes sense to me to only check for conflicts in --dry-run mode.

Thanks,
Zane

RE: Check for existing replication slot in pg_createsubscriber

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit, Zane,

> Okay, I find your case a good reason to add such a check, apart from
> making the code consistent in terms of these checks. One thing I was
> thinking is whether it makes sense to add these checks only in
> --dry-run mode because we normally don't expect such conflicts.
> Otherwise, each such check adds an additional network round-trip.

I think there are two things that can be checked on the command side. The first
is duplicate names. This can be done by connecting to each node once and running
SQL queries. To avoid the extra round-trip, this check could be added for dry-run
mode only.
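
As a rough illustration of the duplicate-name check, the sketch below looks up
all requested slot names in one query against the pg_replication_slots view and
skips the query outside dry-run mode. It is written in Python only for brevity;
pg_createsubscriber itself is C. The names find_existing_slots and check_slots
are hypothetical, and the cursor is assumed to be a DB-API 2.0 cursor whose
driver adapts a Python list to a PostgreSQL array.

```python
from typing import List, Sequence

# Hypothetical helper code, not taken from pg_createsubscriber.
# One query covers every requested name, so the check costs one round-trip.
SLOT_CONFLICT_QUERY = (
    "SELECT slot_name FROM pg_catalog.pg_replication_slots "
    "WHERE slot_name = ANY(%s)"
)


def find_existing_slots(cursor, requested: Sequence[str]) -> List[str]:
    """Return the requested slot names that already exist on the publisher.

    `cursor` is assumed to be a DB-API 2.0 cursor connected to the publisher.
    """
    if not requested:
        return []
    cursor.execute(SLOT_CONFLICT_QUERY, (list(requested),))
    existing = {row[0] for row in cursor.fetchall()}
    # Keep the caller's order so that error messages are predictable.
    return [name for name in requested if name in existing]


def check_slots(cursor, requested: Sequence[str], dry_run: bool) -> List[str]:
    """Run the extra check only in dry-run mode, as suggested in this thread."""
    if not dry_run:
        # Normal mode: slot creation itself will report any conflict.
        return []
    return find_existing_slots(cursor, requested)
```

In normal mode the sketch sends nothing, which matches the idea of leaving the
conflict to be reported by the slot creation command itself.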

The other is slot-name validation. For now, it relies completely on the publisher
side, but it would be better to detect invalid names earlier. Currently,
ReplicationSlotValidateName() does the validation; we could move it under the
common/ directory so that both server and client code can use it. It performs
only very basic validation of the string, so it could probably be done in both
dry-run and normal mode.
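
To make the client-side validation idea concrete, here is a minimal sketch of
the basic rules for slot names: non-empty, shorter than NAMEDATALEN, and made up
only of lower-case letters, digits, and underscores. It is Python for brevity
and is not the server function; validate_slot_name is a hypothetical name, and
NAMEDATALEN = 64 assumes a default build.

```python
from typing import Optional

# Assumption: default build, where NAMEDATALEN is 64 (names up to 63 bytes).
NAMEDATALEN = 64


def validate_slot_name(name: str) -> Optional[str]:
    """Return None if the name passes the basic checks, else an error message.

    Hypothetical client-side helper; it needs no connection to the publisher.
    """
    if len(name) == 0:
        return 'replication slot name "%s" is too short' % name
    # The limit applies to the byte length of the name.
    if len(name.encode("utf-8")) >= NAMEDATALEN:
        return 'replication slot name "%s" is too long' % name
    for ch in name:
        if not ("a" <= ch <= "z" or "0" <= ch <= "9" or ch == "_"):
            return (
                'replication slot name "%s" contains invalid character; '
                "only lower case letters, numbers, and underscore are allowed"
                % name
            )
    return None
```

Because the check is purely local, running it in both dry-run and normal mode
would add no network round-trip.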

Best regards,
Hayato Kuroda
FUJITSU LIMITED