Re: ReplicationSlotRelease() crashes when the instance is in the single user mode - Mailing list pgsql-hackers

From Robert Haas
Subject Re: ReplicationSlotRelease() crashes when the instance is in the single user mode
Date
Msg-id CA+TgmobspWhoMysNHL9b3C9xi=OpHdBYhtAgDH4N_A2foyjN-w@mail.gmail.com
Whole thread Raw
In response to Re: ReplicationSlotRelease() crashes when the instance is in the single user mode  (Paul A Jungwirth <pj@illuminatedcomputing.com>)
Responses RE: ReplicationSlotRelease() crashes when the instance is in the single user mode
List pgsql-hackers
On Tue, Aug 5, 2025 at 12:51 PM Paul A Jungwirth
<pj@illuminatedcomputing.com> wrote:
> No one has replied yet, but I vote for forbidding these functions. I
> can't articulate a full theory for which functions we restrict in
> single-user mode, and I think we should permit as much as possible.
> But any theory would weigh usefulness against risk. Here no one has
> found a use-case for several years, and executing user-supplied code
> in an unusual, higher-risk scenario seems like a good reason to be
> cautious. Also without tests for single-user mode, I'm extra wary. If
> single-user support is desired by someone, they could submit a patch.

I don't feel good about the direction from which this patch is
attacking the problem. The original stack trace looks like this:

postgres(ExceptionalCondition+0xab)[0xb86a2a]
postgres(ReplicationSlotRelease+0x5a)[0x8df10b]
postgres(pg_replication_slot_advance+0x330)[0x8e46ed]

ReplicationSlotRelease() is called near the end of
pg_replication_slot_advance. IIUC, this means that we successfully
acquired the slot and did all the work, and then failed a sanity check
afterward. So, I see two possibilities: either it's not OK to acquire
a replication slot in single-user mode, in which case this should have
failed earlier, or it's also OK to release it, in which case this
should not have failed at all. Interestingly, I see that
ReplicationSlotAcquire() features a special case whose specific
purpose is to allow it to work in single-user mode, which makes me
rather suspect that the latter is correct. It is possible that it was
intentional that acquiring works in single-user mode and dropping
(other than via ReplicationSlotDropAcquired) does not, but so far I
see nothing in the comments suggesting that.

In any event, that inconsistency between how ReplicationSlotAcquire()
works and how ReplicationSlotRelease() works seems to me to be the
core thing that should be fixed here. If the selected fix also
requires error checks in higher-level functions, as added by this
patch, then well and good. But right now, the proposed patch isn't
trying to fix the definitional inconsistency in the underlying layer
at all, and is instead just installing guard rails to try to keep
anyone from bumping into it. That seems like a bad direction to go --
it feels like we're blocking access to potentially-useful
functionality because of an assertion failure that might just be
slightly sloppy coding.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Re: Correction of RowMark Removal During Sel-Join Elimination
Next
From: Greg Sabino Mullane
Date:
Subject: Re: psql: Count all table footer lines in pager setup