Re: VM corruption on standby - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: VM corruption on standby
Date
Msg-id CAPpHfduZCEBdL-VSN5p4udpn=iws1xhfje0tK-S21q2eR-9VnA@mail.gmail.com
Whole thread Raw
In response to Re: VM corruption on standby  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: VM corruption on standby
List pgsql-hackers
On Wed, Sep 3, 2025 at 11:28 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> On Wed, Sep 3, 2025 at 9:47 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > > On 3 Sep 2025, at 11:37, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > >
> > > Could you, please, recheck?
> >
> > That patch also adds CondVar sleep in critical section. That patch is how we understood that such sleep is
dangerous.
> >
> > Actual patch to deteact a problem is much simpler:
> > ```
> > diff --git a/src/backend/storage/ipc/waiteventset.c
> > b/src/backend/storage/ipc/waiteventset.c
> > index 7c0e66900f9..e89e1d115cb 100644
> > --- a/src/backend/storage/ipc/waiteventset.c
> > +++ b/src/backend/storage/ipc/waiteventset.c
> > @@ -1044,6 +1044,7 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
> >        long            cur_timeout = -1;
> >
> >        Assert(nevents > 0);
> > +       Assert(CritSectionCount == 0);
> >
> >        /*
> >         * Initialize timeout if requested.  We must record the current time so
> > ```
> >
> > Though it will fail in multixact test.
>
> I thought that patch allows to reproduce the problem of bc22dc0e0d.
> Sorry for the noise.

The attached is my attempt to rework the bc22dc0e0d.  The problem
discussed in this thread occurs because condition variable ends the
process with proc_exit(1) in the case of postmaster death even in the
critical section.  However, the shared memory cleanup happening after
proc_exit(1) being called in the critical section is leaving the
shared memory in the inconsistent state.  Generally I see two possible
behaviors in this situation.

1) Exit without the shared memory cleanup (e.g. proc_exit(2) as
proposed upthread).  That would save us from data corruption caused by
the inconsistent shared memory state.  However, if our process has
occupied shared resources (e.g. locks), they wouldn't be released.
That might cause other processes hand waiting for those resources.
2) Continue to wait after postmaster death.  While bc22dc0e0d was the
first case when we waited on conditional variable in the critical
section, there are a lot of cases where we're taking LWLock's in the
critical section.  Unlike condition variable, lwlock will continue to
wait in the case of postmaster death.  The critical section will
continue its operation until completion, postmaster death will be
handled afterwards.  If we would apply the same logic to the arbitrary
condition variable, it might however cause our process to stuck if
we're waiting for something from another process, which would exit
seeing postmaster death.

I think the approach #2 is more appropriate for bc22dc0e0d, because in
the critical section we only wait for other processes also in the
critical section (so, there is no risk they will exit immediately
after postmaster death making us stuck).  I've implemented a patch,
where waiting on conditional variable is replaced with LWLock-style
waiting on semaphore.  However, custom waiting code just for
AdvanceXLInsertBuffer() doesn't look good.  I believe we need some
general solution.  We might have a special kind of condition variable,
a critical section condition variable, where both waiting and
signaling must be invoked only in a critical section.  However, I dig
into our Latch and WaitEventSet, it seems there are too many
assumptions about postmaster death.  So, a critical section condition
variable probably should be implemented on top of semaphore.  Any
thoughts?

------
Regards,
Alexander Korotkov
Supabase

Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Proposal: Conflict log history table for Logical Replication
Next
From: Nazir Bilal Yavuz
Date:
Subject: Re: Mark ItemPointer arguments as const thoughoutly