Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly
Date
Msg-id CAPpHfdsk9_mrZaR_twwe6ntFUYCwu9SM1t4n6SmWNTZDZGefoQ@mail.gmail.com
Whole thread Raw
In response to Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly  ("Vitaly Davydov" <v.davydov@postgrespro.ru>)
List pgsql-hackers
Hi, Amit!

Thank you for your attention to this patchset!

On Sat, May 24, 2025 at 2:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, May 24, 2025 at 4:08 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > I spend more time on this.  The next revision is attached.  It
> > contains revised comments and other cosmetic changes.  I'm going to
> > backpatch 0001 to all supported branches,
> >
>
> Is my understanding correct that we need 0001 because
> PhysicalConfirmReceivedLocation() doesn't save the slot to disk after
> changing the slot's restart_lsn?

Yes.  Also, even if it would save slot to the disk, there is still
race condition that concurrent checkpoint could use updated value from
the shared memory to clean old WAL segments, and then crash happens
before we managed to write the slot to the disk.

> If so, shouldn't the comments (One
> could argue that the slot should be saved to disk now, but that'd be
> energy wasted - the worst thing lost information could cause here is
> to give wrong information in a statistics view) in
> PhysicalConfirmReceivedLocation() be changed to mention the risk of
> not saving the slot?
>
> Also, after 0001, even the same solution will be true for logical
> slots as well, where we are already careful to save the slot to disk
> if its restart_lsn is changed, see LogicalConfirmReceivedLocation().
> So, won't that effort be wasted? Even if we don't want to do anything
> about it (which doesn't sound like a good idea), we should note that
> in comments somewhere.

I have added the comments about both points in the attached revision
of the patchset.

------
Regards,
Alexander Korotkov
Supabase

Attachment

pgsql-hackers by date:

Previous
From: Álvaro Herrera
Date:
Subject: Re: mention unused_oids script in pg_proc.dat
Next
From: Hannu Krosing
Date:
Subject: [PATCH] Extending FK check skipping on replicas to ADD FK and TRUNCATE