Re: FSM Corruption (was: Could not read block at end of the relation) - Mailing list pgsql-bugs

From Thomas Munro
Subject Re: FSM Corruption (was: Could not read block at end of the relation)
Date
Msg-id CA+hUKGLGtG8VoTDipK_YvRhgP=qGHEFaSOPWctN-rQ5ftWQQMA@mail.gmail.com
In response to Re: FSM Corruption (was: Could not read block at end of the relation)  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs
On Fri, Apr 12, 2024 at 4:01 AM Peter Geoghegan <pg@bowt.ie> wrote:
> Although it's not related to the problem you're working on, it seems
> like a good opportunity to bring up a concern about the FSM that I
> don't believe was discussed at any point in the past few years: I
> wonder if the way that fsm_search_avail() sometimes updates
> fsmpage->fp_next_slot with only a shared lock on the page could cause
> problems. At the very least, it's weird that we allow it.

Aha.  Good to know.  So that is another place where direct I/O on a
file system with checksums might get very upset, if it takes no
measures of its own to prevent the data from changing underneath it
during a pwrite() call.  The only known system like that so far is
btrfs (phenomenon #1 in [1], see reproducer).  The symptom is that the
next read fails with EIO.

[1] https://www.postgresql.org/message-id/CA%2BhUKGKSBaz78Fw3WTF3Q8ArqKCz1GgsTfRFiDPbu-j9OFz-jw%40mail.gmail.com
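
To illustrate the hazard described above, here is a minimal sketch of one
possible "measure of its own": copy the page into a private buffer and hand
only that copy to pwrite(), so a concurrent update of the shared page (for
example a hint-style field such as fp_next_slot) cannot change the bytes
while the file system is checksumming them.  The helper name
write_page_via_private_copy, the SKETCH_BLCKSZ constant, and the 4096-byte
alignment are assumptions made for this illustration; this is not
PostgreSQL code and not necessarily how any file system or PostgreSQL
itself handles the problem.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 8 kB pages, matching PostgreSQL's default block size. */
#define SKETCH_BLCKSZ 8192

/*
 * Hypothetical helper (not a PostgreSQL function).
 *
 * Copies the shared page into a private, aligned buffer and writes the
 * copy.  The copy may capture the shared page before or after a
 * concurrent hint update, but the bytes passed to pwrite() cannot change
 * while the write is in progress.
 *
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t
write_page_via_private_copy(int fd, const void *shared_page, off_t offset)
{
    void   *copy = NULL;
    ssize_t written;
    int     rc;
    int     saved_errno;

    assert(shared_page != NULL);

    /* Direct I/O usually needs an aligned buffer; 4096 is an assumption. */
    rc = posix_memalign(&copy, 4096, SKETCH_BLCKSZ);
    if (rc != 0)
    {
        errno = rc;
        return -1;
    }

    /* Snapshot the page; later changes to shared_page no longer matter. */
    memcpy(copy, shared_page, SKETCH_BLCKSZ);

    written = pwrite(fd, copy, SKETCH_BLCKSZ, offset);

    /* Preserve pwrite()'s errno across free(). */
    saved_errno = errno;
    free(copy);
    errno = saved_errno;

    return written;
}
```

In a real direct I/O setup the descriptor would be opened with O_DIRECT and
the offset would be block-aligned; the sketch leaves that to the caller.
The cost is one extra page copy per write.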


