Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date
Msg-id CAEepm=1KFaVPdOxYkP6bmtevOZHfdHTNf8bjZWSkJxoxy0X+7A@mail.gmail.com
In response to Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Catalin Iacob <iacobcatalin@gmail.com>)
List pgsql-hackers
On Fri, Mar 30, 2018 at 5:20 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> Jeff's comments in the pull request that merged errseq_t are worth
> reading as well:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750

Wow.  It looks like there may be a separate question of when each
filesystem adopted this new infrastructure?

>> Yeah, I see why you want to PANIC.
>
> Indeed. Even doing that leaves question marks about all the kernel
> versions before v4.13, which at this point is pretty much everything
> out there, not even detecting this reliably. This is messy.

The pre-errseq_t problems are beyond our control.  There's nothing we
can do about that in userspace (except perhaps abandon OS-buffered IO,
a big project).  We just need to be aware that this problem exists in
certain kernel versions and be grateful to Jeff Layton for fixing it.

The dropped dirty flag problem is something we can and in my view
should do something about, whatever we might think about that design
choice.  As Andrew Gierth pointed out to me in an off-list chat about
this, by the time you've reached this state, both PostgreSQL's buffer
and the kernel's buffer are clean and might be reused for another
block at any time, so your data might be gone from the known universe
-- we don't even have the option to rewrite our buffers in general.
Recovery is the only option.
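
To make the failure mode concrete, here is a toy model in Python.  It is
only a sketch under stated assumptions: ToyPageCache, Panic,
checkpoint_with_retry and checkpoint_with_panic are hypothetical names
invented for illustration, not PostgreSQL or kernel code, and the model
assumes the behaviour discussed in this thread (a failed writeback still
clears the dirty flag, so a second fsync() has nothing left to flush and
reports success).

```python
import errno


class Panic(RuntimeError):
    """Hypothetical stand-in for a PANIC that forces crash recovery."""


class ToyPageCache:
    """Toy model of one cached page; an assumption, not real kernel code."""

    def __init__(self):
        self.dirty = False                 # page holds unwritten changes
        self.durable = False               # data actually reached storage
        self.fail_next_writeback = False   # simulate one I/O error

    def write(self):
        self.dirty = True
        self.durable = False

    def fsync(self):
        if not self.dirty:
            return                         # nothing to flush: reports success
        self.dirty = False                 # flag is dropped even on failure
        if self.fail_next_writeback:
            self.fail_next_writeback = False
            raise OSError(errno.EIO, "writeback failed")
        self.durable = True


def checkpoint_with_retry(cache):
    """Unsafe: keep retrying fsync until it stops reporting an error."""
    while True:
        try:
            cache.fsync()
            return "checkpoint complete"
        except OSError:
            continue                       # the retry sees a clean page


def checkpoint_with_panic(cache):
    """Treat the first fsync failure as fatal so recovery replays the WAL."""
    try:
        cache.fsync()
    except OSError as exc:
        raise Panic(f"could not fsync file: {exc}") from exc
    return "checkpoint complete"
```

In this model the retrying variant returns "checkpoint complete" while
the page was never made durable, which is the silent data loss under
discussion; the panicking variant surfaces the first error instead of
trusting a later success.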

Thank you to Craig for chasing this down and +1 for his proposal, on Linux only.

-- 
Thomas Munro
http://www.enterprisedb.com

