Re: Postgres, fsync, and OSs (specifically linux) - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Postgres, fsync, and OSs (specifically linux) |
Date | |
Msg-id | CAEepm=16aauN3LMHrVZ-uoqU8-k7aoSdGC3t7PghewVVsjUwtQ@mail.gmail.com Whole thread Raw |
In response to | Re: Postgres, fsync, and OSs (specifically linux) (Thomas Munro <thomas.munro@enterprisedb.com>) |
List | pgsql-hackers |
On Fri, Nov 9, 2018 at 9:03 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Fri, Nov 9, 2018 at 7:07 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Nov 7, 2018 at 9:41 PM Thomas Munro > > <thomas.munro@enterprisedb.com> wrote: > > > My plan is do a round of testing and review of this stuff next week > > > once the dust is settled on the current minor releases (including > > > fixing a few typos I just spotted and some word-smithing). All going > > > well, I will then push the resulting patches to master and all > > > supported stable branches, unless other reviews or objections appear. ... On Sun, Nov 18, 2018 at 3:20 PM Thomas Munro <thomas.munro@enterprisedb.com> wrote: > I think these patches are looking good now. If I don't spot any other > problems or hear any objections, I will commit them tomorrow-ish. Hearing no objections, pushed to all supported branches. Thank you to Craig for all his work getting to the bottom of this, to Andres for his open source diplomacy, and the Linux guys for their change "errseq: Always report a writeback error once" which came out of that. Some more comments: * The promotion of errors from close() to PANIC may or may not be effective considering that it doesn't have interlocking with concurrent checkpoints. I'm not sure if it can really happen on local file systems anyway... this may fall under the category of "making PostgreSQL work reliably on NFS", a configuration that is not recommended currently, and a separate project IMV. * In 9.4 and 9.5 there is no checking of errors from sync_file_range(), and I didn't add any for now. It was claimed that sync_file_range() without BEFORE/AFTER can't consume errors[1]. Errors are promoted in 9.6+ for consistency because we already looked at the return code, so we won't long rely on that knowledge in the long term. * I personally believe it is safe to run with data_sync_retry = on on any file system on FreeBSD, and ZFS on any operating system... but I see no need to make recommendations about that in the documentation, other than that you should investigate the behaviour of your operating system if you really want to turn it on. * A PANIC (and possibly ensuing crash restart loop if the I/O error is not transient) is of course a very unpleasant failure mode, but it is one that we already had for the WAL and control file. So I'm not sure I'd personally bother to run with the non-default setting even on a system where I believe it to be safe (considering the low likelihood that I/O failure is isolated to certain files); at best it probably gives you a better experience if the fs underneath a non-default tablespace dies. * The GUC is provided primarily because this patch is so drastic in its effect that it seems like we owe our users a way to disable it on principle, and that seems to outweigh a desire not to add GUCs in back-branches. * If I/O errors happen, your system is probably toast and you need to fail over or restore from backups, but at least we won't tell you any lies about checkpoints succeeding. In rare scenarios, perhaps involving a transient failure of virtualised storage with thin provisioning as originally described by Craig, the system may actually be able to continue running, and with this change we should now be able to avoid data loss by recovering from the WAL. * As noted the commit message, this isn't quite the end of the story. See the fsync queue redesign thread[2], WIP for master only. [1] https://www.postgresql.org/message-id/20180430160945.5s5qfoqryhtmugxl%40alap3.anarazel.de [2] https://www.postgresql.org/message-id/flat/CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com -- Thomas Munro http://www.enterprisedb.com
pgsql-hackers by date: