Re: Postgres, fsync, and OSs (specifically linux) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Postgres, fsync, and OSs (specifically linux) |
Date | |
Msg-id | CA+TgmoYoVB8LSJ0mufU_N0G3YHw1ZSJVCMXk06GKZaDuboGHjw@mail.gmail.com Whole thread Raw |
In response to | Re: Postgres, fsync, and OSs (specifically linux) (Craig Ringer <craig@2ndquadrant.com>) |
Responses |
Re: Postgres, fsync, and OSs (specifically linux)
|
List | pgsql-hackers |
On Tue, May 29, 2018 at 4:53 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > I've revised the fsync patch with the cleanups discussed and gone through > the close() calls. > > AFAICS either socket closes, temp file closes, or (for WAL) already PANIC on > close. It's mainly fd.c that needs amendment. Which I've done per the > attached revised patch. I think we should have a separate thread for this patch vs. Andres's patch to do magic things with the checkpointer and file-descriptor forwarding. Meanwhile, here's some review. 1. No longer applies cleanly. 2. I don't like promote_ioerr_to_panic() very much, partly because the same pattern gets repeated over and over, and partly because it would be awkwardly-named if we discovered that another 2 or 3 errors needed similar handling (or some other variant handling). I suggest instead having a function like report_critical_fsync_failure(char *path) that does something like this: int elevel = ERROR; if (errno == EIO) elevel = PANIC; ereport(elevel, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", path); And similarly I'd add report_critical_close_failure. In some cases, this would remove wording variations (e.g. in twophase.c) but I think that's fine, and maybe an improvement, as discussed on another recent thread. 3. slru.c calls pg_fsync() but isn't changed by the patch. That looks wrong. 4. The comment changes in snapbuild.c interact with the TODO that immediately follows. I think more adjustment is needed here. 5. It seems odd that you adjusted the comment for pg_fsync_no_writethrough() but not pg_fsync_writethrough() or pg_fsync(). Either pg_fsync_writethrough() doesn't have the same problem, in which case, awesome, but let's add a comment, or it does, in which case it should refer to the other one. And I think pg_fsync() itself needs a comment saying that every caller must be careful to use promote_ioerr_to_panic() or report_critical_fsync_failure() or whatever we end up calling it unless the fsync is not critical for data integrity. 6. In md.c, there's a stray blank line added. But more importantly, the code just above that looks like this: if (!FILE_POSSIBLY_DELETED(errno) || failures > 0) - ereport(ERROR, + ereport(promote_ioerr_to_panic(ERROR), (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", path))); else ereport(DEBUG1, (errcode_for_file_access(), errmsg("could not fsync file \"%s\" but retrying: %m", path))); I might be all wet here, but it seems like if we enter the bottom branch, we still need the promote-to-panic behavior. 7. The comment adjustment for SyncDataDirectory mentions an "important" fact about fsync behavior, but then doesn't seem to change any logic on that basis. I think in general a number of these comments need a little more thought, but in this particular case, I think we also need to consider what the behavior should be (and the comment should reflect our considered judgement on that point, and the implementation should match). 8. Andres suggested to me off-list that we should have a GUC to disable the promote-to-panic behavior in case it turns out to be a show-stopper for some user. I think that's probably a good idea. Adding many new ways to PANIC in a minor release without providing any way to go back to the old behavior sounds unfriendly. Surely, anyone who suffers much from this has really serious other problems anyway, but all the same I think we should provide an escape hatch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: