Re: Postgres, fsync, and OSs (specifically linux) - Mailing list pgsql-hackers
From | Michael Banck |
---|---|
Subject | Re: Postgres, fsync, and OSs (specifically linux) |
Date | |
Msg-id | 20180428153548.GA24854@nighthawk.caipicrew.dd-dns.de |
In response to | Re: Postgres, fsync, and OSs (specifically linux) (Stephen Frost <sfrost@snowman.net>) |
Responses | Re: Postgres, fsync, and OSs (specifically linux) |
List | pgsql-hackers |
Hi,

On Sat, Apr 28, 2018 at 11:21:20AM -0400, Stephen Frost wrote:
> * Craig Ringer (craig@2ndquadrant.com) wrote:
> > On 28 April 2018 at 06:28, Andres Freund <andres@anarazel.de> wrote:
> > > - Add a pre-checkpoint hook that checks for filesystem errors *after*
> > >   fsyncing all the files, but *before* logging the checkpoint completion
> > >   record. Operating systems, filesystems, etc. all log the error format
> > >   differently, but for larger installations it'd not be too hard to
> > >   write code that checks their specific configuration.
> >
> > I looked into using trace event file descriptors for this, btw, but
> > we'd need CAP_SYS_ADMIN to create one that captured events for other
> > processes. Plus filtering the events to find only events for the files
> > / file systems of interest would be far from trivial. And I don't know
> > what guarantees we have about when events are delivered.
> >
> > I'd love to be able to use inotify for this, but again, that'd only be
> > a new-kernels thing since it'd need an inotify extension to report I/O
> > errors.
> >
> > Presumably mostly this check would land up looking at dmesg.
> >
> > I'm not convinced it'd get widely deployed and widely used, or that
> > it'd be used correctly when people tried to use it. Look at the
> > hideous mess that most backup/standby creation scripts,
> > archive_command scripts, etc are.
>
> Agree with more-or-less everything you've said here, but a big +1 on
> this. If we do end up going down this route we have *got* to provide
> scripts which we know work and have been tested and are well maintained
> on the popular OS's for the popular filesystems and make it clear that
> we've tested those and not others. We definitely shouldn't put
> something in our docs that is effectively an example of the interface
> but not an actual command that anyone should be using.

This dmesg-checking has been mentioned several times now, but IME
enterprise distributions (or server ops teams?) seem to restrict access
to dmesg and /var/log for non-root users, including postgres.

Well, or just vanilla Debian stable apparently:

    postgres@fock:~$ dmesg
    dmesg: read kernel buffer failed: Operation not permitted

Is it really a useful expectation that the postgres user will be able to
trawl system logs for I/O errors? Or are we expecting the sysadmins (in
case they are distinct from the DBAs) to set up sudo and/or relax
permissions for this everywhere? We should document this requirement
properly at least then.

The netlink thing from Google that Ted Ts'o mentioned would probably
work around that, but even if that is opened up, it would not be
deployed anytime soon either.

Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
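
As a rough illustration of the permission problem described above, a
script could check up front whether the current (e.g. postgres) user is
able to read the kernel log at all, before relying on dmesg for I/O
error detection. The following is only a sketch, assuming Linux: it
reads the kernel.dmesg_restrict sysctl from /proc and also simply tries
to run dmesg. The function names are made up for this example and are
not part of PostgreSQL or any proposed hook interface.

```python
import subprocess
from pathlib import Path
from typing import Optional

# Linux exposes the kernel.dmesg_restrict sysctl at this path.
DMESG_RESTRICT = Path("/proc/sys/kernel/dmesg_restrict")


def dmesg_restricted(path: Path = DMESG_RESTRICT) -> Optional[bool]:
    """Return True if the sysctl restricts dmesg, False if it does not,
    and None if the setting cannot be read (e.g. not on Linux)."""
    try:
        return path.read_text().strip() != "0"
    except OSError:
        return None


def can_read_kernel_log() -> bool:
    """Try to run dmesg as the current user and report whether it worked."""
    try:
        result = subprocess.run(
            ["dmesg"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # dmesg is missing, not executable, or hung.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("kernel.dmesg_restrict set:", dmesg_restricted())
    print("dmesg readable by this user:", can_read_kernel_log())
```

Running this as the postgres user would show whether a dmesg-based
check is even possible on a given box without sudo or relaxed
permissions. It says nothing about how I/O errors are formatted in the
log, which still differs per OS and filesystem.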