Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions) - Mailing list pgsql-hackers
From | Alex Ignatov |
---|---|
Subject | Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions) |
Date | |
Msg-id | 5735E974.3070903@postgrespro.ru |
In response to | pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions) (Michael Paquier <michael.paquier@gmail.com>) |
Responses | Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions) |
List | pgsql-hackers |
On 13.05.2016 9:39, Michael Paquier wrote:
> Hi all,
>
> Beginning a new thread because the ext4 issues are closed, and because
> pg_basebackup data durability merits a new thread. And in short
> about the problem: pg_basebackup makes no effort in being sure that
> the data it backs up is on disk, which is bad... One possible
> recommendation is to use initdb -S after running pg_basebackup, but
> making sure that data is on disk should be done before pg_basebackup
> ends.
>
> On Thu, May 12, 2016 at 8:09 PM, I wrote:
>> And actually this won't fly high if there is no equivalent of
>> walkdir() or if the fsync()'s are not applied recursively. On master
>> at least the refactoring had better be done cleanly first... For the
>> back branches, we could just have some recursive call like
>> fsync_recursively and keep that in src/bin/pg_basebackup. Andres, do
>> you think that this should be part of fe_utils or src/common/? I'd
>> tend to think the latter is more adapted as there is an equivalent in
>> the backend. On back-branches, we could just have something like
>> fsync_recursively that walks through the paths. An even simpler
>> approach would be to fsync() individually things that have been
>> written, but that would suck in performance.
>
> So, attached are two patches that apply on HEAD to address the problem
> of pg_basebackup that does not sync the data it writes. As
> pg_basebackup cannot use directly initdb -S because, as a client-side
> utility, it may be installed while initdb is not (see Fedora and
> RHEL), I have refactored the code so that the routines in initdb.c doing
> the fsync of PGDATA and other fsync stuff are in src/fe_utils/, and
> this is 0001.
>
> Patch 0002 is a set of fixes for pg_basebackup:
> - In plain mode, fsync_pgdata is used so that all the tablespaces are
> fsync'd at once. This takes care as well of the case where pg_xlog is
> a symlink.
> - In tar mode (no stdout), each tar file is synced individually, and
> the base directory is synced once at the end.
> In both cases, failures are not considered fatal.
>
> With pg_basebackup -X and pg_receivexlog, the manipulation of WAL
> files is made durable by using fsync and durable_rename where needed
> (credits to Andres mainly for this part).
>
> This set of patches is aimed only at HEAD. Back-patchable versions of
> this patch would need to copy fsync_pgdata and friends into
> streamutil.c for example.
>
> I am adding that to the next CF for review as a bug fix.
> Regards,
>

Hi!
Do we have any confidence that a data file is not corrupted, i.e. that
it does not contain a corrupted page?
Can pg_basebackup check page checksums (database initialized with
initdb -k) while backing up files?

Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
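
For readers following the thread, here is a minimal sketch of the two durability ideas in the quoted message: syncing a directory tree recursively, and renaming a file durably. It is written in Python for illustration only and is not the code from the attached patches, which are in C. The names fsync_recursively and durable_rename are borrowed from the discussion; the helper fsync_path and all behavior details (following symlinked directories, returning failures instead of aborting) are assumptions made for this sketch.

```python
import os


def fsync_path(path):
    """fsync one file or directory. Returns True on success, False on error.

    Errors are reported rather than raised, mirroring the thread's remark
    that sync failures are not considered fatal (an assumption of this sketch).
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False
    try:
        os.fsync(fd)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


def fsync_recursively(root):
    """Sync every file under root, then each directory, bottom-up.

    followlinks=True is an assumption so that a symlinked subdirectory
    (for example a WAL directory) is walked too. Returns the list of
    paths that could not be synced.
    """
    failed = []
    for dirpath, _dirnames, filenames in os.walk(
        root, topdown=False, followlinks=True
    ):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not fsync_path(path):
                failed.append(path)
        # Sync the directory itself so its entries are on disk as well.
        if not fsync_path(dirpath):
            failed.append(dirpath)
    return failed


def durable_rename(src, dst):
    """Rename src to dst so that the result survives a crash.

    Steps: flush the source file, rename it, then flush the renamed
    file and its parent directory. Returns True if every sync succeeded.
    """
    ok = fsync_path(src)
    os.replace(src, dst)
    ok = fsync_path(dst) and ok
    parent = os.path.dirname(os.path.abspath(dst))
    ok = fsync_path(parent) and ok
    return ok
```

In this sketch a backup tool would call fsync_recursively on the target directory once after writing everything, which is the cheaper alternative to calling fsync on each file as it is written. Whether directory fsync is supported depends on the platform and file system.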