Re: BUG #14999: pg_rewind corrupts control file global/pg_control - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #14999: pg_rewind corrupts control file global/pg_control
Date
Msg-id 30255.1522711675@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #14999: pg_rewind corrupts control file global/pg_control  (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses Re: BUG #14999: pg_rewind corrupts control file global/pg_control
List pgsql-bugs
Dmitry Dolgov <9erthalion6@gmail.com> writes:
> Ok, I agree. Then yes, this patch can be probably marked as ready.

I started to look over this patch, and soon decided that the
commentary in pg_rewind is impossibly bad :-(.  You need to study
libpq_executeFileMap for a long time before you are even sure which side
of the transfer it's executing on; the fact that it's doing a copy *TO*
the remote server is incredibly misleading, and the comments are not
anywhere near good enough to de-confuse somebody coming upon the code for
the first time.  Not to mention that the function's name conveys nothing
whatever.  Maybe it's not the job of this patch to fix that, but I can't
help thinking that poor design and documentation are a big part of why
this bug exists in the first place.

But enough of that rant.  Now that I've more or less wrapped my head
around what's happening, I think this is the core of the problem:
libpq_executeFileMap truncates every non-data file in the local data
directory before it's fetched anything at all from the remote server.
This seems to me to be utterly cavalier and brittle, because ANY FAILURE
AT ALL in the fetch sequence leaves you with a data directory that's been
trashed in whole or in part --- and that sequence could span a good long
time, if there's a lot to copy.  The originally complained-of problem
(that pg_control gets nuked) is just the tip of that iceberg, and I'm
afraid that the proposed patch is only removing a single potential
failure mode.

I think that a saner design would involve not truncating any particular
target file until we first get some data to write to it.  Christian's
original patch shown at
https://www.postgresql.org/message-id/CAB7nPqTyq3cjphJmoCPc0n8r90O4j1BpZC2BMaj8dKZ7g0-mtw%40mail.gmail.com
seems to be headed in that direction, in that it forces ordering of the
data fetch (which is likely a good idea anyway to ensure locality of
reference) and then performs the truncation when we get the first chunk
of data for a file.

Michael's proposal to have separate error handling for data and config
files is not bad in itself, and it does improve the behavior for one
foreseeable trouble case.  But I think we need the other thing as well,
or some variant of it, before we can regard pg_rewind as anything except
a chainsaw with the safety catch missing.

            regards, tom lane


pgsql-bugs by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: BUG #15112: Unable to run pg_upgrade with earthdistance extension
Next
From: PG Bug reporting form
Date:
Subject: BUG #15141: Faulty ISO 8601 parsing