Thread: [BUGS] pg_rewind fails after failover, 'invalid record length'

[BUGS] pg_rewind fails after failover, 'invalid record length'

From: Stuart Bishop
Date: 15 February 2017, 13:02:35

I have a test case with 3 PostgreSQL 9.5.5 servers, one master and two
hot standbys using standard streaming replication from the master.
wal_log_hints is not enabled, but all systems initialized to use
checksums.

The system is idle. I tear down the master, leaving the two standbys
orphaned at the same point in timeline 1.

I promote one of the standbys to master, switching it to timeline 2. I
shut down the other standby and attempt to run pg_rewind. It fails:

$ /usr/lib/postgresql/9.5/bin/pg_rewind \
    --target-pgdata=/var/lib/postgresql/9.5/main \
    --source-server='dbname=postgres host=10.0.4.212 port=5432 user=_juju_repl'
servers diverged at WAL position 0/5000AE0 on timeline 1

could not find previous WAL record at 0/5000AE0: invalid record length at 0/5000AE0
Failure, exiting

This is what the pg_xlog on the new master looked like at that point:

postgres@juju-4ead0d-11:~/9.5/main/pg_xlog$ ls -al
total 81993
drwx------  3 postgres postgres        9 Feb 15 08:55 .
drwx------ 19 postgres postgres       25 Feb 15 08:55 ..
-rw-------  1 postgres postgres 16777216 Feb 15 07:52 000000010000000000000002
-rw-------  1 postgres postgres 16777216 Feb 15 07:52 000000010000000000000003
-rw-------  1 postgres postgres 16777216 Feb 15 07:52 000000010000000000000004
-rw-------  1 postgres postgres 16777216 Feb 15 08:52 000000010000000000000005.partial
-rw-------  1 postgres postgres       41 Feb 15 08:55 00000002.history
-rw-------  1 postgres postgres 16777216 Feb 15 09:15 000000020000000000000005
drwx------  2 postgres postgres        6 Feb 15 08:55 archive_status
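
For reference, the divergence position 0/5000AE0 falls inside segment 5, which is why both the .partial file on timeline 1 and the first segment on timeline 2 end in ...05. The following is a minimal sketch of that mapping, assuming the default 16 MB WAL segment size; the function names are illustrative and not part of PostgreSQL.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def parse_lsn(lsn: str) -> int:
    """Convert an 'X/Y' WAL position (two hex numbers) to a byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(timeline: int, lsn: str,
                  segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Name of the WAL segment file that contains the given position."""
    segno = parse_lsn(lsn) // segment_size
    # Number of segments per 4 GB "log id" (256 for 16 MB segments).
    segments_per_id = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_id,
        segno % segments_per_id,
    )


if __name__ == "__main__":
    print(wal_file_name(1, "0/5000AE0"))  # 000000010000000000000005
    print(wal_file_name(2, "0/5000AE0"))  # 000000020000000000000005
```

The two printed names match the 000000010000000000000005.partial and 000000020000000000000005 entries in the listing above.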

Reconfiguring the standby to replicate from the new master and
restarting it works fine. The standby happily replicates and switches
to the new timeline. I can shut this standby down and run pg_rewind
again and it works fine.


-- 
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] pg_rewind fails after failover, 'invalid record length'

From: Michael Paquier
Date: 16 February 2017, 05:58:19

On Wed, Feb 15, 2017 at 7:02 PM, Stuart Bishop <stuart@stuartbishop.net> wrote:
> I have a test case with 3 PostgreSQL 9.5.5 servers, one master and two
> hot standbys using standard streaming replication from the master.
> wal_log_hints is not enabled, but all systems initialized to use
> checksums.

The version of pg_rewind in Postgres 9.6 is able to handle timeline
switches, which allows far more flexibility; the one in 9.5 cannot. If
the standby that has been promoted was the most advanced one, there is
actually no need to run pg_rewind on the second standby.
-- 
Michael



Re: [BUGS] pg_rewind fails after failover, 'invalid record length'

From: Stuart Bishop
Date: 16 February 2017, 11:50:48

On 16 February 2017 at 09:58, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Wed, Feb 15, 2017 at 7:02 PM, Stuart Bishop <stuart@stuartbishop.net> wrote:
>> I have a test case with 3 PostgreSQL 9.5.5 servers, one master and two
>> hot standbys using standard streaming replication from the master.
>> wal_log_hints is not enabled, but all systems initialized to use
>> checksums.
>
> The version of pg_rewind in Postgres 9.6 is able to handle timeline
> switches, which allows far more flexibility; the one in 9.5 cannot. If
> the standby that has been promoted was the most advanced one, there is
> actually no need to run pg_rewind on the second standby.

Hmm. Ok.

This is for automation, and I was hoping to cover the race condition
where one or more of the standbys is still able to replicate from the
doomed master at the time of promotion (which unit is more advanced
might have changed between the time I measure the timelines and
restart the remaining standbys pointing to the new master). I think
this means I need a second round of restarts, restarting all the
standbys with no primary_conninfo or restore_command and then
measuring which node is the most advanced. Or is it enough to call
pg_xlog_replay_pause(), and it doesn't matter if a standby receives
more logs from the doomed master as long as it doesn't replay them?

(Or I could skip the promote step entirely and just restart one of the
standbys as master with no timeline switch. But if the doomed master
is still able to ship its WAL files, it could corrupt my backups and be
a worse problem.)
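
Whichever way the standbys are quiesced, the "most advanced node" measurement comes down to comparing WAL positions numerically rather than as strings. Below is a minimal sketch, assuming the positions have already been collected from each standby (on 9.5, for example, with pg_last_xlog_receive_location() or pg_last_xlog_replay_location()). The node names and positions are made up for illustration, and the helper names are hypothetical.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an 'X/Y' WAL position (two hex numbers) to a byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def most_advanced(positions: dict) -> str:
    """Return the node whose WAL position is furthest ahead.

    On a tie, the first node in iteration order wins.
    """
    return max(positions, key=lambda node: parse_lsn(positions[node]))


if __name__ == "__main__":
    # Hypothetical measurements, not taken from the servers in this thread.
    measured = {
        "standby-a": "0/A000000",
        "standby-b": "0/10000000",
    }
    # A plain string comparison would wrongly pick standby-a here.
    print(most_advanced(measured))  # standby-b
```

The comparison is only meaningful if no standby can advance between the measurement and the promotion, which is the race described above.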

-- 
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/
