Re: BUG #15638: pg_basebackup with --wal-method=stream incorrectlygenerates WAL segment created during backup - Mailing list pgsql-bugs
From | Andre Piwoni |
---|---|
Subject | Re: BUG #15638: pg_basebackup with --wal-method=stream incorrectlygenerates WAL segment created during backup |
Date | |
Msg-id | CAEC-+VHa5zh6_p6xEyj2p2WjpdEkFw0pUK4V9AyY4JKL8MvoGw@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #15638: pg_basebackup with --wal-method=stream incorrectlygenerates WAL segment created during backup (Maksim Milyutin <milyutinma@gmail.com>) |
Responses |
Re: BUG #15638: pg_basebackup with --wal-method=stream incorrectlygenerates WAL segment created during backup
Re: BUG #15638: pg_basebackup with --wal-method=stream incorrectlygenerates WAL segment created during backup |
List | pgsql-bugs |
On 2/16/19 6:25 PM, Michael Paquier wrote:
On Sat, Feb 16, 2019 at 12:26:13AM +0000, PG Bug reporting form wrote:When new slave is created by taking base backup from the primary using pg_basebackup with --wal-method=stream option the WAL file generated during the backup is different (as compared with diff or cmp command) than that on the master and in WAL archive directory. Furthermore, this file does not exist in pg_wal/archive_status with .done extension on new slave, though it exists in pg_wal directory, resulting in failed attempt to archive this file when slave node is promoted as master node. 2019-02-15 14:15:58.872 PST [5369] DETAIL: The failed archive command was: test ! -f /mnt/pgsql/archive/000000010000000000000002 && cp pg_wal/000000010000000000000002 /mnt/pgsql/archive/000000010000000000000002How do you promote your standby? In Postgres 10, the last, partial WAL segment of a past timeline generated at promotion is renamed .partial to avoid any conflicts, so as this should normally not happen if you do not use archive_mode = always.
This issue concerns only segment that contains wal records from pg_basebackup. For example, I have reproduced this issue on master branch getting the following content of archive directory after standby promoting:
000000010000000000000001.node1
000000010000000000000002.00000028.backup.node1
000000010000000000000002.node1
000000010000000000000002.node2
000000010000000000000003.node1
000000010000000000000004.node1
000000010000000000000005.partial.node2
000000020000000000000005.node2
00000002.history.node2Each wal segment specifically was appended with .<node> suffix to distinguish its source. Initially the node1 was master, node2 - standby. The segment 000000010000000000000002 has the following content:
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000028, prev 0/015DB060, desc: RUNNING_XACTS nextXid 474 latestCompletedXid 473 oldestRunningXid 474
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 0/02000060, prev 0/02000028, desc: CHECKPOINT_ONLINE redo 0/2000028; tli 1; prev tli 1; fpw true; xid 0:474; oid 12690; multi 1; offset 0; oldest xid 467 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 474; online
rmgr: XLOG len (rec/tot): 34/ 34, tx: 0, lsn: 0/020000D0, prev 0/02000060, desc: BACKUP_END 0/2000028
rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/020000F8, prev 0/020000D0, desc: SWITCH
On the current state of master branch the content of two segments are the same:
$ md5sum 000000010000000000000002.node1 000000010000000000000002.node2
252e3e8cf3a85f218bb467005c565e3c 000000010000000000000002.node1
252e3e8cf3a85f218bb467005c565e3c 000000010000000000000002.node2But on PG 10.6 the contents are differentiated on the tails:
$ cmp 000000010000000000000002.node1 000000010000000000000002.node2
000000010000000000000002.node1 000000010000000000000002.node2 differ: byte 131073, line 1
while logical part (prefix) are the same - the lsn of last record (SWITCH) in this segment is 0/02000130 and have length 24 bytes.
Is it correct behavior? What would be the correct canonical representation of archive command taking into account this issue?
-- Regards, Maksim Milyutin
Andre Piwoni
Sr. Software Developer, BI/Database
WebMD Health Services
Mobile: 801.541.4722
pgsql-bugs by date:
Previous
From: Tom LaneDate:
Subject: Re: BUG #15572: Misleading message reported by "Drop function operation" on DB with functions having same name
Next
From: Julien RouhaudDate:
Subject: Re: BUG #15572: Misleading message reported by "Drop functionoperation" on DB with functions having same name