RE: BUG #16922: In cascading replication, a standby server aborted when an upstream standby server promoted - Mailing list pgsql-bugs
From: egashira.yusuke@fujitsu.com
Subject: RE: BUG #16922: In cascading replication, a standby server aborted when an upstream standby server promoted
Date:
Msg-id: OSBPR01MB502909B028B39670554CD9A2FF629@OSBPR01MB5029.jpnprd01.prod.outlook.com
Responses: Re: BUG #16922: In cascading replication, a standby server aborted when an upstream standby server promoted
List: pgsql-bugs
> The replication between "NODE-A" and "NODE-B" is synchronous replication,
> and between "NODE-B" and "NODE-C" is asynchronous.
>
> "NODE-A" <-[synchronous]-> "NODE-B" <-[non-synchronous]-> "NODE-C"
>
> When the primary server "NODE-A" crashed due to full WAL storage and
> "NODE-B" promoted, the downstream standby server "NODE-C" aborted with
> the following messages.
>
> 2021-03-11 11:26:28.470 JST [85228] LOG: invalid contrecord length 26 at 0/5FFFFF0
> 2021-03-11 11:26:28.470 JST [85232] FATAL: terminating walreceiver process due to administrator command
> 2021-03-11 11:26:28.470 JST [85228] PANIC: could not open file "pg_wal/000000020000000000000005": No such file or directory
> 2021-03-11 11:26:28.492 JST [85260] LOG: started streaming WAL from primary at 0/5000000 on timeline 2
> 2021-03-11 11:26:29.260 JST [85227] LOG: startup process (PID 85228) was terminated by signal 6: Aborted

I would like to clarify the conditions under which this abort occurs so that I can explain them to the customer.

Based on the pg_waldump results, I think the conditions are as follows:

1) A partially written WAL record (one that continues into the following segment file) is recorded at the end of a WAL segment file, and
2) the WAL segment file of 1) is the last WAL segment file that the standby server received, and
3) the standby server was promoted.

I think the above conditions are met only when the primary server crashed because its WAL storage was full. Is my understanding correct?
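As a concrete way to check conditions 1) and 2), I am thinking of something like the rough sketch below. It is only a sketch: the path and segment name are taken from the reproduction in the original report quoted below, and the default 16MB wal_segment_size is assumed.

# Look at the last record pg_waldump can still decode in the last WAL
# segment the standby received (path/segment name from the reproduction).
LAST_SEG=/tmp/nodeb/pg_wal/000000010000000000000005
pg_waldump "$LAST_SEG" | tail -n 1

# In the reproduced case this prints the record at lsn 0/05FFFFA8 with
# len 71 (72 bytes after alignment), so the next record starts at
# 0/05FFFFF0, only 16 bytes before the 0/06000000 segment boundary.
# A whole record cannot fit in those 16 bytes (the record header alone is
# 24 bytes), so it must continue into the next segment, which matches the
# "invalid contrecord length 26 at 0/5FFFFF0" message seen on NODE-C.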
Regards,
Yusuke Egashira.

-----Original Message-----
From: PG Bug reporting form <noreply@postgresql.org>
Sent: Thursday, March 11, 2021 5:06 PM
To: pgsql-bugs@lists.postgresql.org
Cc: Egashira, Yusuke/江頭 勇佑 <egashira.yusuke@fujitsu.com>
Subject: BUG #16922: In cascading replication, a standby server aborted when an upstream standby server promoted

The following bug has been logged on the website:

Bug reference:      16922
Logged by:          yusuke egashira
Email address:      egashira.yusuke@fujitsu.com
PostgreSQL version: 13.2
Operating system:   Red Hat Enterprise Linux 8
Description:

Hi.

My customer uses PostgreSQL with cascading replication. The replication between "NODE-A" and "NODE-B" is synchronous replication, and between "NODE-B" and "NODE-C" is asynchronous.

"NODE-A" <-[synchronous]-> "NODE-B" <-[non-synchronous]-> "NODE-C"

When the primary server "NODE-A" crashed due to full WAL storage and "NODE-B" promoted, the downstream standby server "NODE-C" aborted with the following messages.

2021-03-11 11:26:28.470 JST [85228] LOG: invalid contrecord length 26 at 0/5FFFFF0
2021-03-11 11:26:28.470 JST [85232] FATAL: terminating walreceiver process due to administrator command
2021-03-11 11:26:28.470 JST [85228] PANIC: could not open file "pg_wal/000000020000000000000005": No such file or directory
2021-03-11 11:26:28.492 JST [85260] LOG: started streaming WAL from primary at 0/5000000 on timeline 2
2021-03-11 11:26:29.260 JST [85227] LOG: startup process (PID 85228) was terminated by signal 6: Aborted
2021-03-11 11:26:29.260 JST [85227] LOG: terminating any other active server processes
2021-03-11 11:26:29.290 JST [85227] LOG: database system is shut down

I found this behavior in PostgreSQL 13.2, 12.6, 11.11, 10.16, 9.6.21 and 9.5.24.

The procedure to reproduce it on PostgreSQL 13.2 on RHEL8 is:
===========================
# Create a temporary WAL data device.
mkdir -p /tmp/wal
dd if=/dev/zero of=/tmp/waldata bs=100MB count=1
sudo mkfs -t xfs /tmp/waldata
sudo mount -t xfs -o loop /tmp/waldata /tmp/wal
sudo chown postgres:postgres /tmp/wal
# mkdir -p /tmp/wal/pg_wal

# Create primary NODE-A.
initdb -D /tmp/nodea -X /tmp/wal/pg_wal
cat >>/tmp/nodea/postgresql.conf <<_EOF_
port=5432
synchronous_standby_names = 'nodeb'
restart_after_crash = off
wal_keep_size = 10GB
_EOF_
pg_ctl -D /tmp/nodea start -w -l /tmp/nodea_logfile

# Create standby NODE-B.
pg_basebackup -D /tmp/nodeb -R -X fetch -d "postgresql:///:5432?application_name=nodeb"
echo "port = 5433" >> /tmp/nodeb/postgresql.conf
pg_ctl -D /tmp/nodeb start -w -l /tmp/nodeb_logfile

# Create standby NODE-C.
pg_basebackup -D /tmp/nodec -p 5433 -R -X fetch -d "postgresql:///:5433?application_name=nodec"
echo "port = 5434" >> /tmp/nodec/postgresql.conf
pg_ctl -D /tmp/nodec start -w

# Execute SQL to fill up the WAL device.
createdb -p 5432
cat <<_psql | psql -p 5432
create table test (a int, b text);
insert into test values (generate_series(1,10000000), 'abcdefghijk');
_psql

# ** After primary NODE-A has gone down **
pg_ctl -D /tmp/nodeb promote
# At this point, NODE-C will go down.
===========================
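To confirm that NODE-A actually went down because the WAL device filled up (which this reproduction relies on), a quick check such as the following should be enough; it assumes the mount point and log file paths used in the steps above.

# Remaining space on the dedicated WAL device, and NODE-A's crash message.
df -h /tmp/wal
grep "No space left on device" /tmp/nodea_logfile | tail -n 3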
The stack trace of the core dump from a debug build is as follows:
# This stack trace was taken in a different environment than the other attached logs.

#0  0x00007f73e3f1f8df in raise () from /lib64/libc.so.6
#1  0x00007f73e3f09cf5 in abort () from /lib64/libc.so.6
#2  0x0000000000878c83 in errfinish (filename=<optimized out>, filename@entry=0x8cc67e "xlog.c", lineno=lineno@entry=3741, funcname=funcname@entry=0x8d45c8 <__func__.28536> "XLogFileRead") at elog.c:592
#3  0x0000000000507730 in XLogFileRead (segno=<optimized out>, tli=2, source=source@entry=XLOG_FROM_STREAM, notfoundOk=notfoundOk@entry=false, emode=22) at xlog.c:3739
#4  0x0000000000509c62 in WaitForWALToBecomeAvailable (tliRecPtr=134217712, fetching_ckpt=<optimized out>, randAccess=false, RecPtr=117448704) at xlog.c:12411
#5  XLogPageRead (xlogreader=0x282b478, targetPagePtr=<optimized out>, reqLen=8192, targetRecPtr=134217712, readBuf=0x2858af8 "\006\321\a") at xlog.c:11908
#6  0x0000000000517e25 in ReadPageInternal (state=state@entry=0x282b478, pageptr=pageptr@entry=134209536, reqLen=reqLen@entry=8192) at xlogreader.c:608
#7  0x000000000051886a in XLogReadRecord (state=state@entry=0x282b478, errormsg=errormsg@entry=0x7ffd148db3f8) at xlogreader.c:329
#8  0x000000000050797e in ReadRecord (xlogreader=xlogreader@entry=0x282b478, emode=emode@entry=15, fetching_ckpt=fetching_ckpt@entry=false) at xlog.c:4346
#9  0x000000000050e205 in ReadRecord (fetching_ckpt=false, emode=15, xlogreader=0x282b478) at xlog.c:4340
#10 StartupXLOG () at xlog.c:7385
#11 0x00000000006f72d1 in StartupProcessMain () at startup.c:204
#12 0x000000000051cf5b in AuxiliaryProcessMain (argc=argc@entry=2, argv=argv@entry=0x7ffd148df240) at bootstrap.c:443
#13 0x00000000006f343f in StartChildProcess (type=StartupProcess) at postmaster.c:5492
#14 0x00000000006f6938 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x282acc0) at postmaster.c:1404
#15 0x000000000047ecd9 in main (argc=3, argv=0x282acc0) at main.c:210

I also attach the debug5-level log of NODE-C.

2021-03-11 11:06:39.136 JST [84342] LOG: replication terminated by primary server
2021-03-11 11:06:39.136 JST [84342] DETAIL: End of WAL reached on timeline 1 at 0/6000000.
2021-03-11 11:06:39.140 JST [84342] LOG: fetching timeline history file for timeline 2 from primary server
2021-03-11 11:06:39.148 JST [84342] DEBUG: walreceiver ended streaming and awaits new instructions
2021-03-11 11:06:39.148 JST [84338] LOG: new target timeline is 2
2021-03-11 11:06:39.148 JST [84338] DEBUG: switched WAL source from stream to archive after failure
2021-03-11 11:06:39.148 JST [84338] DEBUG: could not open file "pg_wal/000000020000000000000006": No such file or directory
2021-03-11 11:06:39.148 JST [84338] DEBUG: switched WAL source from archive to stream after failure
2021-03-11 11:06:39.148 JST [84342] LOG: restarted WAL streaming at 0/6000000 on timeline 2
2021-03-11 11:06:39.148 JST [84342] DEBUG: sendtime 2021-03-11 11:06:39.148674+09 receipttime 2021-03-11 11:06:39.148723+09 replication apply delay 17910 ms transfer latency 1 ms
2021-03-11 11:06:39.148 JST [84342] DEBUG: creating and filling new WAL file
2021-03-11 11:06:39.240 JST [84342] DEBUG: done creating and filling new WAL file
2021-03-11 11:06:39.240 JST [84342] DEBUG: sending write 0/6000048 flush 0/5FFFFF0 apply 0/5FFFFF0
2021-03-11 11:06:39.242 JST [84342] DEBUG: sending write 0/6000048 flush 0/6000048 apply 0/5FFFFF0
2021-03-11 11:06:39.486 JST [84342] DEBUG: sendtime 2021-03-11 11:06:39.486387+09 receipttime 2021-03-11 11:06:39.486449+09 replication apply delay 18248 ms transfer latency 1 ms
2021-03-11 11:06:39.486 JST [84342] DEBUG: sending write 0/6000080 flush 0/6000048 apply 0/5FFFFF0
2021-03-11 11:06:39.486 JST [84342] DEBUG: sending write 0/6000080 flush 0/6000080 apply 0/5FFFFF0
2021-03-11 11:06:39.489 JST [84338] LOG: invalid contrecord length 26 at 0/5FFFFF0
2021-03-11 11:06:39.489 JST [84342] FATAL: terminating walreceiver process due to administrator command
2021-03-11 11:06:39.489 JST [84342] DEBUG: shmem_exit(1): 1 before_shmem_exit callbacks to make
2021-03-11 11:06:39.489 JST [84342] DEBUG: shmem_exit(1): 6 on_shmem_exit callbacks to make
2021-03-11 11:06:39.489 JST [84342] DEBUG: proc_exit(1): 2 callbacks to make
2021-03-11 11:06:39.489 JST [84342] DEBUG: exit(1)
2021-03-11 11:06:39.489 JST [84342] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2021-03-11 11:06:39.489 JST [84342] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2021-03-11 11:06:39.489 JST [84342] DEBUG: proc_exit(-1): 0 callbacks to make
2021-03-11 11:06:39.489 JST [84338] DEBUG: switched WAL source from stream to archive after failure
2021-03-11 11:06:39.489 JST [84338] DEBUG: could not open file "pg_wal/000000020000000000000005": No such file or directory
2021-03-11 11:06:39.489 JST [84338] DEBUG: switched WAL source from archive to stream after failure
2021-03-11 11:06:39.489 JST [84338] PANIC: could not open file "pg_wal/000000020000000000000005": No such file or directory
2021-03-11 11:06:39.501 JST [84337] DEBUG: reaping dead processes
2021-03-11 11:06:39.513 JST [84367] DEBUG: find_in_dynamic_libpath: trying "/home/egashira/pgsql/lib/libpqwalreceiver"
2021-03-11 11:06:39.513 JST [84367] DEBUG: find_in_dynamic_libpath: trying "/home/egashira/pgsql/lib/libpqwalreceiver.so"
2021-03-11 11:06:39.541 JST [84367] LOG: started streaming WAL from primary at 0/5000000 on timeline 2
2021-03-11 11:06:39.542 JST [84367] DEBUG: sendtime 2021-03-11 11:06:39.536374+09 receipttime 2021-03-11 11:06:39.54219+09 replication apply delay 56 ms transfer latency 6 ms
2021-03-11 11:06:39.542 JST [84367] DEBUG: creating and filling new WAL file
2021-03-11 11:06:39.696 JST [84367] DEBUG: done creating and filling new WAL file
In addition, this issue may not occur depending on the timing. When this issue didn't occur, the last WAL record of NODE-A was not transferred to NODE-B.

I attach the result of pg_waldump for NODE-A, NODE-B and NODE-C.

[Reproduced]
$ pg_waldump nodea/pg_wal/000000010000000000000005 | tail -n 4
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFED0, prev 0/05FFFE88, desc: INSERT off 181 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF18, prev 0/05FFFED0, desc: INSERT off 182 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF60, prev 0/05FFFF18, desc: INSERT off 183 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFFA8, prev 0/05FFFF60, desc: INSERT off 184 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759

$ pg_waldump nodeb/pg_wal/000000010000000000000005 | tail -n 4
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFED0, prev 0/05FFFE88, desc: INSERT off 181 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF18, prev 0/05FFFED0, desc: INSERT off 182 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF60, prev 0/05FFFF18, desc: INSERT off 183 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFFA8, prev 0/05FFFF60, desc: INSERT off 184 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759

$ pg_waldump nodec/pg_wal/000000010000000000000005 | tail -n 4
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFED0, prev 0/05FFFE88, desc: INSERT off 181 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF18, prev 0/05FFFED0, desc: INSERT off 182 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF60, prev 0/05FFFF18, desc: INSERT off 183 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFFA8, prev 0/05FFFF60, desc: INSERT off 184 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759

[Not reproduced]
$ pg_waldump nodea/pg_wal/000000010000000000000005 | tail -n 4
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFED0, prev 0/05FFFE88, desc: INSERT off 181 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF18, prev 0/05FFFED0, desc: INSERT off 182 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFF60, prev 0/05FFFF18, desc: INSERT off 183 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05FFFFA8, prev 0/05FFFF60, desc: INSERT off 184 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3759

$ pg_waldump nodeb/pg_wal/000000010000000000000005 | tail -n 4
pg_waldump: fatal: error in WAL record at 0/5BFFF88: invalid magic number 0000 in log segment 000000010000000000000005, offset 12582912
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFEB0, prev 0/05BFFE68, desc: INSERT off 2 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFEF8, prev 0/05BFFEB0, desc: INSERT off 3 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFF40, prev 0/05BFFEF8, desc: INSERT off 4 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFF88, prev 0/05BFFF40, desc: INSERT off 5 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446

$ pg_waldump nodec/pg_wal/000000010000000000000005 | tail -n 4
pg_waldump: fatal: error in WAL record at 0/5BFFF88: invalid magic number 0000 in log segment 000000010000000000000005, offset 12582912
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFEB0, prev 0/05BFFE68, desc: INSERT off 2 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFEF8, prev 0/05BFFEB0, desc: INSERT off 3 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFF40, prev 0/05BFFEF8, desc: INSERT off 4 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446
rmgr: Heap len (rec/tot): 71/ 71, tx: 487, lsn: 0/05BFFF88, prev 0/05BFFF40, desc: INSERT off 5 flags 0x00, blkref #0: rel 1663/16384/16385 blk 3446
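For reference, the per-node outputs above were collected one command at a time; a small loop such as the following produces the same comparison in one go. This is only a rough sketch, assuming the /tmp/nodea, /tmp/nodeb and /tmp/nodec data directories from the reproduction procedure above and that 000000010000000000000005 is the segment of interest on every node.

# Compare the tail of the last shared segment across all three nodes.
for node in nodea nodeb nodec; do
    echo "== $node =="
    # Note: pg_waldump reports decode errors (such as the "invalid magic
    # number" above) on stderr, so they appear outside the tail output.
    pg_waldump /tmp/$node/pg_wal/000000010000000000000005 | tail -n 4
done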