pg_rewind test race condition..? - Mailing list pgsql-hackers
| From | Stephen Frost |
|---|---|
| Subject | pg_rewind test race condition..? |
| Date | |
| Msg-id | 20150428180253.GU30322@tamriel.snowman.net Whole thread Raw |
| Responses |
Re: pg_rewind test race condition..?
|
| List | pgsql-hackers |
Heikki,
Not sure if anyone else is seeing this, but I'm getting regression test failures when running the pg_rewind tests
prettyconsistently with 'make check'. Specifically with "basic remote", I'm getting:
source and target cluster are on the same timeline
Failure, exiting
in regress_log/pg_rewind_log_basic_remote.
If I throw a "sleep(5);" into t/001_basic.pl before the call to RewindTest::run_pg_rewind($test_mode); then everything
worksfine.
./configure --silent --prefix="$INSTALL" --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl
--enable-debug--enable-cassert --enable-tap-tests --with-gssapi
Full unsuccessful pg_rewind_log_basic_remote output:
-----------------------------------------------------------------
waiting for server to start....LOG: database system was shut down at 2015-04-28 13:46:34 EDT
LOG: database system is ready to accept connectionsdone
server started
waiting for server to start....LOG: database system was interrupted; last known up at 2015-04-28 13:46:35 EDT
LOG: entering standby mode
LOG: redo starts at 0/2000028
LOG: consistent recovery state reached at 0/20000F8
LOG: database system is ready to accept read only connections
LOG: started streaming WAL from primary at 0/3000000 on timeline 1done
server started
server promoting
LOG: received promote request
FATAL: terminating walreceiver process due to administrator command
LOG: invalid record length at 0/30EB410
LOG: redo done at 0/30EB3A0
LOG: last completed transaction was at log time 2015-04-28 13:46:37.540952-04
LOG: selected new timeline ID: 2
LOG: archive recovery complete
LOG: database system is ready to accept connections
waiting for server to shut down....LOG: received fast shutdown request
LOG: aborting any active transactions
LOG: shutting down
LOG: database system is shut downdone
server stopped
source and target cluster are on the same timeline
Failure, exiting
waiting for server to start....LOG: database system was shut down at 2015-04-28 13:46:39 EDT
LOG: entering standby mode
LOG: consistent recovery state reached at 0/3311438
LOG: invalid record length at 0/3311438
LOG: database system is ready to accept read only connections
LOG: fetching timeline history file for timeline 2 from primary server
LOG: started streaming WAL from primary at 0/3000000 on timeline 1
LOG: replication terminated by primary server
DETAIL: End of WAL reached on timeline 1 at 0/30EB410.
LOG: new timeline 2 forked off current database system timeline 1 before current recovery point 0/3311438
LOG: restarted WAL streaming at 0/3000000 on timeline 1
LOG: replication terminated by primary server
DETAIL: End of WAL reached on timeline 1 at 0/30EB410.
LOG: new timeline 2 forked off current database system timeline 1 before current recovery point 0/3311438done
server started
LOG: received immediate shutdown request
LOG: received immediate shutdown request
-----------------------------------------------------------------
Full successful pg_rewind_log_basic_remote output:
-----------------------------------------------------------------
waiting for server to start....LOG: database system was shut down at 2015-04-28 13:54:04 EDT
LOG: database system is ready to accept connectionsdone
server started
waiting for server to start....LOG: database system was interrupted; last known up at 2015-04-28 13:54:05 EDT
LOG: entering standby mode
LOG: redo starts at 0/2000028
LOG: consistent recovery state reached at 0/20000F8
LOG: database system is ready to accept read only connections
LOG: started streaming WAL from primary at 0/3000000 on timeline 1done
server started
server promoting
LOG: received promote request
FATAL: terminating walreceiver process due to administrator command
LOG: invalid record length at 0/30EB410
LOG: redo done at 0/30EB3A0
LOG: last completed transaction was at log time 2015-04-28 13:54:07.547533-04
LOG: selected new timeline ID: 2
LOG: archive recovery complete
LOG: database system is ready to accept connections
waiting for server to shut down....LOG: received fast shutdown request
LOG: aborting any active transactions
LOG: shutting down
LOG: database system is shut downdone
server stopped
The servers diverged at WAL position 0/30EB410 on timeline 1.
Rewinding from last common checkpoint at 0/30EB3A0 on timeline 1
Done!
waiting for server to start....LOG: database system was interrupted while in recovery at log time 2015-04-28 13:54:08
EDT
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier
recoverytarget.
LOG: entering standby mode
LOG: redo starts at 0/30EB368
LOG: consistent recovery state reached at 0/3113788
LOG: invalid record length at 0/3113788
LOG: database system is ready to accept read only connections
LOG: started streaming WAL from primary at 0/3000000 on timeline 2done
server started
LOG: received immediate shutdown request
LOG: unexpected EOF on standby connection
LOG: received immediate shutdown request
-----------------------------------------------------------------
Thanks!
Stephen
pgsql-hackers by date: