pgsql: Fix timing-dependent failure in recovery test 004_timeline_switc - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Fix timing-dependent failure in recovery test 004_timeline_switc
Msg-id E1vGYGC-0057PL-0u@gemulon.postgresql.org
List pgsql-committers
Fix timing-dependent failure in recovery test 004_timeline_switch

The test introduced by 17b2d5ec759c verifies that a WAL receiver
survives across a timeline jump by searching the server logs for
termination messages.  However, it called restart() before the timeline
switch, which kills the WAL receiver and may log the exact message being
checked, causing the test to fail.  As TAP tests reuse the same log file
across restarts, rotate_logfile() is now called before the restart so
that the log-matching check is not affected by entries generated by the
previous shutdown.
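
The effect of rotating the log before a restart can be illustrated with
a small standalone model.  This is only a sketch in Python, not the
PostgreSQL TAP framework: FakeNode, its methods, run_scenario() and the
message text are hypothetical names chosen for the example, and the
model assumes that the shutdown entries land in the file that was in
use before the restart.

```python
import os
import tempfile

# Hypothetical message text, used only for this illustration.
TERMINATION_MSG = "terminating walreceiver process"


class FakeNode:
    """Toy stand-in for a test node that logs to a file (not the real API)."""

    def __init__(self, log_dir):
        self.log_dir = log_dir
        self.generation = 0
        # File the next server start will write to.
        self.next_logfile = self._path()
        # File the currently running server writes to.
        self.active_logfile = self.next_logfile

    def _path(self):
        return os.path.join(self.log_dir, f"node_{self.generation}.log")

    def _write(self, path, line):
        with open(path, "a") as f:
            f.write(line + "\n")

    def rotate_logfile(self):
        # Only affects the next start; the running server keeps its file.
        self.generation += 1
        self.next_logfile = self._path()

    def restart(self):
        # Shutdown stops the WAL receiver, which logs the message under test.
        self._write(self.active_logfile, TERMINATION_MSG)
        self.active_logfile = self.next_logfile
        self._write(self.active_logfile, "server started")

    def log_contains(self, pattern):
        # Search only the file the test currently considers "the" log.
        if not os.path.exists(self.next_logfile):
            return False
        with open(self.next_logfile) as f:
            return pattern in f.read()


def run_scenario(rotate):
    """Return True if the log check would see the termination message."""
    with tempfile.TemporaryDirectory() as log_dir:
        node = FakeNode(log_dir)
        if rotate:
            node.rotate_logfile()
        node.restart()
        return node.log_contains(TERMINATION_MSG)
```

In this model, run_scenario(False) finds the stale shutdown message in
the shared file, while run_scenario(True) does not, because the
shutdown entry stays in the older file.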

Recent changes to file handle inheritance altered I/O timing enough to
make this fail consistently while testing another patch.

While at it, this adds an extra check based on a PID comparison.  This
check may produce false positives, as the WAL receiver could have
processed a timeline jump before the initial PID is grabbed, but it
should be good enough in most cases.
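
The PID comparison and its blind spot can be sketched as follows.  This
is an illustrative Python model under stated assumptions, not the code
of the test: FakeStandby, receiver_survived() and check() are
hypothetical names, and a WAL receiver restart is modelled simply as a
change of PID.

```python
from typing import Optional


class FakeStandby:
    """Toy standby whose WAL receiver gets a new PID on each restart."""

    def __init__(self) -> None:
        self.wal_receiver_pid = 1000

    def restart_wal_receiver(self) -> None:
        # A restarted receiver is a new process, hence a new PID.
        self.wal_receiver_pid += 1


def receiver_survived(pid_before: Optional[int],
                      pid_after: Optional[int]) -> bool:
    """True when the same process is seen on both sides of the switch."""
    return pid_before is not None and pid_before == pid_after


def check(restart_before_capture: bool, restart_after_capture: bool) -> bool:
    """Run the PID comparison around a simulated timeline switch."""
    standby = FakeStandby()
    if restart_before_capture:
        # Happens before the initial PID is grabbed: invisible to the check.
        standby.restart_wal_receiver()
    pid_before = standby.wal_receiver_pid
    if restart_after_capture:
        # Happens inside the observed window: the check detects it.
        standby.restart_wal_receiver()
    pid_after = standby.wal_receiver_pid
    return receiver_survived(pid_before, pid_after)
```

Here check(False, True) reports a failure as intended, whereas
check(True, False) still passes: anything that happened before the
first PID was captured cannot be seen by the comparison.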

Like 17b2d5ec759c, backpatch down to v13.

Author: Bryan Green <dbryan.green@gmail.com>
Co-authored-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/9d00b597-d64a-4f1e-802e-90f9dc394c70@gmail.com
Backpatch-through: 13

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/a4fd971c6f534aff96a1b3aab61d8a498b6b4ac5

Modified Files
--------------
src/test/recovery/t/004_timeline_switch.pl | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

