Hi All,
Attaching a patch series to support a new feature that lets pg_waldump
decode WAL files directly from a tar archive. This work addresses a
limitation in pg_verifybackup[1], which cannot parse WAL files from
tar-formatted backups.
The implementation will align with pg_waldump's existing xlogreader
design, which uses three callback functions to manage WAL segments:
open, read, and close. For tar archives, however, the approach will be
simpler. Instead of using separate callbacks for opening and closing,
the tar archive will be opened once at the start and closed explicitly
at the end.
The core logic will be in the WAL page reading callback. When
xlogreader requests a new WAL page, this callback will be invoked. It
will then call the archive streamer routine to read the WAL data from
the tar archive into a buffer. This data will then be copied into
xlogreader's own buffer, completing the read.
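As a rough illustration of the copy step described above, here is a
minimal, self-contained sketch. It is not code from the patch: the
SketchArchiveBuf struct, sketch_read_page() and SKETCH_PAGE_SIZE are
hypothetical stand-ins (for the streamer's output buffer, the page read
callback, and XLOG_BLCKSZ respectively), and plain integers are used in
place of XLogRecPtr so that it compiles without PostgreSQL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192	/* stand-in for XLOG_BLCKSZ */

/* Hypothetical: WAL bytes the archive streamer has produced so far. */
typedef struct SketchArchiveBuf
{
	const char *data;			/* WAL bytes extracted from the tar member */
	uint64_t	start_pos;		/* WAL position corresponding to data[0] */
	size_t		len;			/* number of valid bytes in data */
} SketchArchiveBuf;

/*
 * Copy up to one page, starting at target_page, into read_buf.
 *
 * Returns the number of bytes copied, or -1 if the request cannot be
 * satisfied from the buffered data.  A real callback would presumably
 * ask the streamer for more data at that point instead of failing.
 */
int
sketch_read_page(const SketchArchiveBuf *src, uint64_t target_page,
				 int req_len, char *read_buf)
{
	uint64_t	offset;
	size_t		avail;
	size_t		count;

	assert(src != NULL && read_buf != NULL);

	if (req_len < 0 || req_len > SKETCH_PAGE_SIZE)
		return -1;
	if (target_page < src->start_pos)
		return -1;				/* position is before the buffered data */

	offset = target_page - src->start_pos;
	if (offset >= src->len)
		return -1;				/* position has not been streamed yet */

	avail = src->len - (size_t) offset;
	if (avail < (size_t) req_len)
		return -1;				/* fewer bytes buffered than requested */

	/* Hand back at most one page, as the reader works page by page. */
	count = (avail < SKETCH_PAGE_SIZE) ? avail : SKETCH_PAGE_SIZE;
	memcpy(read_buf, src->data + offset, count);
	return (int) count;
}
```

The return convention (byte count on success, -1 on failure) mirrors
what xlogreader expects from a page read callback; everything else here
is an assumption made for the sake of the example.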
Essentially, this is plumbing work: the new code will be responsible
for getting WAL data from the tar archive and feeding it to the
existing xlogreader. All other WAL page and record decoding logic,
which is already robust within xlogreader, will be reused as is.
This feature is implemented as a series of patches:
- 0001-0004: These patches are dedicated to refactoring and minor code
changes.
- 0005: This patch introduces the core functionality for pg_waldump to
read WAL from a tar archive using the same archive streamer
(fe_utils/astreamer.h) used in pg_verifybackup. This version requires
WAL files in the archive to be in sequential order.
- 0006: This patch removes the sequential order restriction. If
pg_waldump encounters an out-of-order WAL file, it writes the file to
a temporary directory, continues decoding, and reads the file back
from that temporary location later.
- 0007 and onwards: These patches update pg_verifybackup to remove the
restriction on WAL parsing for tar-formatted backups. Patch 0008 renames
the "--wal-directory" switch to "--wal-path" to make it more generic,
allowing it to accept either a directory path or a tar archive path.
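To make the out-of-order handling more concrete, here is a small sketch
of how an archive member could be classified by its segment number. The
names SketchAction, sketch_parse_wal_name() and sketch_classify() are
hypothetical and not taken from the patch. The file-name arithmetic
follows the usual 24-hex-digit WAL segment naming (timeline, log,
segment). Skipping segments older than the one currently wanted is an
assumption on my part, and member names carrying a directory prefix are
not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical outcomes for an archive member; not names from the patch. */
typedef enum SketchAction
{
	SKETCH_DECODE_NOW,			/* the segment the reader is waiting for */
	SKETCH_SPILL_TO_TEMP,		/* a later segment: save it for later */
	SKETCH_SKIP					/* an earlier segment: assumed not needed */
} SketchAction;

/*
 * Parse a 24-hex-digit WAL segment file name into a timeline ID and a
 * segment number.  Returns 1 on success, 0 if the name is malformed.
 */
int
sketch_parse_wal_name(const char *fname, uint64_t segsz,
					  uint32_t *tli, uint64_t *segno)
{
	unsigned int t;
	unsigned int log;
	unsigned int seg;

	assert(fname != NULL && tli != NULL && segno != NULL);

	if (segsz == 0)
		return 0;
	if (strlen(fname) != 24 ||
		strspn(fname, "0123456789ABCDEF") != 24)
		return 0;
	if (sscanf(fname, "%8X%8X%8X", &t, &log, &seg) != 3)
		return 0;

	*tli = t;
	/* Each "log" unit covers 4GB of WAL, split into segsz-sized segments. */
	*segno = (uint64_t) log * (UINT64_C(0x100000000) / segsz) + seg;
	return 1;
}

/* Decide what to do with a member, given the segment currently wanted. */
SketchAction
sketch_classify(uint64_t member_segno, uint64_t wanted_segno)
{
	if (member_segno == wanted_segno)
		return SKETCH_DECODE_NOW;
	if (member_segno > wanted_segno)
		return SKETCH_SPILL_TO_TEMP;
	return SKETCH_SKIP;
}
```

With 16MB segments, a member named 000000010000000000000005 arriving
while segment 3 is wanted would be classified as SKETCH_SPILL_TO_TEMP
under this scheme.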
-----------------------------------
Known Issues & Status:
-----------------------------------
- Timeline Switching: The current implementation in patch 0006 does not
correctly handle timeline switching. This is a known issue, especially
when a timeline change occurs in a WAL file that has been written to a
temporary location.
- Testing: Local regression tests on CentOS and macOS M4 are passing.
However, some tests on macOS Sonoma (specifically 008_untar.pl and
010_client_untar.pl) are failing in the GitHub workflow with a "WAL
parsing failed for timeline 1" error. This issue is currently being
investigated.
Please take a look at the attached patches and let me know your
thoughts. This is an initial version, and I am making incremental
improvements to address the known issues and limitations.
[1] https://git.postgresql.org/pg/commitdiff/8dfd3129027969fdd2d9d294220c867d2efd84aa
--
Regards,
Amul Sul
EDB: http://www.enterprisedb.com