Re: Recovery of .partial WAL segments - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Recovery of .partial WAL segments
Date
Msg-id CAEze2Wi30VZDKxaR0wRGNx7LJ=heiwPpALhLFTu2Lg86QanCXA@mail.gmail.com
In response to [MASSMAIL]Recovery of .partial WAL segments  (Stefan Fercot <stefan.fercot@protonmail.com>)
List pgsql-hackers
On Fri, 5 Apr 2024 at 11:45, Stefan Fercot <stefan.fercot@protonmail.com> wrote:
>
> Dear hackers,
>
> Generating a ".partial" WAL segment is pretty common nowadays (using pg_receivewal or during standby promotion).
> However, we currently don't do anything with it unless the user manually removes that ".partial" extension.
>
> The 028_pitr_timelines tests are highlighting that fact: with test data being in 000000020000000000000003 and
> 000000010000000000000003.partial, a recovery following the latest timeline (2) will succeed but fail if we follow the
> current timeline (1).
>
> By simply trying to fetch the ".partial" file in XLogFileRead, we can easily recover more data and also cover that
> (current timeline) recovery case.
>
> So, this proposed patch makes XLogFileRead try to restore ".partial" WAL archives and adds a test to
> 028_pitr_timelines using current recovery_target_timeline.

Does this path only get hit when we don't already have any WAL
segments (or partial segments) left for that timeline? I'm a bit
worried about overwriting existing (partial) segments that may have
more WAL than what we can get from archives.

(patch v2)
> +            restoredArchivedFile = !RestoreArchivedFile(path, xlogfname,
> +                                                        "RECOVERYXLOG",
> +                                                        wal_segment_size,
> +                                                        InRedo) &&
> +                !RestoreArchivedFile(path, partialxlogfname,
>                                      "RECOVERYXLOG",
>                                      wal_segment_size,
> -                                     InRedo))
> +                                     InRedo);

The value of restoredArchivedFile is inverted relative to what its name
indicates: it is true when we failed to restore an archived xlog segment
(neither the full nor the .partial file), and false when we did succeed.
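
To make the inversion concrete, here is a minimal standalone sketch. It is not the PostgreSQL code: restore_archived_file() is a hypothetical stand-in for RestoreArchivedFile(), assumed to return true on success, and the "archive" is a one-entry array holding only the .partial file.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical fake archive: only the .partial segment is available. */
static const char *fake_archive[] = {"000000010000000000000003.partial"};

/* Hypothetical stand-in for RestoreArchivedFile(); true means "restored". */
static bool
restore_archived_file(const char *fname)
{
    for (size_t i = 0; i < sizeof(fake_archive) / sizeof(fake_archive[0]); i++)
    {
        if (strcmp(fake_archive[i], fname) == 0)
            return true;
    }
    return false;
}

/* Shape of the expression in the quoted v2 hunk: true only if BOTH fail. */
static bool
restored_v2(const char *full, const char *partial)
{
    return !restore_archived_file(full) && !restore_archived_file(partial);
}

/* What the variable name suggests: true if EITHER restore succeeds. */
static bool
restored_fixed(const char *full, const char *partial)
{
    return restore_archived_file(full) || restore_archived_file(partial);
}
```

With only the .partial file in the fake archive, restored_v2() reports false even though a file was restored, while restored_fixed() reports true. The || form also keeps the short-circuit behaviour: the .partial restore is only attempted when the full segment is missing.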

I'm also not a fan of the additional allocation of partialxlogfname in
this code. It could well do without, by "just" reusing the xlogfname
scratch space when we fail to recover the full segment.
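
Roughly what I have in mind, as a standalone sketch rather than a patch: FNAME_BUF_LEN, restore_stub() and restore_full_or_partial() are hypothetical names, and the stub simply pretends that only ".partial" files exist in the archive. The only assumption about the real code is that the name buffer has room for the extra suffix.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed scratch size: room for a 24-character segment name plus ".partial". */
#define FNAME_BUF_LEN 64

/* Hypothetical restore call: pretend only ".partial" files are archived. */
static bool
restore_stub(const char *fname)
{
    return strstr(fname, ".partial") != NULL;
}

/*
 * Try the full segment first; on failure, append ".partial" to the same
 * buffer and retry. No second filename buffer is allocated.
 */
static bool
restore_full_or_partial(char *fname, size_t buflen)
{
    size_t len;

    if (restore_stub(fname))
        return true;

    len = strlen(fname);
    /* sizeof(".partial") counts the terminating NUL byte as well. */
    if (len + sizeof(".partial") > buflen)
        return false;           /* would not fit; leave the name untouched */

    memcpy(fname + len, ".partial", sizeof(".partial"));
    return restore_stub(fname);
}
```

One side effect to keep in mind: after a fallback the buffer holds the ".partial" name, so any later message that prints it reports the file that was actually tried last.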

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)


