Re: Incorrect logic in XLogNeedsFlush() - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Incorrect logic in XLogNeedsFlush() |
Date | |
Msg-id | CAAKRu_YrDdUfJms6yzVZV+vaQ3yqgSCOWDcMSSDr6_+TG5wCqQ@mail.gmail.com Whole thread Raw |
In response to | Re: Incorrect logic in XLogNeedsFlush() (Michael Paquier <michael@paquier.xyz>) |
List | pgsql-hackers |
On Thu, Sep 25, 2025 at 3:53 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Sep 24, 2025 at 05:33:08PM -0400, Melanie Plageman wrote: > > > And, if I'm not mistaken, there was another idea mentioned in [1] > > which was to move to using GetRecoveryState() in XLogNeedsFlush() and > > UpdateMinRecoveryPoint instead of relying on the variable InRecovery + > > XLogRecPtrIsInvalid(LocalMinRecoveryPoint) to distinguish crash > > recovery. > > The removal of InRecovery means that we would disable a bit more > aggressively updateMinRecoveryPoint for other processes than the > startup process as InRecovery is only set in the startup process, > which should be OK. Some comments of XLogNeedsFlush(), like the > comment block on top of the new GetRecoveryState() call, would need a > refresh. One thing I still don't understand is how catching up to min_recovery_point works when it is not in pg_wal. When the standby has to stream from the primary or fetch records from the archive to reach min_recovery point -- like if the value comes from a backup label -- then it seems like it is wrong to updateMinRecoveryPoint to true in SwitchIntoArchiveRecovery(). But ReadRecord() says "we have now replayed all the valid WAL in pg_wal, so we are presumably now consistent" and then it calls SwitchIntoArchiveRecovery(), which sets updateMinRecoveryPoint to true. > XLogNeedsFlush() has a bit further down your change a path where > updateMinRecoveryPoint is set to false for other processes than the > startup process. A shortcut based on GetRecoveryState() should make > its removal possible, simplifying a bit more the code. Going back to my earlier question about why processes other than the startup process are okay to read the ControlFile->minRecoveryPoint during crash recovery, Dilip said (and you later confirmed) WRT bgwriter: > Until the startup process hasn't completed the crash recovery i.g. not > applied the WAL unto 'ControlFile->minRecoveryPoint' logically > 'bgwriter' can not write any buffer whose WAL location is beyond the > ControlFile->minRecoveryPoint because we haven't yet applied those > WALs and also there should not any RestartPoint()/Checkpoint record > before reaching the ControlFile->minRecoveryPoint otherwise we would > have started crash recovery from that point. > > During the crash recovery process, before the WAL has been fully > applied up to the ControlFile->minRecoveryPoint, the bgwriter cannot > write any buffers that have a page LSN beyond this min recovery point. > This is because those WALs which are beyond min recovery point have > not yet been applied, so any buffer can not have page_lsn beyond this > point. But then why do we bother with doing the below in UpdateMinRecoveryPoint() and XLogNeedsFlush(): /* * Check minRecoveryPoint for any other process than the startup * process doing crash recovery, which should not update the control * file value if crash recovery is still running. */ if (XLogRecPtrIsInvalid(LocalMinRecoveryPoint)) updateMinRecoveryPoint = false; after reading from the ControlFile. All processes read the ControlFile on startup, so bgwriter would have something valid (if irrelevant) in LocalMinRecoveryPoint after reading the control file above. So, when would this code actually execute? For it to be okay to replace this: if (XLogRecPtrIsInvalid(LocalMinRecoveryPoint) && InRecovery) { updateMinRecoveryPoint = false; return false; } LocalMinRecoveryPoint = ControlFile->minRecoveryPoint; LocalMinRecoveryPointTLI = ControlFile->minRecoveryPointTLI; if (XLogRecPtrIsInvalid(LocalMinRecoveryPoint)) updateMinRecoveryPoint = false; with if (GetRecoveryState() == RECOVERY_STATE_CRASH) { updateMinRecoveryPoint = false; return false; } LocalMinRecoveryPoint = ControlFile->minRecoveryPoint; LocalMinRecoveryPointTLI = ControlFile->minRecoveryPointTLI; It seems like the second check of LocalMinRecoveryPoint after reading from the control file would never have been true. - Melanie
pgsql-hackers by date: