Re: Incorrect logic in XLogNeedsFlush() - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Incorrect logic in XLogNeedsFlush() |
Date | |
Msg-id | CAFiTN-tK9n4J=oGsRvYjVydNDarSa=5AgjOghjdhBqGSfiR3VQ@mail.gmail.com Whole thread Raw |
In response to | Re: Incorrect logic in XLogNeedsFlush() (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: Incorrect logic in XLogNeedsFlush()
|
List | pgsql-hackers |
On Fri, Sep 12, 2025 at 10:02 PM Jeff Davis <pgsql@j-davis.com> wrote: > > On Fri, 2025-09-12 at 14:04 +0900, Michael Paquier wrote: > > On Fri, Sep 12, 2025 at 10:21:27AM +0530, Dilip Kumar wrote: > > > Yeah, asserting it at the end makes sense, as we can ensure that > > > XLogFlush() and XLogNeedsFlush() agree on the same thing without > > > adding additional overhead. Here is the revised one. > > > > Melanie and Jeff, what do you think? > > IIUC: during the end-of-recovery checkpoint, a hypothetical call to > XLogNeedsFlush() would have incorrectly used the LocalMinRecoveryPoint; > and with the change, it would correctly use the flush pointer. > > That wasn't a live bug because XLogNeedsFlush() was never actually > called during the end-of-recovery checkpoint. CreateCheckPoint() > called XLogFlush() directly, which correctly checked the flush pointer. > But, hypothetically, if that code had logic based around > XLogNeedsFlush() before calling XLogFlush(), that would have been a > bug. > > So this fix has no behavior change, but it would make the code more > resilient against changes to CreateCheckPoint(), more consistent, and > less confusing. Is that right? That's right. > Assuming I'm right so far, then I agree with changing the test in > XLogNeedsFlush() to !XLogInsertAllowed(), as the patch does. > > The comment in the patch is describing what's happening, but lost the > "why". Perhaps something like: > > "During recovery, we don't flush WAL but update minRecoveryPoint > instead. So "needs flush" is taken to mean whether minRecoveryPoint > would need to be updated. The end-of-recovery checkpoint still must > flush WAL like normal, though, so check !XLogInsertAllowed(). This > check must be consistent with XLogFlush()." Updated as per suggestion > The commit message is vague about whether there's a live bug or not; I > suggest clarifying that the inconsistency between the two functions > could have been a bug but wasn't. Also, I generally use past tense for > the stuff that's being fixed and present tense for what the patch does. I tried to improve it in v2 > One loose end: there was some discussion about the times when > LocalMinRecoveryPoint is valid/correct. I'm not sure I entirely > understood -- is that no longer a concern? IMHO during crash recovery LocalMinRecoveryPoint and ControlFile->minRecoveryPoint can not initialized until we replay all WAL so those should never get accessed. However if XLogNeedsFlush() were called during the end of the recovery checkpoint it was accessing this which was wrong although it was not a live bug and this patch is making that more resilient. OTOH if we are creating a restartpoint during standby replay or archive recovery then we do access LocalMinRecoveryPoint and ControlFile->minRecoveryPoint because by that time we have already done the minimum recovery require for reaching to the consistent state and we would have initialized ControlFile->minRecoveryPoint. IMHO during crash recovery, the variables LocalMinRecoveryPoint and ControlFile->minRecoveryPoint remain uninitialized until all required WAL has been replayed. Consequently, they must not be accessed during this phase. Previously, calling XLogNeedsFlush() during the "end of recovery checkpoint" incorrectly accessed these uninitialized values. While this was not a live bug, this patch hardens the code against this type of race condition. OTOH, when creating a RestartPoint during standby replay or doing archive recovery, access to both LocalMinRecoveryPoint and ControlFile->minRecoveryPoint is correct. By this stage, it has completed the minimum recovery required to achieve a consistent state, and ControlFile->minRecoveryPoint would have been properly initialized. -- Regards, Dilip Kumar Google
Attachment
pgsql-hackers by date: