Re: Hot standby, recovery infra - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Hot standby, recovery infra |
Date | |
Msg-id | 49815C18.5040609@enterprisedb.com Whole thread Raw |
In response to | Re: Hot standby, recovery infra (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: Hot standby, recovery infra
|
List | pgsql-hackers |
Simon Riggs wrote: > On Thu, 2009-01-29 at 12:18 +0900, Fujii Masao wrote: >> Though this is a matter of taste, I think that it's weird that bgwriter >> runs with ThisTimeLineID = 0 during recovery. This is because >> XLogCtl->ThisTimeLineID is set at the end of recovery. ISTM this will >> be a cause of bug in the near future, though this is harmless currently. > > It doesn't. That's exactly what InitXLogAccess() was for. It does *during recovery*, before InitXLogAccess is called. Yeah, it's harmless currently. It would be pretty hard to keep it up-to-date in bgwriter and other processes. I think it's better to keep it at 0, which is clearly an invalid value, than try to keep it up-to-date and risk using an old value. TimeLineID is not used in a lot of places, currently. It might be a good idea to add some "Assert(TimeLineID != 0)" to places where it used. >>> + /* >>> + * XXX: Should we try to perform restartpoints if we're not in consistent >>> + * recovery? The bgwriter isn't doing it for us in that case. >>> + */ >> I think yes. This is helpful for the system which it takes a long time to get >> a base backup, ie. it also takes a long time to reach a consistent recovery >> point. > > The original patch did this. Yeah, I took it out. ISTM if you restore from a base backup, you'd want to run it until consistent recovery anyway. We can put it back in if people feel it's helpful. There's two ways to do it: we can fire up the bgwriter before reaching consistent recovery point, or we can do the restartpoints in startup process before that point. I'm inclined to fire up the bgwriter earlier. That way, bgwriter remains responsible for all checkpoints and restartpoints, which seems clearer. I can't see any particular reason why bgwriter startup and reaching the consistent recovery point need to be linked together. >>> +CreateRestartPoint(int flags) >> <snip> >>> + * We rely on this lock to ensure that the startup process doesn't exit >>> + * Recovery while we are half way through a restartpoint. XXX ? >>> */ >>> + LWLockAcquire(CheckpointLock, LW_EXCLUSIVE); >> Is this comment correct? CheckpointLock cannot prevent the startup process >> from exiting recovery because the startup process doesn't acquire that lock. > > The original patch acquired CheckpointLock during exitRecovery to prove > that a restartpoint was not in progress. It no longer does this, so not > sure if Heikki has found another way and the comment is wrong, or that > removing the lock request is a bug. The comment is obsolete. There's no reason for startup process to wait for a restartpoint to finish. Looking back at the archives and the history of that, I got the impression that in a very early version of the patch, startup process still created a shutdown checkpoint after recovery. To do that, it had to interrupt an ongoing restartpoint. That was deemed too dangerous/weird, so it was changed to wait for it to finish instead. But since the startup process no longer creates a shutdown checkpoint, the waiting became obsolete, right? If there's a restartpoint in progress when we reach the end of recovery, startup process will RequestCheckpoint as usual. That will cause bgwriter to hurry the on-going restartpoint, skipping the usual delays, and start the checkpoint as soon as the restartpoint is finished. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
pgsql-hackers by date: