Re: Avoiding adjacent checkpoint records - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Avoiding adjacent checkpoint records |
Date | |
Msg-id | CA+TgmoZbUQrnpQjixDe55qnECovQnzhZCiZ=n0VvLHcGPsuwtQ@mail.gmail.com |
In response to | Re: Avoiding adjacent checkpoint records (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Avoiding adjacent checkpoint records |
List | pgsql-hackers |
On Wed, Jun 6, 2012 at 6:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Actually, it looks like there is an extremely simple way to handle this,
>> which is to move the call of LogStandbySnapshot (which generates the WAL
>> record in question) to before the checkpoint's REDO pointer is set, but
>> after we have decided that we need a checkpoint.
>
> On further contemplation, there is a downside to that idea, which
> probably explains why the code was written as it was: if we place the
> XLOG_RUNNING_XACTS WAL record emitted during a checkpoint before rather
> than after the checkpoint's REDO point, then a hot standby slave
> starting up from that checkpoint won't process the XLOG_RUNNING_XACTS
> record. That means its KnownAssignedXids machinery won't be fully
> operational until the master starts another checkpoint, which might be
> awhile. So this could result in undesirable delay in hot standby mode
> becoming active.
>
> I am not sure how significant this really is though. Comments?

I suspect that's pretty significant.

> If we don't like that, I can think of a couple of other ways to get there,
> but they have their own downsides:
>
> * Instead of trying to detect after-the-fact whether any concurrent
> WAL activity happened during the last checkpoint, we could detect it
> during the checkpoint and then keep the info in a static variable in
> the checkpointer process until next time. However, I don't see any
> bulletproof way to do this without adding at least one or two lines
> of code within XLogInsert, which I'm sure Robert will complain about.

My main concern here is to avoid doing anything that will make things
harder for Heikki's WAL insert scaling patch, which I'm hoping will
get done for 9.3.

What do you have in mind, exactly?

I feel like ProcLastRecPtr might be enough information. After logging
running xacts, we can check whether ProcLastRecPtr is equal to the
redo pointer. If so, then nothing got written to WAL between the time
we began the checkpoint and the time we wrote that record. If,
through further, similar gyrations, we can then work out whether the
checkpoint record immediately follows the running-xacts record, we're
there. That's pretty ugly, I guess, but it seems possible.

Alternatively, we could keep a flag in XLogCtl->Insert indicating
whether anything that requires a new checkpoint has happened since the
last checkpoint. This could be set inside the existing block that
tests whether RedoRecPtr is out of date, so any given backend would
only do it once per checkpoint cycle. We'd have a hard-coded special
case that would skip setting the flag for a running-xacts record. I
kind of hate to shove even that much extra code in there, but as long
as it doesn't mess up what Heikki has in mind maybe it'd be OK...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
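For illustration only, the flag idea from the last paragraph can be sketched as a standalone toy model. This is not PostgreSQL code: ToyInsertState, ToyBackend, checkpointNeeded, toy_xlog_insert, toy_checkpoint and flag_writes are all hypothetical names, locking is left out, and the choice to leave the backend's cached pointer stale after a running-xacts record is an assumption of the sketch, made so that the backend's next ordinary record still sets the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Hypothetical stand-in for the shared insert state (not the real XLogCtl->Insert). */
typedef struct ToyInsertState {
    XLogRecPtr RedoRecPtr;       /* redo pointer of the most recent checkpoint */
    bool       checkpointNeeded; /* hypothetical flag: real WAL activity since then */
} ToyInsertState;

/* Hypothetical per-backend state: a cached copy of the shared redo pointer. */
typedef struct ToyBackend {
    XLogRecPtr cachedRedoRecPtr;
} ToyBackend;

static ToyInsertState shared = { 100, false };
static int flag_writes = 0;      /* counts how often the shared flag is written */

/* Toy model of inserting one WAL record. */
static void toy_xlog_insert(ToyBackend *backend, bool is_running_xacts)
{
    assert(backend != NULL);

    /* Models the existing "is my RedoRecPtr out of date?" test. */
    if (backend->cachedRedoRecPtr != shared.RedoRecPtr) {
        /*
         * Special case: a running-xacts record does not count as activity.
         * Assumption of this sketch: the cache stays stale here, so a later
         * ordinary record from the same backend still sets the flag.
         */
        if (is_running_xacts)
            return;

        backend->cachedRedoRecPtr = shared.RedoRecPtr;
        shared.checkpointNeeded = true; /* at most once per backend per cycle */
        flag_writes++;
    }
}

/* Toy checkpoint: returns false (skipped) if nothing relevant happened. */
static bool toy_checkpoint(XLogRecPtr new_redo)
{
    if (!shared.checkpointNeeded)
        return false;

    shared.checkpointNeeded = false;
    shared.RedoRecPtr = new_redo;   /* makes every backend's cached copy stale */
    return true;
}
```

In this model, a cycle containing only a running-xacts record makes toy_checkpoint return false, while any number of ordinary records from one backend writes the shared flag only once per cycle.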