Re: Avoiding adjacent checkpoint records - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Avoiding adjacent checkpoint records
Date	June 7, 2012 11:04:36
Msg-id	CA+TgmoZbUQrnpQjixDe55qnECovQnzhZCiZ=n0VvLHcGPsuwtQ@mail.gmail.com
In response to	Re: Avoiding adjacent checkpoint records (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Avoiding adjacent checkpoint records
List	pgsql-hackers

On Wed, Jun 6, 2012 at 6:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Actually, it looks like there is an extremely simple way to handle this,
>> which is to move the call of LogStandbySnapshot (which generates the WAL
>> record in question) to before the checkpoint's REDO pointer is set, but
>> after we have decided that we need a checkpoint.
>
> On further contemplation, there is a downside to that idea, which
> probably explains why the code was written as it was: if we place the
> XLOG_RUNNING_XACTS WAL record emitted during a checkpoint before rather
> than after the checkpoint's REDO point, then a hot standby slave
> starting up from that checkpoint won't process the XLOG_RUNNING_XACTS
> record.  That means its KnownAssignedXids machinery won't be fully
> operational until the master starts another checkpoint, which might be
> a while.  So this could result in undesirable delay in hot standby mode
> becoming active.
>
> I am not sure how significant this really is though.  Comments?

I suspect that's pretty significant.

> If we don't like that, I can think of a couple of other ways to get there,
> but they have their own downsides:
>
> * Instead of trying to detect after-the-fact whether any concurrent
> WAL activity happened during the last checkpoint, we could detect it
> during the checkpoint and then keep the info in a static variable in
> the checkpointer process until next time.  However, I don't see any
> bulletproof way to do this without adding at least one or two lines
> of code within XLogInsert, which I'm sure Robert will complain about.

My main concern here is to avoid doing anything that will make things
harder for Heikki's WAL insert scaling patch, which I'm hoping will
get done for 9.3.

What do you have in mind, exactly?  I feel like ProcLastRecPtr might
be enough information.  After logging running xacts, we can check
whether ProcLastRecPtr is equal to the redo pointer.  If so, then
nothing got written to WAL between the time we began the checkpoint
and the time we wrote that record.  If, through further, similar
gyrations, we can then work out whether the checkpoint record
immediately follows the running-xacts record, we're there.  That's
pretty ugly, I guess, but it seems possible.
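
To make the ProcLastRecPtr idea concrete, here is a minimal toy model
in C.  It is only a sketch under stated assumptions, not xlog.c code:
ToyWal, ToyRecord, and the toy_* functions are all hypothetical names,
WAL positions are plain byte offsets, the record lengths are arbitrary,
and page-header adjustments to the redo pointer are ignored.  The start
of each inserted record plays the role that ProcLastRecPtr plays for
the inserting process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy WAL position (a byte offset); a stand-in for XLogRecPtr. */
typedef uint64_t WalPos;

/* Toy WAL: only tracks where the next record would start. */
typedef struct ToyWal
{
    WalPos      insert_pos;
} ToyWal;

/* Start and end of one inserted record. */
typedef struct ToyRecord
{
    WalPos      start;          /* plays the role of ProcLastRecPtr */
    WalPos      end;            /* first byte after the record */
} ToyRecord;

/* Append a record of len bytes and report where it landed. */
static ToyRecord
toy_wal_insert(ToyWal *wal, uint32_t len)
{
    ToyRecord   rec;

    assert(len > 0);
    rec.start = wal->insert_pos;
    wal->insert_pos += len;
    rec.end = wal->insert_pos;
    return rec;
}

/*
 * True only if nothing was inserted between the redo point and the
 * running-xacts record, and the checkpoint record starts exactly where
 * the running-xacts record ends.
 */
static bool
toy_checkpoint_was_idle(WalPos redo, ToyRecord running_xacts,
                        ToyRecord checkpoint)
{
    bool        quiet_before = (running_xacts.start == redo);
    bool        quiet_between = (checkpoint.start == running_xacts.end);

    return quiet_before && quiet_between;
}

/*
 * Simulate one checkpoint.  A nonzero other_before or other_between
 * injects a record from "some other backend" at that point.
 */
static bool
toy_run_checkpoint(ToyWal *wal, uint32_t other_before,
                   uint32_t other_between)
{
    WalPos      redo = wal->insert_pos;     /* redo point = current insert position */
    ToyRecord   running_xacts;
    ToyRecord   checkpoint;

    if (other_before > 0)
        (void) toy_wal_insert(wal, other_before);
    running_xacts = toy_wal_insert(wal, 32);    /* arbitrary length */
    if (other_between > 0)
        (void) toy_wal_insert(wal, other_between);
    checkpoint = toy_wal_insert(wal, 64);       /* arbitrary length */

    return toy_checkpoint_was_idle(redo, running_xacts, checkpoint);
}
```

With no foreign inserts, toy_run_checkpoint(&wal, 0, 0) returns true;
passing a nonzero length for either of the other two arguments makes it
return false, which is the "something else got written" case.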

Alternatively, we could keep a flag in XLogCtl->Insert indicating
whether anything that requires a new checkpoint has happened since the
last checkpoint.  This could be set inside the existing block that
tests whether RedoRecPtr is out of date, so any given backend would
only do it once per checkpoint cycle.  We'd have a hard-coded special
case that would skip setting the flag for a running-xacts record.  I
kind of hate to shove even that much extra code in there, but as long
as it doesn't mess up what Heikki has in mind maybe it'd be OK...
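
The flag approach could look roughly like the following toy model in C.
Again, this is a sketch under assumptions, not the real XLogInsert:
ToyInsertCtl, ToyBackend, the checkpoint_needed field, and the toy_*
functions are hypothetical, and locking is omitted entirely.  It only
illustrates setting the flag inside the "is my cached redo pointer
stale?" branch, with the special case for running-xacts records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum ToyRecKind
{
    TOY_REC_ORDINARY,
    TOY_REC_RUNNING_XACTS
} ToyRecKind;

/* Shared state; a stand-in for what lives in XLogCtl->Insert. */
typedef struct ToyInsertCtl
{
    uint64_t    redo_rec_ptr;       /* redo pointer of the current cycle */
    bool        checkpoint_needed;  /* hypothetical flag */
} ToyInsertCtl;

/* Per-backend state. */
typedef struct ToyBackend
{
    uint64_t    cached_redo_rec_ptr;    /* local copy, may be stale */
    unsigned    flag_writes;            /* demo counter: flag stores done */
} ToyBackend;

/*
 * Model of the insert path.  The flag is touched only inside the
 * stale-redo-pointer branch, so a backend sets it at most once per
 * checkpoint cycle.  Caveat of this toy: a backend whose first record
 * in a cycle is a running-xacts record refreshes its cache without
 * setting the flag, so it assumes such records come only from a
 * process that writes nothing else.
 */
static void
toy_insert(ToyInsertCtl *ctl, ToyBackend *be, ToyRecKind kind)
{
    if (be->cached_redo_rec_ptr != ctl->redo_rec_ptr)
    {
        be->cached_redo_rec_ptr = ctl->redo_rec_ptr;

        /* Hard-coded special case: running-xacts does not count. */
        if (kind != TOY_REC_RUNNING_XACTS)
        {
            ctl->checkpoint_needed = true;
            be->flag_writes++;
        }
    }
}

/* Start a new checkpoint cycle: advance the redo pointer, clear the flag. */
static void
toy_begin_checkpoint(ToyInsertCtl *ctl, uint64_t new_redo)
{
    assert(new_redo > ctl->redo_rec_ptr);
    ctl->redo_rec_ptr = new_redo;
    ctl->checkpoint_needed = false;
}
```

The checkpointer would consult checkpoint_needed before deciding
whether a new checkpoint is worth writing.  The cost on the insert path
in this model is one extra comparison and store, taken only on a
backend's first insert after the redo pointer moves.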

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
