Home > mailing lists

Re: Avoiding unnecessary reads in recovery - Mailing list pgsql-hackers

From	Jim Nasby
Subject	Re: Avoiding unnecessary reads in recovery
Date	April 26, 2007 04:51:15
Msg-id	6B1B90A7-5BF5-42C4-8238-7F774701BB6C@decibel.org Whole thread Raw
In response to	Avoiding unnecessary reads in recovery (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses	Re: Avoiding unnecessary reads in recovery Re: Avoiding unnecessary reads in recovery
List	pgsql-hackers

Tree view

On Apr 25, 2007, at 2:48 PM, Heikki Linnakangas wrote:
> In recovery, with full_pages_writes=on, we read in each page only  
> to overwrite the contents with a full page image. That's a waste of  
> time, and can have a surprisingly large effect on recovery time.
>
> As a quick test on my laptop, I initialized a DBT-2 test with 5  
> warehouses, and let it run for 2 minutes without think-times to  
> generate some WAL. Then I did a "kill -9 postmaster", and took a  
> copy of the data directory to use for testing recovery.
>
> With CVS HEAD, the recovery took ~ 2 minutes. With the attached  
> patch, it took 5 seconds. (yes, I used the same not-yet-recovered  
> data directory in both tests, and cleared the os cache with "echo 1  
> > /proc/sys/vm/drop_caches").
>
> I was surprised how big a difference it makes, but when you think  
> about it it's logical. Without the patch, it's doing roughly the  
> same I/O as the test itself, reading in pages, modifying them, and  
> writing them back. With the patch, all the reads are done  
> sequentially from the WAL, and then written back in a batch at the  
> end of the WAL replay which is a lot more efficient.
>
> It's interesting that (with the patch) full_page_writes can  
> *shorten* your recovery time. I've always thought it to have a  
> purely negative effect on performance.
>
> I'll leave it up to the jury if this tiny little change is  
> appropriate after feature freeze...
>
> While working on this, this comment in ReadBuffer caught my eye:
>
>>     /*
>>      * During WAL recovery, the first access to any data page should
>>      * overwrite the whole page from the WAL; so a clobbered page
>>      * header is not reason to fail.  Hence, when InRecovery we may
>>      * always act as though zero_damaged_pages is ON.
>>      */
>>     if (zero_damaged_pages || InRecovery)
>>     {
>
> But that assumption only holds if full_page_writes is enabled,  
> right? I changed that in the attached patch as well, but if it  
> isn't accepted that part of it should still be applied, I think.

So what happens if a backend is running with full_page_writes = off,  
someone edits postgresql.conf to turns it on and forgets to reload/ 
restart, and then we crash? You'll come up in recovery mode thinking  
that f_p_w was turned on, when in fact it wasn't.

ISTM that we need to somehow log what the status of full_page_writes  
is, if it's going to affect how recovery works.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

pgsql-hackers by date:

From: Jim Nasby
Date: 26 April 2007, 04:47:14
Subject: Re: Improving deadlock error messages

From: Heikki Linnakangas
Date: 26 April 2007, 05:29:44
Subject: Re: Avoiding unnecessary reads in recovery

Re: Avoiding unnecessary reads in recovery - Mailing list pgsql-hackers

Previous

Next