Re: Clock sweep not caching enough B-Tree leaf pages? - Mailing list pgsql-hackers

From	Merlin Moncure
Subject	Re: Clock sweep not caching enough B-Tree leaf pages?
Date	April 17, 2014 18:53:58
Msg-id	CAHyXU0ybRoSfCVL1rUeArAM5+n_j6BQyR2yjs+19+7qm0bnLLg@mail.gmail.com
In response to	Re: Clock sweep not caching enough B-Tree leaf pages? (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: Clock sweep not caching enough B-Tree leaf pages? Re: Clock sweep not caching enough B-Tree leaf pages?
List	pgsql-hackers

On Thu, Apr 17, 2014 at 1:48 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-04-17 21:44:47 +0300, Heikki Linnakangas wrote:
>> On 04/17/2014 09:38 PM, Stephen Frost wrote:
>> >* Greg Stark (stark@mit.edu) wrote:
>> >>On Thu, Apr 17, 2014 at 12:21 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> >>>Ehhh.  No.  If it's a hot page that we've been holding in *our* cache
>> >>>long enough, the kernel will happily evict it as 'cold' from *its*
>> >>>cache, leading to...
>> >>
>> >>This is a whole nother problem.
>> >>
>> >>It is worrisome that we could be benchmarking the page replacement
>> >>algorithm in Postgres and choose a page replacement algorithm that
>> >>performs well because it tends to evict pages that
>> >>are in the OS cache. And then one day (hopefully not too far off)
>> >>we'll fix the double buffering problem and end up with a strange
>> >>choice of page replacement algorithm.
>> >
>> >That's certainly possible but I don't see the double buffering problem
>> >going away any time particularly soon and, even if it does, it's likely
>> >to either a) mean we're just using the kernel's cache (eg: something w/
>> >mmap, etc), or b) will involve so many other changes that this will end
>> >up getting changed anyway.  In any case, while I think we should
>> >document any such cache management system we employ as having this risk,
>> >I don't think we should worry about it terribly much.
>>
>> Note that if we somehow come up with a page replacement algorithm that tends
>> to evict pages that are in the OS cache, we have effectively solved the
>> double buffering problem. When a page is cached in both caches, evicting it
>> from one of them eliminates the double buffering. Granted, you might prefer
>> to evict it from the OS cache instead, and such an algorithm could be bad in
>> other ways. But if a page replacement algorithm happens to avoid double
>> buffering, that's a genuine merit for that algorithm.
>
> I don't think it's a good idea to try to synchronize algorithms with the
> OSs. There's so much change about the caching logic in e.g. linux that
> it won't stay effective for very long.

No, but if you were very judicious, maybe you could hint the O/S
(posix_fadvise) that it doesn't need to keep its own copies of pages
that are likely to stay hot in shared_buffers.
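
For illustration, a minimal sketch of what such a hint could look like,
written in Python rather than in backend C. The helper names
hint_block_not_needed and demo are hypothetical, and nothing here
reflects what PostgreSQL actually does. It assumes an 8 kB block size
and a platform that provides posix_fadvise. POSIX_FADV_DONTNEED is
advisory only, and the kernel will not drop dirty pages, hence the
fsync before the hint.

```python
import os
import tempfile

BLCKSZ = 8192  # assumed block size (PostgreSQL's default)


def hint_block_not_needed(fd, block_no, block_size=BLCKSZ):
    """Advise the kernel that one block of fd need not stay in its page cache.

    Returns True if the hint was issued, False if this platform has no
    posix_fadvise. The kernel is free to ignore the advice.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    os.posix_fadvise(fd, block_no * block_size, block_size,
                     os.POSIX_FADV_DONTNEED)
    return True


def demo():
    """Write four blocks to a scratch file, hint block 2, then read it back."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * BLCKSZ * 4)
        f.flush()
        os.fsync(f.fileno())  # dirty pages cannot be dropped, so flush first
        hinted = hint_block_not_needed(f.fileno(), 2)
        f.seek(2 * BLCKSZ)
        data = f.read(BLCKSZ)  # the data is still readable after the hint
        return hinted, data
```

The hint only affects the kernel's copy. The block stays readable, and
a later read simply goes back to storage if the kernel really did drop
it.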

I doubt that's necessary though -- if the postgres caching algorithm
improves such that there is a better tendency for hot pages to stay in
s_b, eventually the O/S will evict its copy of the page in favor of
something else that needs the space.  In other words, otherwise
preventable double buffering is really a measure of bad eviction
policy, because it manifests as volatility of frequently accessed pages.
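
A toy model can make that argument concrete. The sketch below is an
assumption-laden illustration, not PostgreSQL's clock sweep and not any
real kernel's algorithm: shared buffers use either FIFO (a stand-in for
a policy that lets hot pages churn) or LRU (a stand-in for one that
keeps them), and the O/S cache is a plain LRU that only sees reads that
miss shared buffers. The function name simulate, the cache sizes, and
the workload are all made up for the example.

```python
from collections import OrderedDict


def simulate(policy, steps=1000, sb_size=8, os_size=8, n_hot=4):
    """Return (shared_buffer_hits, hot_pages_cached_twice) for a toy model.

    policy is "fifo" or "lru" and applies to the shared-buffer tier only.
    """
    sb = OrderedDict()   # stand-in for shared_buffers, oldest entry first
    osc = OrderedDict()  # stand-in for the kernel page cache (plain LRU)
    hits = 0

    def read(page):
        nonlocal hits
        if page in sb:
            hits += 1
            if policy == "lru":
                sb.move_to_end(page)  # LRU refreshes the page on every hit
            return  # served from shared buffers; the kernel never sees it
        # Shared-buffer miss: the read goes through the kernel cache.
        if page in osc:
            osc.move_to_end(page)
        else:
            if len(osc) >= os_size:
                osc.popitem(last=False)  # kernel evicts its coldest page
            osc[page] = True
        if len(sb) >= sb_size:
            sb.popitem(last=False)  # evict the oldest shared-buffer entry
        sb[page] = True

    # Workload: a small hot set interleaved with a never-repeating cold scan.
    for i in range(steps):
        read(("hot", i % n_hot))
        read(("cold", i))

    hot_twice = sum(1 for p in sb if p[0] == "hot" and p in osc)
    return hits, hot_twice


fifo_hits, fifo_hot_twice = simulate("fifo")
lru_hits, lru_hot_twice = simulate("lru")
```

In this model the LRU tier keeps the hot pages resident, so the kernel
stops seeing reads for them and ages its copies out. The FIFO tier keeps
evicting and re-reading hot pages, which refreshes the kernel's copies
and leaves hot pages cached in both tiers.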

merlin
