Re: [HACKERS] Clock with Adaptive Replacement - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: [HACKERS] Clock with Adaptive Replacement
Msg-id: CA+Tgmoaw-3_3T1sVoiDKCL-zop9j2kqeYChV=hfszkoU1+EenA@mail.gmail.com
In response to: Re: [HACKERS] Clock with Adaptive Replacement (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Wed, Apr 25, 2018 at 6:54 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> Before some of the big shared_buffers bottlenecks were alleviated
> several years ago, it was possible to observe shared_buffers evictions
> occurring essentially at random. I have no idea if that's still true,
> but it could be.

I think it is. We haven't done anything to address it. I think if we want to move to direct I/O -- which may be something we need or want to do -- we're going to have to work a lot harder at making good page eviction decisions. Your patch to change the page eviction algorithm didn't help noticeably once we eliminated the contention around buffer eviction, but that's just because the cost of a bad eviction went down, not because we stopped doing bad evictions. I think it would be interesting to insert a usleep() call into mdread() and then test various buffer eviction algorithms with that in place.

I'm personally not very excited about making rules like "index pages are more valuable than heap pages". Such rules will in many cases be true, but it's easy to come up with cases where they don't hold: for example, we might run pgbench for a while and then stop running pgbench and start running big sequential scans for reporting purposes. We don't want to artificially pin the index buffers in shared_buffers just because they're index pages; we want to figure out which pages really matter. Now, I realize that you weren't proposing (and wouldn't propose) a rule that index pages never get evicted, but I think that favoring index pages even in some milder way is basically a hack. Index pages aren't *intrinsically* more valuable; they are more valuable because they will, in many workloads, be accessed more often than heap pages. A good algorithm ought to be able to figure that out based on the access pattern, without being explicitly given a hint, I think.
I believe the root of the problem here is that the usage count we have today does a very poor job distinguishing what's hot from what's not. There have been previous experiments around making usage_count use some kind of a log scale: we make the maximum, say, 32, and the clock hand divides by 2 instead of subtracting 1. I don't think those experiments were enormously successful and I suspect that a big part of the reason is that it's still pretty easy to get to a state where the counters are maxed out for a large number of buffers, and at that point you can't distinguish between those buffers any more: they all look equally hot. We need something better. If a system like this is working properly, things like interior index pages and visibility map pages ought to show up as fiery hot on workloads where the index or visibility map, as the case may be, is heavily used.

A related problem is that user-connected backends end up doing a lot of buffer eviction themselves on many workloads. Maybe the bgreclaimer patch Amit wrote a few years ago could help somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company