
From Andres Freund
Subject Re: Should we update the random_page_cost default value?
Msg-id gyc3hfrqqqwzczmbla3rcnp3hzjzrmejrwery5d3loj3wxjnyw@higmr5wkslfd
In response to Re: Should we update the random_page_cost default value?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2025-10-08 13:23:33 -0400, Robert Haas wrote:
> On Wed, Oct 8, 2025 at 12:24 PM Tomas Vondra <tomas@vondra.me> wrote:
> > Isn't this somewhat what effective_cache_size was meant to do? That
> > obviously does not know about what fraction of individual tables is
> > cached, but it does impose size limit.
> 
> Not really, because effective_cache_size only models the fact that
> when you iterate the same index scan within the execution of a single
> query, it will probably hit some pages more than once.

That's indeed today's use, but I wonder whether we ought to expand that. One
of the annoying things about the *_page_cost values effectively needing to be
set "too low" to account for caching effects is that it completely breaks down
for larger relations. Which has unwelcome effects like making a
larger-than-memory sequential scan seem like a reasonable plan.

It's a generally reasonable assumption that a scan processing a smaller amount
of data than effective_cache_size is more likely to be cached than a scan that
is processing much more data than effective_cache_size. In the latter case,
assuming an accurate effective_cache_size, we *know* that a good portion of
the data cannot be cached. E.g. with effective_cache_size = 4GB, a scan
touching 40GB of data can have at most ~10% of its pages cached.

Which leads me to wonder if we ought to interpolate between a "cheaper" access
cost for data << effective_cache_size and the "more real" access costs as the
amount of data approaches (and exceeds) effective_cache_size.
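To make that a bit more concrete, here's a rough sketch of what such an
interpolation could look like. This is purely illustrative C, not tied to the
actual costsize.c code; the function name, the "cached" page cost of 1.0 and
the linear blend are all assumptions, not a concrete proposal:

#include <stdio.h>

/*
 * Illustrative only: blend a "cached" page cost and random_page_cost based
 * on how much of the scanned data could fit in effective_cache_size.
 */
static double
interpolated_page_cost(double pages_to_scan,
                       double effective_cache_size_pages,
                       double cached_page_cost,   /* cost if fully cached, e.g. ~1.0 */
                       double random_page_cost)   /* cost if mostly uncached, e.g. 4.0 */
{
    double      cached_fraction;

    if (pages_to_scan <= effective_cache_size_pages)
        return cached_page_cost;    /* data <= effective_cache_size */

    /* upper bound on the fraction of the scan that can be cached */
    cached_fraction = effective_cache_size_pages / pages_to_scan;

    return cached_fraction * cached_page_cost +
        (1.0 - cached_fraction) * random_page_cost;
}

int
main(void)
{
    double      ecs = 524288.0; /* effective_cache_size = 4GB in 8kB pages */

    /* 1k-page scan gets the "cached" cost, a 32GB scan ends up near random_page_cost */
    printf("%.2f %.2f\n",
           interpolated_page_cost(1000.0, ecs, 1.0, 4.0),
           interpolated_page_cost(4194304.0, ecs, 1.0, 4.0));
    return 0;
}

A linear blend is obviously just the simplest possibility; the shape of the
curve would need actual thought.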

Greetings,

Andres Freund


