Re: Should we update the random_page_cost default value? - Mailing list pgsql-hackers
From:           Tomas Vondra
Subject:        Re: Should we update the random_page_cost default value?
Date:
Msg-id:         46307eb5-f691-4251-b71e-f33bcf5e5375@vondra.me
In response to: Re: Should we update the random_page_cost default value? (Robert Haas <robertmhaas@gmail.com>)
Responses:      Re: Should we update the random_page_cost default value?
List:           pgsql-hackers
On 10/8/25 17:20, Robert Haas wrote:
> On Tue, Oct 7, 2025 at 3:15 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
>> I've been doing this sort of thing for clients a long time, and I
>> always test both directions when I come across a query that should
>> be faster. For real-world queries, 99% of them have no change or
>> improve with a lowered rpc, and 99% get worse via a raised rpc. So
>> color me unconvinced.
>
> Color me equally unconvinced. Tomas seems quite convinced that we
> ought to be raising random_page_cost rather than lowering it, and
> that absolutely does not correspond to my experience in any way.
> It's not that I've actually tested raising the random page cost on
> very many systems, mind you. It's that my experience with real-world
> systems is that we tend to pick sequential scans when we should use
> an index, not the other way around. And obviously that problem will
> never be fixed by raising random_page_cost, since that will tend to
> favor sequential scans even more.
>
> Tomas's test involves scanning big tables that don't fit in RAM, and
> I agree that if that's most of what you do, you might benefit from a
> higher random_page_cost. However, even users who have some tables
> that are a lot bigger than RAM also tend to have frequently-accessed
> tables that are much smaller. For example, if you join a fact table
> to a bunch of dimension tables, the fact table may not fit in RAM,
> but the dimension tables probably do, and we're using the same
> random_page_cost for all of those tables. Moreover, we're least
> likely to make a wrong decision about what to do about the big
> table. It will often be the case that the query will be phrased such
> that a sequential scan on the big table is unavoidable, and if it is
> possible to avoid a sequential scan, we're going to want to do so
> almost regardless of random_page_cost, because the number of page
> accesses we can save will tend to be large. On a smaller table, it's
> more likely that both a full table scan and an index-based approach
> will be competitive, so there the value of random_page_cost will
> matter more to the final outcome. So, in this scenario, it's more
> important that the random_page_cost is close to accurate for the
> smaller tables rather than for the larger ones, and those are the
> ones that are most likely to benefit from caching.

I don't think there's all that much disagreement, actually. This is a
pretty good illustration that we're using random_page_cost to account
for things other than "I/O cost" (like the expected cache hit ratios),
because we simply don't have a better knob for that.
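FWIW it's easy to watch the knob tip the planner one way or the other.
A minimal sketch, assuming a hypothetical table "t" (the exact
crossover point will vary with hardware, stats and selectivity):

  -- any indexed table large enough that the seqscan/indexscan choice
  -- is not a foregone conclusion will do; rows are inserted in random
  -- order to keep the physical correlation on "id" low
  CREATE TABLE t (id int PRIMARY KEY, payload text);
  INSERT INTO t SELECT i, md5(i::text)
    FROM generate_series(1, 1000000) i ORDER BY random();
  ANALYZE t;

  SET random_page_cost = 4.0;   -- the default
  EXPLAIN SELECT * FROM t WHERE id BETWEEN 1 AND 150000;

  SET random_page_cost = 1.1;   -- common advice for SSDs / cached data
  EXPLAIN SELECT * FROM t WHERE id BETWEEN 1 AND 150000;

  -- the knob can at least be scoped, e.g. to a (hypothetical)
  -- tablespace on fast storage:
  -- ALTER TABLESPACE fast_ssd SET (random_page_cost = 1.1);

Somewhere between those two settings the plan typically flips from a
seq scan (or bitmap heap scan) to an index-based one, and which of the
two is actually faster depends mostly on how much of the table is
cached - which is exactly the part the knob wasn't meant to model.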
> One of the planner changes that I think would be worth exploring is
> to have the system try to estimate the percentage of a given table
> that is likely to be in cache, and that could be configured via a
> reloption or estimated based on the size of the table (or maybe even
> the frequency of access, though that is fraught, since it can change
> precipitously on short time scales and is thus not great to use for
> planning). If we estimated that small tables are likely to be cached
> and bigger ones are likely to be less-cached or, if very big,
> completely uncached, then it would probably be right to raise
> random_page_cost as well. But without that kind of a change, the
> correct value of random_page_cost is the one that takes into account
> both the possibility of caching and the possibility of nothing being
> cached, and this test is rotated all the way toward one end of that
> spectrum.

Isn't this somewhat what effective_cache_size was meant to do? That
obviously does not know about what fraction of individual tables is
cached, but it does impose a size limit.

I think in the past we mostly assumed we can't track cache size per
table, because we have no visibility into the page cache. But maybe
direct I/O would change this?
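For shared buffers we do have per-table visibility already, via the
pg_buffercache extension. A rough sketch, adapted from the query in
its docs (this only sees the PostgreSQL buffer cache, not the kernel
page cache - with direct I/O it would be the whole picture):

  CREATE EXTENSION IF NOT EXISTS pg_buffercache;

  -- per-relation residency in shared buffers
  SELECT c.relname,
         pg_size_pretty(count(*) * current_setting('block_size')::bigint)
           AS buffered,
         round(100.0 * count(*) * current_setting('block_size')::bigint
               / pg_relation_size(c.oid), 1) AS pct_cached
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
   WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                               WHERE datname = current_database()))
     AND pg_relation_size(c.oid) > 0
   GROUP BY c.relname, c.oid
   ORDER BY count(*) DESC
   LIMIT 10;

That could in principle feed a per-table cache-fraction estimate, but
it says nothing about what the kernel has cached on top of that.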
regards

--
Tomas Vondra