Re: Should we update the random_page_cost default value? - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Should we update the random_page_cost default value?
Date
Msg-id 92873bc5-0dd3-4ecb-bea1-c2a00c9535ee@vondra.me
In response to Re: Should we update the random_page_cost default value?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Should we update the random_page_cost default value?
List pgsql-hackers
On 10/9/25 14:17, Robert Haas wrote:
> On Wed, Oct 8, 2025 at 4:20 PM Tomas Vondra <tomas@vondra.me> wrote:
>> What I imagined is more like a process that regularly walks shared
>> buffers, counts buffers per relation (or relfilenode), stores the
>> aggregated info into some shared memory (so that a standby can have its
>> own concept of cache contents). And then invalidates plans the same way
>> ANALYZE does.
> 
> I'm not saying we couldn't do this, but I have doubts about how well
> it would really work. I have a feeling, for example, that it might
> sometimes cause the planner to go to great lengths to avoid bringing a
> small table fully into memory, based on the knowledge that the table
> is not cached. But the right thing to do could easily be to take the
> hit of doing an expensive sequential scan on first access, and then
> after that the table stays cached. I think it could very possibly be
> the case that such a strategy is faster even for the first execution
> and also more stable. In my experience, if you run pgbench -S on a
> system with a cold cache, it takes just about forever to warm the
> cache via random access; a single pg_prewarm can avoid a very long
> period of slowness.
> 

I agree that whatever we do should not penalize plans that pull small
tables into cache. But I don't see why it would have that effect. Isn't
this pretty much what effective_cache_size already does (for index
scans), where it estimates whether a read was already cached by the same
query? And it does not have this effect, right?

Also, isn't the "seqscan is faster with a cold cache, but with a warm
cache index scans win" case pretty much exactly the planning decision
this would be expected to help with?
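FWIW, the kind of per-relation aggregation I described earlier can
already be approximated by hand, and so can the one-pass warming Robert
mentions. A rough sketch using pg_prewarm and pg_buffercache, with a
hypothetical "accounts" table:

    CREATE EXTENSION IF NOT EXISTS pg_prewarm;
    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    -- load the (hypothetical) table into shared buffers in one sequential pass
    SELECT pg_prewarm('accounts');

    -- count how many buffers each relation currently holds in shared buffers
    SELECT c.relname, count(*) AS buffers
      FROM pg_buffercache b
      JOIN pg_class c
        ON b.relfilenode = pg_relation_filenode(c.oid)
       AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                  WHERE datname = current_database()))
     GROUP BY c.relname
     ORDER BY buffers DESC
     LIMIT 10;

The background process would essentially just keep such per-relation
counts up to date in shared memory, and invalidate plans when they
change enough.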

> I think Andres's idea of trying to account for things like backward
> scans being slower, or prefetching, or AIO, is probably a better
> direction. I don't know how well we can take those things into account
> or whether it will fully get us where we want to be, but we can model
> those events in ways that are stable: the cost is just always
> different than today, without relying on any new kinds of stats. I
> think that's probably good.
> 

I agree with this. But isn't this a mostly orthogonal problem? I mean,
shouldn't we do both (or try to)?

> In my experience, we get complaints about plans changing when people
> didn't want them to all the time, but we very rarely get complaints
> about plans not changing when people did want them to. So I think
> trying not to depend on things that can be unstable is probably a good
> direction.
> 

Perhaps. I certainly understand the reluctance to depend on inherently
unstable information. But would this really be so unstable? I'm not
sure. And even if the plan changes fairly often because the cache
contents change, but you get the best plan each time thanks to that, is
that really an issue?

Also, if users complain about A and not about B, does that mean B is not
an issue? In The Hitchhiker’s Guide to the Galaxy, there's a monster
called the "Ravenous Bugblatter Beast of Traal", described like this:

    ... a mind-bogglingly stupid animal, it assumes that if you can’t
    see it, it can’t see you.

Isn't ignoring "B" a bit like that? Of course, maybe B really is not
worth worrying about. Or maybe this is not the right approach to address
it, I don't know.


regards

-- 
Tomas Vondra
