Re: Should we update the random_page_cost default value? - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Should we update the random_page_cost default value?
Msg-id f99e48ca-e07e-4cb8-bd42-ef08a3ed1e68@vondra.me
In response to Re: Should we update the random_page_cost default value?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On 10/8/25 22:52, Andres Freund wrote:
> Hi,
> 
> On 2025-10-08 22:20:31 +0200, Tomas Vondra wrote:
>> On 10/8/25 21:37, Andres Freund wrote:
>>> On 2025-10-08 21:25:53 +0200, Tomas Vondra wrote:
>>>> On 10/8/25 19:23, Robert Haas wrote:
>>>>>> I think in the past we mostly assumed we can't track cache size per
>>>>>> table, because we have no visibility into page cache. But maybe direct
>>>>>> I/O would change this?
>>>>>
>>>>> I think it's probably going to work out really poorly to try to use
>>>>> cache contents for planning. The plan may easily last much longer than
>>>>> the cache contents.
>>>>>
>>>>
>>>> Why wouldn't that trigger invalidations / replanning just like other
>>>> types of stats? I imagine we'd regularly collect stats about what's
>>>> cached, etc. and we'd invalidate stale plans just like after ANALYZE.
>>>
>>> You can't just handle it like other such stats - the contents of
>>> shared_buffers can differ between primary and standby, and other stats that
>>> trigger replanning are all in system tables that can't differ between primary
>>> and hot standby instances.
>>>
>>> We IIRC don't currently use shared memory stats for planning and thus have no
>>> way to trigger invalidation for relevant changes. While it seems plausible to
>>> drive this via shared memory stats, the current cumulative counters aren't
>>> really suitable; we'd either need something that removes the influence of
>>> older hits/misses or a new field tracking the current number of buffers for a
>>> relation [fork].
>>>
>>
>> I don't think I mentioned pgstat (i.e. the shmem stats) anywhere, and I
>> mentioned ANALYZE which has nothing to do with pgstats either. So I'm a
>> bit confused why you argue we can't use pgstat.
> 
> I'm mentioning pgstats because we can't store stats like ANALYZE otherwise
> does, due to that being in catalog tables. Given that, why wouldn't we store
> the cache hit ratio in pgstats?
> 
> It'd be pretty weird to overload this into ANALYZE imo, given that this would
> be the only stat that we'd populate on standbys in ANALYZE. We'd also need to
> start running AV on standbys for it.
> 

I didn't say it should be done by ANALYZE (and indeed that would be
weird). I said the plans might be invalidated just like after ANALYZE.

And sure, it could be stored in pgstat. Or we'd invent something new, I
don't know; I can imagine both approaches. I kinda dislike how people do
pg_stat_reset() to reset runtime stats, not realizing it breaks
autovacuum. I'm not sure it's wise to make it break plans too.

> 
>> What I imagined is more like a process that regularly walks shared
>> buffers, counts buffers per relation (or relfilenode), stores the
>> aggregated info into some shared memory (so that standby can have its
>> own concept of cache contents).
> 
> That shared memory datastructure basically describes pgstats, no?
> 
> 
>> And then invalidates plans the same way ANALYZE does.
> 
> I'm not sure the invalidation machinery actually fully works in HS (due to
> doing things like incrementing the command counter). It would probably be
> doable to change that though.
> 

Yeah, I'm not pretending I have this thought through, considering I only
started thinking about it about an hour ago.

regards

-- 
Tomas Vondra



