Re: Add os_page_num to pg_buffercache - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Add os_page_num to pg_buffercache
Date
Msg-id e2bfe8a0-70c5-49d8-bedb-c643b90930f8@vondra.me
Whole thread Raw
In response to Re: Add os_page_num to pg_buffercache  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses Re: Add os_page_num to pg_buffercache
List pgsql-hackers
On 7/1/25 15:45, Bertrand Drouvot wrote:
> Hi,
> 
> On Thu, Apr 10, 2025 at 03:05:29PM +0000, Bertrand Drouvot wrote:
>> Hi,
>>
>> On Thu, Apr 10, 2025 at 09:58:18AM -0500, Nathan Bossart wrote:
>>> On Thu, Apr 10, 2025 at 03:35:24PM +0200, Tomas Vondra wrote:
>>>> This seems like a good idea in principle, but at this point it has to
>>>> wait for PG19. Please add it to the July commitfest.
>>>
>>> +1.  From a glance, this seems to fall in the "new feature" bucket and
>>> should likely wait for v19.
>>
>> Thank you both for providing your thoughts that confirm my initial doubt. Let's
>> come back to that one later then.
>>
> 
> Here we are.
> 
> Please find attached a rebased version and while at it, v2 adds a new macro and
> a function to avoid some code duplication between pg_buffercache_pages() and
> pg_buffercache_numa_pages().
> 
> So, PFA:
> 
> 0001 - Introduce GET_MAX_BUFFER_ENTRIES and get_buffer_page_boundaries
> 
> Those new macro and function are extracted from pg_buffercache_numa_pages() and
> pg_buffercache_numa_pages() makes use of them.
> 
> 0002 - Add os_page_num to pg_buffercache
> 
> Making use of the new macro and function from 0001.
> 
> As it's for v19, also bumping pg_buffercache's version to 1.7.
> 

Thanks for the updated patch!

I took a quick look on this, and I doubt we want to change the schema of
pg_buffercache like this. Adding columns is fine, but it seems rather
wrong to change the cardinality. The view is meant to be 1:1 mapping for
buffers, but now suddenly it's 1:1 with memory pages. Or rather (buffer,
page), to be precise.

I think this will break a lot of monitoring queries, and possibly in a
very subtle way - especially on systems with huge pages, where most
buffers will have one row, but then a buffer that happens to be split on
two pages will have two rows. That seems not great.

Just look at the changes needed in regression tests, where the first
test now needs to be

  -select count(*) = (select setting::bigint
  +select count(*) >= (select setting::bigint
                    from pg_settings
                    where name = 'shared_buffers')

which seems like a much weaker check.

IMHO it'd be better to have a new view for this info, something like
pg_buffercache_pages, or something like that.

But I'm also starting to question if the patch really is that useful.
Sure, people may not have NUMA support enabled (e.g. on non-linux
platforms), and even if they do the _numa view is quite expensive.

But I don't recall ever asking how the buffers map to memory pages. At
least not before/without the NUMA stuff.


regards

-- 
Tomas Vondra




pgsql-hackers by date:

Previous
From: Melih Mutlu
Date:
Subject: Re: Report replica identity in pg_publication_tables
Next
From: torikoshia
Date:
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row