Re: Adding basic NUMA awareness - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Adding basic NUMA awareness
Date
Msg-id 659c44a5-f616-492c-ab81-60273d2fe7f6@vondra.me
In response to Re: Adding basic NUMA awareness  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers

On 9/11/25 10:32, Tomas Vondra wrote:
> ...
>
> 8) I've realized some of the TAP tests occasionally fail with
> 
>     ERROR: no unpinned buffers
> 
> and I think I know why. Some of the tests set shared_buffers to a very
> low value - like 1MB or even 128kB, and StrategyGetBuffer() may search
> only a single partition (but not always). We may run out of unpinned
> buffers in that one partition.
> 
> This apparently happens more easily on rpi5, due to the weird NUMA
> layout (there are 8 nodes with memory, but getcpu() reports node 0 for
> all cores).
> 
> I suspect the correct fix is to ensure StrategyGetBuffer() scans all
> partitions, if there are no unpinned buffers in the current one. On
> realistic setups this shouldn't happen very often, I think.
> 
> The other issue I just realized is that StrategyGetBuffer() recalculates
> the partition index over and over, which seems unnecessary (and possibly
> expensive, due to the modulo). And it also does too many loops, because
> it used NBuffers instead of the partition size. I'll fix those later.

Here's a version fixing the "no unpinned buffers" issue (in the 0006
part). It modifies StrategyGetBuffer() to walk through all the
partitions in a round-robin manner, starting with the current one. The
way it steps to the next partition is a bit ugly, but it works, and
I'll think about a better way.
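
To illustrate the idea (this is not the code from the 0006 patch), here is
a minimal sketch of a round-robin scan over partitions. Everything in it
is hypothetical: SketchPartition, sketch_get_buffer() and the plain
"pinned" array are simplified stand-ins, and the real clock sweep also
deals with usage counts and buffer header locking, which I left out. It
only shows the two points discussed above: the partition index is
computed once per partition, and each sweep is bounded by the partition
size rather than by the total number of buffers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified partition descriptor (not a PostgreSQL struct). */
typedef struct SketchPartition
{
    int first_buffer;   /* index of the first buffer in this partition */
    int num_buffers;    /* number of buffers in this partition */
    int next_victim;    /* clock hand, as an offset within the partition */
} SketchPartition;

/*
 * Look for an unpinned buffer, starting in start_partition and moving on
 * to the remaining partitions in round-robin order.
 *
 * Returns the buffer index, or -1 if every buffer in every partition is
 * pinned (the case where the caller would report "no unpinned buffers").
 */
int
sketch_get_buffer(SketchPartition *parts, int num_partitions,
                  int start_partition, const bool *pinned)
{
    assert(num_partitions > 0);
    assert(start_partition >= 0 && start_partition < num_partitions);

    for (int i = 0; i < num_partitions; i++)
    {
        /* Resolve the partition once, not once per buffer inspected. */
        SketchPartition *p = &parts[(start_partition + i) % num_partitions];

        /* One full sweep of this partition only. */
        for (int tries = 0; tries < p->num_buffers; tries++)
        {
            int buf = p->first_buffer + p->next_victim;

            /* Advance the clock hand, wrapping within the partition. */
            p->next_victim = (p->next_victim + 1) % p->num_buffers;

            if (!pinned[buf])
                return buf;
        }
    }

    return -1;
}
```

With a tiny shared_buffers setting, the starting partition may have all
of its buffers pinned. In the sketch, the outer loop then moves on to the
next partition instead of giving up after the first one.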

I haven't done anything about the other issue (the one with huge pages
reserved on NUMA nodes, and SIGBUS).

regards

-- 
Tomas Vondra
Attachment
