Re: Adding basic NUMA awareness - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Adding basic NUMA awareness
Date
Msg-id ndvygkpdx44pmi4xbkf52gfrl77cohpefr42tipvd5dgiaeuyd@fe2og2kxyjnc
In response to Re: Adding basic NUMA awareness  (Greg Burd <greg@burd.me>)
Responses Re: Adding basic NUMA awareness
Re: Adding basic NUMA awareness
List pgsql-hackers
Hi,

On 2025-07-09 12:55:51 -0400, Greg Burd wrote:
> On Jul 9 2025, at 12:35 pm, Andres Freund <andres@anarazel.de> wrote:
>
> > FWIW, I've started to wonder if we shouldn't just get rid of the freelist
> > entirely. While clock sweep is perhaps minutely slower in a single thread
> > than the freelist, clock sweep scales *considerably* better [1]. As it's
> > rather rare to be bottlenecked on clock sweep speed for a single thread
> > (rather than IO or memory copy overhead), I think it's worth favoring
> > clock sweep.
>
> Hey Andres, thanks for spending time on this.  I've worked before on
> freelist implementations (last one in LMDB) and I think you're onto
> something.  I think it's an innovative idea and that the speed
> difference will either be lost in the noise or potentially entirely
> mitigated by avoiding duplicate work.

Agreed. FWIW, just using clock sweep actually makes things like DROP TABLE
perform better because it doesn't need to maintain the freelist anymore...


> > Also needing to switch between getting buffers from the freelist and the
> > sweep makes the code more expensive.  I think just having the buffer in
> > the sweep, with a refcount / usagecount of zero would suffice.
>
> If you're not already coding this, I'll jump in. :)

My experimental patch is literally a four character addition ;), namely adding
"0 &&" to the relevant code in StrategyGetBuffer().

Obviously a real patch would need to do some more work than that.  Feel free
to take on that project, I am not planning on tackling that in the near term.
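
To illustrate what "just the sweep" means, here's a toy model of buffer
selection without a freelist. It is not PostgreSQL code: ToyPool, ToyBuffer,
toy_get_victim() and toy_invalidate() are made-up names, only the clock hand is
atomic, and there are no buffer header locks, so treat it as a single-threaded
sketch of the idea rather than what StrategyGetBuffer() actually does.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_NBUFFERS 8          /* hypothetical pool size */

typedef struct ToyBuffer
{
    int         refcount;       /* pins; a pinned buffer cannot be evicted */
    int         usagecount;     /* aged by one each time the hand passes */
} ToyBuffer;

typedef struct ToyPool
{
    ToyBuffer   buffers[TOY_NBUFFERS];
    atomic_uint next_victim;    /* shared clock hand */
} ToyPool;

static void
toy_pool_init(ToyPool *pool)
{
    for (int i = 0; i < TOY_NBUFFERS; i++)
    {
        pool->buffers[i].refcount = 0;
        pool->buffers[i].usagecount = 0;
    }
    atomic_init(&pool->next_victim, 0);
}

/*
 * "Freeing" a buffer: there is no list to maintain, the buffer simply
 * becomes evictable the next time the hand reaches it.
 */
static void
toy_invalidate(ToyPool *pool, int buf_id)
{
    assert(buf_id >= 0 && buf_id < TOY_NBUFFERS);
    pool->buffers[buf_id].usagecount = 0;
}

/* Returns the index of a victim buffer, or -1 if every buffer is pinned. */
static int
toy_get_victim(ToyPool *pool)
{
    int         trycounter = TOY_NBUFFERS;

    for (;;)
    {
        unsigned    hand = atomic_fetch_add(&pool->next_victim, 1) % TOY_NBUFFERS;
        ToyBuffer  *buf = &pool->buffers[hand];

        if (buf->refcount == 0)
        {
            if (buf->usagecount == 0)
                return (int) hand;      /* unpinned and cold: evict */
            buf->usagecount--;          /* age it and keep sweeping */
            trycounter = TOY_NBUFFERS;  /* progress was made, reset */
        }
        else if (--trycounter == 0)
            return -1;                  /* a full lap of pinned buffers */
    }
}
```

In this model an unused buffer needs no special handling: it has refcount and
usagecount of zero, so the sweep picks it up like any other cold buffer.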


There are other things around this that could use some attention. It's not
hard to see clock sweep become a bottleneck in concurrent workloads - partially
due to the shared maintenance of the clock hand. A NUMAed clock sweep would
address that. However, we also maintain StrategyControl->numBufferAllocs, which
is a significant contention point and would not necessarily be removed by a
NUMAification of the clock sweep.
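
As a rough sketch of the direction (again not PostgreSQL code, and every name
below is hypothetical): give each NUMA node its own clock hand over its own
slice of the buffer pool, and keep the allocation counter per node as well, so
neither is a single shared cache line. It assumes the caller already knows
which node it runs on, a 64 byte cache line, and a consumer that is fine with
summing and resetting the per-node counters. It also ignores the hand's
wraparound at 2^32.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 4      /* hypothetical upper bound on NUMA nodes */
#define SKETCH_CACHELINE 64     /* assumed cache line size in bytes */

typedef struct SketchPartition
{
    /* Aligned so each node's hot counters sit on their own cache line. */
    _Alignas(SKETCH_CACHELINE) atomic_uint_fast32_t next_victim;
    atomic_uint_fast32_t num_allocs;    /* per-node allocation count */
    uint32_t    first_buffer;           /* first buffer in this node's slice */
    uint32_t    num_buffers;            /* number of buffers in the slice */
} SketchPartition;

typedef struct SketchSweep
{
    uint32_t        nnodes;
    SketchPartition parts[SKETCH_MAX_NODES];
} SketchSweep;

/* Split nbuffers into nnodes contiguous slices; the last takes the remainder. */
static void
sketch_init(SketchSweep *s, uint32_t nbuffers, uint32_t nnodes)
{
    assert(nnodes > 0 && nnodes <= SKETCH_MAX_NODES && nbuffers >= nnodes);

    uint32_t    per_node = nbuffers / nnodes;

    s->nnodes = nnodes;
    for (uint32_t n = 0; n < nnodes; n++)
    {
        SketchPartition *p = &s->parts[n];

        p->first_buffer = n * per_node;
        p->num_buffers = (n == nnodes - 1) ? nbuffers - p->first_buffer : per_node;
        atomic_init(&p->next_victim, 0);
        atomic_init(&p->num_allocs, 0);
    }
}

/* Advance only the calling node's hand; returns a global buffer index. */
static uint32_t
sketch_next_candidate(SketchSweep *s, uint32_t node)
{
    assert(node < s->nnodes);

    SketchPartition *p = &s->parts[node];
    uint32_t    tick = (uint32_t) atomic_fetch_add(&p->next_victim, 1);

    return p->first_buffer + tick % p->num_buffers;
}

/* Count one allocation against the calling node's counter only. */
static void
sketch_count_alloc(SketchSweep *s, uint32_t node)
{
    assert(node < s->nnodes);
    atomic_fetch_add(&s->parts[node].num_allocs, 1);
}

/* Consumer side: sum and reset all per-node counters. */
static uint32_t
sketch_drain_allocs(SketchSweep *s)
{
    uint32_t    total = 0;

    for (uint32_t n = 0; n < s->nnodes; n++)
        total += (uint32_t) atomic_exchange(&s->parts[n].num_allocs, 0);
    return total;
}
```

The point of keeping the counter in the partition is that splitting the hand
alone leaves every allocation still hitting one shared counter; whether a
summed, slightly stale total is acceptable to whoever reads it is a separate
question the sketch does not answer.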

Greetings,

Andres Freund


