Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach - Mailing list pgsql-hackers
From | Cédric Villemain |
---|---|
Subject | Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach |
Date | |
Msg-id | c892aa85-9e09-42e5-bf74-2302f9693bf4@data-bene.io |
In response to | Adding basic NUMA awareness (Tomas Vondra <tomas@vondra.me>) |
List | pgsql-hackers |
>> On 7/7/25 16:51, Cédric Villemain wrote:
>>>>> * Others might use it to integrate PostgreSQL's own resources (e.g.,
>>>>> "areas" of shared buffers) into policies.
>>>>>
>>>>> Hope this perspective is helpful.
>>>>
>>>> Can you explain how you want to manage this by an extension defined at
>>>> the SQL level, when most of this stuff has to be done when setting up
>>>> shared memory, which is waaaay before we have any access to catalogs?
>>>
>>> I should have said module instead, I didn't follow carefully but at some
>>> point there were discussion about shared buffers resized "on-line".
>>> Anyway, it was just to give some few examples, maybe this one is to be
>>> considered later (I'm focused on cgroup/psi, and precisely reassigning
>>> PIDs as needed).
>>>
>>
>> I don't know. I have a hard time imagining what exactly would the
>> policies / profiles do exactly to respond to changes in the system
>> utilization. And why should that interfere with this patch ...
>>
>> The main thing patch series aims to implement is partitioning different
>> pieces of shared memory (buffers, freelists, ...) to better work for
>> NUMA. I don't think there's that many ways to do this, and I doubt it
>> makes sense to make this easily customizable from external modules of
>> any kind. I can imagine providing some API allowing to isolate the
>> instance on selected NUMA nodes, but that's about it.
>>
>> Yes, there's some relation to the online resizing of shared buffers, in
>> which case we need to "refresh" some of the information. But AFAICS it's
>> not very extensive (on top of what already needs to happen after the
>> resize), and it'd happen within the boundaries of the partitioning
>> scheme. There's not that much flexibility.
>>
>> The last bit (pinning backends to a NUMA node) is experimental, and
>> mostly intended for easier evaluation of the earlier parts (e.g. to
>> limit the noise when processes get moved to a CPU from a different NUMA
>> node, and so on).
>
> The backend pinning can be done by replacing your patch on proc.c to
> call an external profile manager doing exactly the same thing maybe ?
>
> Similar to:
>    pmroutine = GetPmRoutineForInitProcess();
>    if (pmroutine != NULL &&
>        pmroutine->init_process != NULL)
>        pmroutine->init_process(MyProc);
>
> ...
>
>    pmroutine = GetPmRoutineForInitAuxilliary();
>    if (pmroutine != NULL &&
>        pmroutine->init_auxilliary != NULL)
>        pmroutine->init_auxilliary(MyProc);
>
> Added on some rare places should cover most if not all the requirement
> around process placement (process_shared_preload_libraries() is called
> earlier in the process creation I believe).
>

After a first read, I think this works for patches 002 and 005. For the
latter, InitProcGlobal() may set things up as you do but then expose the
choice a bit later, basically in the places where you added the if
condition on the GUC (numa_procs_interleave).

-- 
Cédric Villemain
+33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D
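
To make the hook idea above more concrete, here is a minimal standalone sketch of the pattern: core code asks for an optional routine table and calls into it only when a module has registered one. Everything in it is an assumption made for illustration. DemoProc stands in for PGPROC, RegisterPmRoutine() and the round-robin callback are hypothetical, and none of these names exist in PostgreSQL or in the patch series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PGPROC; only the fields this sketch needs. */
typedef struct DemoProc
{
    int pid;
    int numa_node;              /* -1 means "not placed by any module" */
} DemoProc;

/* Callbacks a profile-manager module could provide; each may be NULL. */
typedef struct PmRoutine
{
    void (*init_process)(DemoProc *proc);
    void (*init_auxiliary)(DemoProc *proc);
} PmRoutine;

/* Set by a module at load time; stays NULL when no module is loaded. */
static const PmRoutine *registered_pm_routine = NULL;

/* Hypothetical registration entry point, called from a module's init. */
void
RegisterPmRoutine(const PmRoutine *routine)
{
    registered_pm_routine = routine;
}

/* Lookup used by the call site; returns NULL if nothing is registered. */
const PmRoutine *
GetPmRoutineForInitProcess(void)
{
    return registered_pm_routine;
}

/* Call site, shaped like the snippet quoted above. */
void
DemoInitProcess(DemoProc *proc)
{
    const PmRoutine *pmroutine;

    assert(proc != NULL);
    pmroutine = GetPmRoutineForInitProcess();
    if (pmroutine != NULL &&
        pmroutine->init_process != NULL)
        pmroutine->init_process(proc);
}

/* Example module policy (an assumption): round-robin over two nodes. */
#define DEMO_NUM_NODES 2

static void
demo_place_round_robin(DemoProc *proc)
{
    static int next_node = 0;

    proc->numa_node = next_node;
    next_node = (next_node + 1) % DEMO_NUM_NODES;
}

/* Routine table the example module would register; no auxiliary hook. */
static const PmRoutine demo_routine = { demo_place_round_robin, NULL };
```

With no routine registered, DemoInitProcess() leaves the process untouched, so core behaviour is unchanged unless a module opts in. After RegisterPmRoutine(&demo_routine), each call assigns the next node in turn. The real placement step (for example, setting CPU affinity) is left out on purpose, since it depends on what the patch ends up doing.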