Re: Adding basic NUMA awareness - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: Adding basic NUMA awareness |
| Date | |
| Msg-id | 2db78610-b480-4aa0-a1b6-57f1c2dcb708@vondra.me |
| In response to | Re: Adding basic NUMA awareness (Andres Freund <andres@anarazel.de>) |
| Responses | Re: Adding basic NUMA awareness; Re: Adding basic NUMA awareness |
| List | pgsql-hackers |
On 1/13/26 01:24, Andres Freund wrote:
> Hi,
>
> On 2026-01-12 19:10:00 -0500, Andres Freund wrote:
>> On 2026-01-13 00:58:49 +0100, Tomas Vondra wrote:
>>> On 1/10/26 02:42, Andres Freund wrote:
>>>> psql -Xq -c 'SELECT pg_buffercache_evict_all();' -c 'SELECT numa_node, sum(size) FROM pg_shmem_allocations_numa GROUP BY 1;' && perf stat --per-socket -M memory_bandwidth_read,memory_bandwidth_write -a psql -c 'SELECT sum(abalance) FROM pgbench_accounts;'
>>>
>>> And then I initialized pgbench with scale that is much larger than
>>> shared buffers, but fits into RAM. So cached, but definitely > NB/4. And
>>> then I ran
>>>
>>> select * from pgbench_accounts offset 1000000000;
>>>
>>> which does a sequential scan with the circular buffer you mention above
>>
>> Did you try it with the query I suggested? One plausible reason why you did
>> not see an effect with your query is that with a huge offset you actually
>> never deform the tuple, which is an important and rather latency sensitive
>> path.
>
> Btw, this doesn't need anywhere close to as much data, it should be visible as
> soon as you're >> L3.
>
> To show why
> SELECT * FROM pgbench_accounts OFFSET 100000000
> doesn't show an effect but
> SELECT sum(abalance) FROM pgbench_accounts;
>
> does, just look at the difference using the perf command I posted. Here on a
> scale 200.
>

OK, I tried with a smaller scale (and larger shared buffers, to make the
data set smaller than NBuffers/4).

On the azure VM (scale 200, 32GB sb), there's still no difference:

numactl --membind 0 --cpunodebind 0    297.770 ms
numactl --membind 0 --cpunodebind 1    297.924 ms

and on xeon (scale 100, 8GB sb), there's a bit of a difference:

numactl --membind 0 --cpunodebind 0    236.451 ms
numactl --membind 0 --cpunodebind 1    298.418 ms

So roughly 20%. There's also a bigger difference in perf, about
5944.3 MB/s vs. 5202.3 MB/s.

> Interestingly I do see a performance difference, albeit a smaller one, even
> with OFFSET.
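As a back-of-the-envelope check of those figures (plain Python arithmetic on the timings and bandwidths quoted above, nothing Postgres-specific):

```python
# Timings from the xeon box (ms), as reported above.
local_ms = 236.451   # --membind 0 --cpunodebind 0 (CPU on the memory node)
remote_ms = 298.418  # --membind 0 --cpunodebind 1 (CPU on the other node)

# Local execution is ~21% faster than remote -- the "roughly 20%" figure.
speedup = 1 - local_ms / remote_ms
print(f"local vs remote: {speedup:.1%} faster")   # 20.8%

# Corresponding read-bandwidth difference from perf (MB/s).
local_bw, remote_bw = 5944.3, 5202.3
bw_drop = 1 - remote_bw / local_bw
print(f"bandwidth drop: {bw_drop:.1%}")           # 12.5%
```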
> I see similar numbers on two different 2 socket machines.

I wonder how significant the number of sockets is. The Azure VM is a
single socket with 2 NUMA nodes, so maybe the latency differences are
not significant enough to affect this kind of test. The xeon is a
2-socket machine, but it's also older (~10y).

regards

--
Tomas Vondra