Hi,
On 2025-06-03 12:24:38 -0700, MARK CALLAGHAN wrote:
> When measuring the time to create a connection, it is ~2.3X longer with
> io_method=io_uring than with io_method=sync (6.9ms vs 3ms), and the
> postmaster process uses ~3.5X more CPU to create connections.
I can reproduce that - the reason for the slowdown is that we create one
io_uring instance for each potential process, and the way we create them
creates one mmap()ed region for each potential process. That creates extra
overhead, particularly when child processes exit.
> The reproduction case so far is my usage of the Insert Benchmark on a large
> server with 48 cores. I need to fix the benchmark client -- today it
> creates ~1000 connections/s to run a monitoring query in between every 100
> queries and the extra latency from connection create makes results worse
> for one of the benchmark steps.
Heh, yea - 1000 connections/sec will influence performance regardless of this
issue.
> While I can fix the benchmark client to avoid this, I am curious about the
> extra latency in connection create.
>
> I used "perf record -e cycles -F 333 -g -p $pidof_postmaster -- sleep 30"
> but I have yet to find a big difference from the reports generated with
> that for io_method=io_uring vs =sync. It shows that much time is spent in
> the kernel dealing with the VM (page tables, etc).
I see a lot of additional time spent below
do_group_exit->do_exit->...->unmap_vmas
which fits the theory that this is due to the number of memory mappings.
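
To illustrate the effect outside of postgres, here is a rough standalone
sketch (not postgres code, and not how the AIO code sets up its rings). It
creates a number of separate shared anonymous mappings, as a stand-in for one
mapping per io_uring instance, and times fork() plus the child's exit. The
names create_mappings() and time_fork_exit() are made up for this example, and
the numbers will vary between machines and kernels.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Create nmaps separate shared anonymous mappings of len bytes each.
 * The mappings are intentionally never unmapped: they have to exist in
 * the parent so that every forked child inherits them and has to tear
 * them down when it exits.
 * Returns 0 on success, -1 if any mmap() call fails.
 */
static int
create_mappings(int nmaps, size_t len)
{
    for (int i = 0; i < nmaps; i++)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        if (p == MAP_FAILED)
            return -1;
    }
    return 0;
}

/*
 * Fork a child that exits immediately. Returns the wall-clock time in
 * seconds spent in fork() + child exit + waitpid(), or -1.0 on error.
 */
static double
time_fork_exit(void)
{
    struct timespec start, end;
    int         status;
    pid_t       pid;

    clock_gettime(CLOCK_MONOTONIC, &start);
    pid = fork();
    if (pid < 0)
        return -1.0;
    if (pid == 0)
        _exit(0);               /* child: exit right away */
    if (waitpid(pid, &status, 0) != pid)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1.0;
    return (double) (end.tv_sec - start.tv_sec) +
        (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_fork_exit() once, then create_mappings() with a few thousand
mappings, then time_fork_exit() again gives a crude before/after comparison.
Keep the number of mappings well below vm.max_map_count.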
There has been a bunch of discussion around this on mastodon, particularly
below [1], where Jens pointed out that we should use
https://man7.org/linux/man-pages/man3/io_uring_queue_init_mem.3.html to avoid
creating this many memory mappings. That ended with Jens prototyping that
approach [2].
There are a few complications around that though - only newer kernels (>=6.5)
support the caller providing the memory for the mapping and there isn't yet a
good way to figure out how much memory needs to be provided.
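
As a rough sketch of the memory side of that approach (this is not the
prototype from [2]): one large mapping is created up front and each potential
process gets a page-aligned slice of it, which would then be the buffer handed
to io_uring_queue_init_mem(). RingArena, ring_arena_create() and
ring_arena_slice() are hypothetical names, and per_ring_bytes is an assumed
input, since how much memory a ring actually needs is exactly the open
question above.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one region shared by all rings. */
typedef struct RingArena
{
    char       *base;           /* start of the single mapping */
    size_t      slice_size;     /* bytes reserved per ring, page aligned */
    int         nslices;        /* number of potential processes */
} RingArena;

/*
 * Create one mapping large enough for nslices rings of per_ring_bytes
 * each, with every slice rounded up to a multiple of the page size.
 * Returns 0 on success, -1 on failure.
 */
static int
ring_arena_create(RingArena *arena, int nslices, size_t per_ring_bytes)
{
    size_t      page = (size_t) sysconf(_SC_PAGESIZE);
    size_t      slice = (per_ring_bytes + page - 1) / page * page;
    void       *p;

    if (nslices <= 0 || slice == 0)
        return -1;

    p = mmap(NULL, slice * (size_t) nslices, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    arena->base = p;
    arena->slice_size = slice;
    arena->nslices = nslices;
    return 0;
}

/* Return the start of slice i, or NULL if i is out of range. */
static void *
ring_arena_slice(const RingArena *arena, int i)
{
    if (i < 0 || i >= arena->nslices)
        return NULL;
    return arena->base + (size_t) i * arena->slice_size;
}
```

With this layout the number of memory mappings stays at one regardless of how
many potential processes there are, at the cost of having to pick
per_ring_bytes conservatively.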
I think this is a big enough pitfall that it's worth fixing in 18, assuming of
course that the patch has a sensible complexity. RMT, anyone, what do you
think?
Greetings,
Andres Freund
[1] https://fosstodon.org/@axboe/114630982449670090
[2] https://pastebin.com/7M3C8aFH