Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring - Mailing list pgsql-hackers

From Andres Freund
Subject Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring
Msg-id lplmrf2al3isx2e26djwxeygmvgf5yf4ksqtgjhgemnpy2hmlu@gw2xt6rfwomd
In response to Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring  (Robert Treat <rob@xzilla.net>)
Responses Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring
List pgsql-hackers
Hi,

On 2025-09-06 09:12:19 -0400, Robert Treat wrote:
> On Tue, Aug 26, 2025 at 9:32 AM Jakub Wartak
> <jakub.wartak@enterprisedb.com> wrote:
> > On Tue, Jul 8, 2025 at 5:22 AM Andres Freund <andres@anarazel.de> wrote:
> > > On 2025-06-30 12:27:10 -0400, Andres Freund wrote:
> > > After addressing most of Greg's and Jim's feedback, I pushed this. I chose not
> > > to increase the log level as Jim suggested, but if we end up deciding that
> > > that's the way to go, we can easily change that...
> >
> > I'm with Jim, as I've just hit it, though not on exit() but on fork(), so:
> >
> > 1. Could we s/DEBUG1/INFO/ that debug message level? (for those two:
> > "cannot use combined memory mapping for io_uring", and maybe add
> > "potential slow new connections" there too along the way?)
> > 2. Maybe we could add some wording to the docs about io_method saying
> > that it might cause such trouble?
> >
> > Just wasted an hour on wondering why $stuff is slow, given:
> >     max_connections = '20000' # yes, yay..
> >     io_method = 'io_uring'
> >
> > I was getting slow fork()/clone() performance when there were
> > lots of io_uring fds/instances in the main postmaster:
> >     $ /usr/pgsql19/bin/pgbench -f select1.sql -c 1000 -j 1 -t 1 -P 1
> >     [..]
> >     progress: 39.7 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
> >     progress: 40.6 s, 1039.9 tps, lat 407.696 ms stddev 291.856, 0 failed
> >     [..]
> >     initial connection time = 39632.164 ms
> >     tps = 1015.608893 (without initial connection time)
> >
> > So yes, ~40s just to connect to the database. I was using an old
> > branch from before June (it did not have f54af9f2679d5987b46), so
> > more or less simulating <= 6.5, as you say. I was limited to 20-30
> > fork() calls per second according to bpftrace. It goes away with the
> > default io_method (~800 fork() calls per second). With
> > max_connections = 2k, I got 5s initial connection times. It looked
> > like it was caused by io_uring, as with io_uring fork() was slow
> > somewhere in vma_interval_tree_insert_after
> > <- copy_process <- kernel_clone <- __do_sys_clone <- do_syscall_64
> > (?). I've tested it on 6.14.17, and also on LTS 6.1.x (the
> > difference is that it takes 65s instead of 40s...). Then I searched
> > and hit this thread, but 6.1 is the LTS kernel, so plenty of people
> > are going to hit those regressions with io_method = io_uring, won't
> > they?

I doubt it, but who knows.


> > I can try to prepare a patch, please just let me know.

Yes, please do.


> Did anything ever happen with this?

No.  I missed the email. So thanks for the reminder.


> I do think it would be helpful to make some of these pot-holes more user
> visible / discoverable.

> I have a suspicion that we're going to see people using pre-built packages
> with io_uring support installed on to older kernels they are still hanging
> on to because pg_upgrade was the easiest path, but that they could either
> update the kernel or upgrade via logical replication to get the new
> functionality if they knew about it.

If they just upgrade in-place, they won't use io_uring. And they won't simply
use io_uring with this large max_connections without also tuning the file
descriptor limits...
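
To illustrate the file descriptor point, here is a minimal sketch of a pre-flight check on a Unix-like system. It assumes the postmaster needs roughly one descriptor per potential backend when io_method = io_uring, which is a reading of this thread rather than something stated in this mail. The function name enough_fds and the reserved headroom of 1000 are hypothetical, chosen only for illustration.

```python
import resource


def enough_fds(soft_limit: int, max_connections: int, reserved: int = 1000) -> bool:
    """Return True if soft_limit covers max_connections plus some headroom.

    Assumption (not confirmed by the source): about one file descriptor is
    needed per potential backend. `reserved` is a hypothetical allowance for
    everything else the process has open.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= max_connections + reserved


if __name__ == "__main__":
    # Soft and hard limits on open files for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # 20000 is the max_connections value quoted earlier in the thread.
    ok = enough_fds(soft, 20000)
    print(f"soft={soft} hard={hard} enough for max_connections=20000: {ok}")
```

Run under the same limits as the server, this only reports whether the soft limit looks plausible for the configured max_connections. It says nothing about the fork() slowdown itself.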

Greetings,

Andres Freund


