Re: Optimize LISTEN/NOTIFY - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Optimize LISTEN/NOTIFY
Date
Msg-id 803584.1768503335@sss.pgh.pa.us
Whole thread Raw
In response to Re: Optimize LISTEN/NOTIFY  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Fascinating.  I was doing my testing on Intel (RHEL8).  I'd bet a good
> deal that this is more about the OS than the hardware.  I wonder if
> newer Linux versions behave differently.
> I can try to reproduce your results tomorrow on macOS (M4 Pro chip).

It does seem like macOS behaves noticeably differently than Linux.

On a 2024 Mac Mini M4 Pro, running current macOS (Tahoe 26.2),
I get results like these from your preferred test case:

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --sleep 0.01 --sleep-exp 1.01 --duration 10

today's HEAD:
10 s: 6211 sent (657/s), 6217 received (566/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms                0 (0.0%) avg: 0.000ms
 1.00-10.00ms               0 (0.0%) avg: 0.000ms
 10.00-100.00ms  #          77 (1.2%) avg: 58.837mss
>100.00ms       #########  6140 (98.8%) avg: 725.985ms

v34 patch:
10 s: 143733 sent (14618/s), 143733 received (14630/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #          11384 (7.9%) avg: 0.084ms
 0.10-1.00ms     ######     93357 (65.0%) avg: 0.266ms
 1.00-10.00ms    ##         36566 (25.4%) avg: 3.525ms
 10.00-100.00ms  #          2426 (1.7%) avg: 20.367mss
>100.00ms                  0 (0.0%) avg: 0.000ms

v35 patch:
10 s: 43872 sent (4652/s), 43870 received (4651/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #          267 (0.6%) avg: 0.084ms
 0.10-1.00ms     #          2105 (4.8%) avg: 0.337mss
 1.00-10.00ms    #          3861 (8.8%) avg: 5.372mss
 10.00-100.00ms  #######    31756 (72.4%) avg: 51.454ms
>100.00ms       #          5881 (13.4%) avg: 117.087ms


But on my Intel workstation (Xeon W-2245, up-to-date RHEL 8)
it looks like this:

HEAD:
10 s: 15324 sent (1565/s), 15326 received (1567/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms                0 (0.0%) avg: 0.000ms
 1.00-10.00ms    #          15 (0.1%) avg: 6.743ms
 10.00-100.00ms  #          92 (0.6%) avg: 56.650ms
>100.00ms       #########  15219 (99.3%) avg: 253.127ms

v34:
10 s: 198891 sent (20011/s), 198890 received (20010/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #######    158004 (79.4%) avg: 0.067ms
 0.10-1.00ms     #          39661 (19.9%) avg: 0.183ms
 1.00-10.00ms    #          1051 (0.5%) avg: 2.832ms
 10.00-100.00ms  #          175 (0.1%) avg: 14.321ms
>100.00ms                  0 (0.0%) avg: 0.000ms

v35:
10 s: 190192 sent (19966/s), 190192 received (19966/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #######    151957 (79.9%) avg: 0.063ms
 0.10-1.00ms     #          31932 (16.8%) avg: 0.207ms
 1.00-10.00ms    #          2693 (1.4%) avg: 3.768ms
 10.00-100.00ms  #          3610 (1.9%) avg: 25.191mss
>100.00ms                  0 (0.0%) avg: 0.000ms

This doesn't make a lot of sense if you compare the hardware specs:
the M4 Pro has more than double the geekbench ratings of the W-2245,
yet it runs these tests much more slowly.  I think perhaps there is
something about the way we do sleep/wakeup on macOS that is not as
well optimized as Linux.


Also, I was testing various cases that just go as fast as possible,
no --sleep.  On the Mac (ignoring plain HEAD, it's not in the
same league):

v34:

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 100 --duration 10
10 s: 674530 sent (67781/s), 674529 received (67781/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms     #          1 (0.0%) avg: 0.572ms
 1.00-10.00ms    #########  674528 (100.0%) avg: 1.513ms
 10.00-100.00ms             0 (0.0%) avg: 0.000ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 10 --notifiers 1 --channels 100 --duration 10
10 s: 81097 sent (8015/s), 810962 received (80163/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms     #          64 (0.0%) avg: 0.497ms
 1.00-10.00ms    ##         224131 (27.6%) avg: 9.105ms
 10.00-100.00ms  #######    586772 (72.4%) avg: 14.290ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --duration 10
10 s: 363167 sent (36772/s), 363167 received (36772/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms                0 (0.0%) avg: 0.000ms
 1.00-10.00ms    #          2 (0.0%) avg: 5.912ms
 10.00-100.00ms  #########  362165 (99.7%) avg: 27.450ms
>100.00ms       #          1000 (0.3%) avg: 129.015ms

v35:

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 100 --duration 10
10 s: 707180 sent (70699/s), 707179 received (70698/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #          1 (0.0%) avg: 0.084ms
 0.10-1.00ms     #          43 (0.0%) avg: 0.862ms
 1.00-10.00ms    #########  707135 (100.0%) avg: 1.441ms
 10.00-100.00ms             0 (0.0%) avg: 0.000ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 10 --notifiers 1 --channels 100 --duration 10
10 s: 76301 sent (7632/s), 763000 received (76325/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms     #          28 (0.0%) avg: 0.886ms
 1.00-10.00ms    #          91294 (12.0%) avg: 9.351ms
 10.00-100.00ms  ########   671680 (88.0%) avg: 13.953ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --duration 10
10 s: 313326 sent (36772/s), 313325 received (36771/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms                0 (0.0%) avg: 0.000ms
 1.00-10.00ms               0 (0.0%) avg: 0.000ms
 10.00-100.00ms  #########  302951 (96.7%) avg: 27.686ms
>100.00ms       #          10375 (3.3%) avg: 164.892mss


But on the Linux box:

v34:

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 100 --duration 10
10 s: 914323 sent (92641/s), 914322 received (92641/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #          13 (0.0%) avg: 0.045ms
 0.10-1.00ms     #          17 (0.0%) avg: 0.598ms
 1.00-10.00ms    #########  914292 (100.0%) avg: 1.105ms
 10.00-100.00ms             0 (0.0%) avg: 0.000ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 10 --notifiers 1 --channels 100 --duration 10
10 s: 205412 sent (20655/s), 2054112 received (206550/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms     #          149 (0.0%) avg: 0.455ms
 1.00-10.00ms    #########  2053945 (100.0%) avg: 4.888ms
 10.00-100.00ms  #          18 (0.0%) avg: 10.430ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --duration 10
10 s: 331651 sent (33458/s), 331649 received (33458/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms     #          1 (0.0%) avg: 0.999ms
 1.00-10.00ms    #          1 (0.0%) avg: 1.035ms
 10.00-100.00ms  #########  331648 (100.0%) avg: 30.134ms
>100.00ms                  0 (0.0%) avg: 0.000ms

v35:

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 100 --duration 10
10 s: 940448 sent (95016/s), 940447 received (95016/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #          11 (0.0%) avg: 0.075ms
 0.10-1.00ms     #          557 (0.1%) avg: 0.660ms
 1.00-10.00ms    #########  939880 (99.9%) avg: 1.074ms
 10.00-100.00ms             0 (0.0%) avg: 0.000ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 10 --notifiers 1 --channels 100 --duration 10
10 s: 208430 sent (21033/s), 2084298 received (210329/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms     #          121 (0.0%) avg: 0.071ms
 0.10-1.00ms     #          455 (0.0%) avg: 0.493ms
 1.00-10.00ms    #########  2082795 (99.9%) avg: 4.813ms
 10.00-100.00ms  #          929 (0.0%) avg: 11.882ms
>100.00ms                  0 (0.0%) avg: 0.000ms

$ ./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --duration 10
10 s: 351154 sent (35975/s), 351153 received (35976/s)
Notification Latency Distribution:
 0.00-0.01ms                0 (0.0%) avg: 0.000ms
 0.01-0.10ms                0 (0.0%) avg: 0.000ms
 0.10-1.00ms                0 (0.0%) avg: 0.000ms
 1.00-10.00ms               0 (0.0%) avg: 0.000ms
 10.00-100.00ms  ########## 351154 (100.0%) avg: 28.460ms
>100.00ms                  0 (0.0%) avg: 0.000ms


Also, I looked at "perf" results for the as-fast-as-possible
runs, and was interested to see that the directly notify-related
logic accounts for only 10%-15% of the runtime.  The rest is going
into generic transaction housekeeping, client I/O, kernel overhead,
and so on.  So that bolsters my feeling that we should be minimizing
process wakeups rather than trying to optimize anything more in
the notify processing itself.

Anyway, at this point I'm content to go ahead with v35, and
I'll push that in a little bit.  Perhaps we should take a TODO
to figure out why this test scenario runs so poorly on macOS;
but I'll bet that the answer is not anywhere near async.c itself.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Kirill Reshke
Date:
Subject: Re: io_uring: Fix danger of completion getting reused before being read
Next
From: Andres Freund
Date:
Subject: Re: Client-only Meson Build From Sources