Re: Optimize LISTEN/NOTIFY - Mailing list pgsql-hackers
| From | Joel Jacobson |
|---|---|
| Subject | Re: Optimize LISTEN/NOTIFY |
| Date | |
| Msg-id | 18521cfd-830e-4e8f-bcb0-eacee535480a@app.fastmail.com |
| In response to | Re: Optimize LISTEN/NOTIFY (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Optimize LISTEN/NOTIFY |
| List | pgsql-hackers |
On Thu, Jan 15, 2026, at 00:09, Tom Lane wrote:
> I wrote:
>> I think that if we have a backend that isn't interested in our
>> notifications, and we can't direct-advance it, we should apply
>> the same behavior that was previously used for backends in other
>> databases. That was basically a conservative approximation to
>> "isn't interested", and I don't see why it wouldn't work fine
>> when we have a more accurate idea of "isn't interested".
>
> I spent some time trying to measure the impact of that point,
> by modifying the test program you posted upthread so that
> some notifiers go at full speed while others respond to the
> rate-limit switch so that they can be made to go slowly.
> I couldn't really see any difference between what you have in v34
> and doing this the old way. Maybe I just failed to construct the
> right test case to stress this behavior. But anyway, without
> concrete evidence for a change I think we should stick to the
> old behavior. When that was put in, it was to ameliorate a
> demonstrable "thundering herd" problem, and it seems likely to me
> that we'll bring that back for some usage patterns if we eagerly
> awaken backends that don't really need to do anything right away.
I reran the old benchmark [1] on my MacBook Pro M3 Max, comparing v34
against v34 with the QUEUE_CLEANUP_DELAY logic added back, and got
almost the same results as before:
```diff
@@ -2244,6 +2256,16 @@ SignalBackends(void)
                 QUEUE_POS_PRECEDES(QUEUE_BACKEND_ADVANCING_POS(i), queueHeadAfterWrite) :
                 QUEUE_POS_PRECEDES(pos, queueHeadBeforeWrite))
         {
+            /*
+             * This backend isn't interested in our notifications. Rather
+             * than wake it repeatedly to skip over messages it doesn't
+             * care about, let the work accumulate so it can be done in
+             * larger, more efficient batches.
+             */
+            if (asyncQueuePageDiff(QUEUE_POS_PAGE(QUEUE_HEAD),
+                                   QUEUE_POS_PAGE(pos)) < QUEUE_CLEANUP_DELAY)
+                continue;
+
             Assert(pid != InvalidPid);
             QUEUE_BACKEND_WAKEUP_PENDING(i) = true;
```
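The effect of that check can be shown in isolation. What follows is a minimal standalone sketch, not the actual async.c code: `page_diff` stands in for asyncQueuePageDiff, `should_wake_uninterested` is a hypothetical helper, and the 4-page threshold is an assumed value for QUEUE_CLEANUP_DELAY chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold, in queue pages; stands in for QUEUE_CLEANUP_DELAY. */
#define SKETCH_CLEANUP_DELAY 4

/* Hypothetical stand-in for asyncQueuePageDiff(): how far p is ahead of q. */
static int64_t
page_diff(int64_t p, int64_t q)
{
    return p - q;
}

/*
 * Decide whether to signal a backend that is not listening on any of the
 * channels being notified.  It is left alone until it has fallen at least
 * SKETCH_CLEANUP_DELAY pages behind the queue head, so that it can skip
 * the uninteresting messages in one larger batch.
 */
static bool
should_wake_uninterested(int64_t head_page, int64_t backend_page)
{
    assert(head_page >= backend_page);
    return page_diff(head_page, backend_page) >= SKETCH_CLEANUP_DELAY;
}
```

Under these assumptions, a backend three pages behind the head is skipped, and one four or more pages behind is signaled.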
# MacBook Pro M3 Max
*** v34:
./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --sleep 0.01 --sleep-exp 1.01
10 s: 155527 sent (15705/s), 155527 received (15705/s)
Notification Latency Distribution:
0.00-0.01ms 0 (0.0%) avg: 0.000ms
0.01-0.10ms # 25110 (16.1%) avg: 0.080ms
0.10-1.00ms ##### 78523 (50.5%) avg: 0.218ms
1.00-10.00ms ### 46798 (30.1%) avg: 4.315ms
10.00-100.00ms # 5096 (3.3%) avg: 13.002ms
>100.00ms 0 (0.0%) avg: 0.000ms
*** v34 with QUEUE_CLEANUP_DELAY patch:
./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --sleep 0.01 --sleep-exp 1.01
10 s: 44410 sent (5017/s), 44411 received (5018/s)
Notification Latency Distribution:
0.00-0.01ms 0 (0.0%) avg: 0.000ms
0.01-0.10ms # 1289 (2.9%) avg: 0.082ms
0.10-1.00ms # 5588 (12.6%) avg: 0.262ms
1.00-10.00ms # 5155 (11.6%) avg: 4.807ms
10.00-100.00ms ##### 23573 (53.1%) avg: 48.444ms
>100.00ms # 8806 (19.8%) avg: 128.657ms
I repeated this three times and got very similar distributions and
averages.
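To compare the two M3 Max runs with a single number, the histogram rows can be collapsed into a count-weighted mean latency. This is a rough sketch: the type and function names are made up for illustration, and because the per-bucket averages are rounded in the output above, the result is only approximate.

```c
#include <assert.h>
#include <stddef.h>

/* One histogram row: number of notifications and their average latency. */
typedef struct
{
    long   count;
    double avg_ms;
} LatencyBucket;

/* Count-weighted mean of the per-bucket averages, in milliseconds. */
static double
weighted_mean_ms(const LatencyBucket *buckets, size_t n)
{
    double total_ms = 0.0;
    long   total = 0;

    for (size_t i = 0; i < n; i++)
    {
        assert(buckets[i].count >= 0);
        total_ms += (double) buckets[i].count * buckets[i].avg_ms;
        total += buckets[i].count;
    }
    return total > 0 ? total_ms / (double) total : 0.0;
}

/* Non-empty rows copied from the M3 Max output above. */
static const LatencyBucket m3_v34[] = {
    {25110, 0.080}, {78523, 0.218}, {46798, 4.315}, {5096, 13.002}
};
static const LatencyBucket m3_delay[] = {
    {1289, 0.082}, {5588, 0.262}, {5155, 4.807},
    {23573, 48.444}, {8806, 128.657}
};
```

Calling `weighted_mean_ms(m3_v34, 4)` gives roughly 1.8 ms, and `weighted_mean_ms(m3_delay, 5)` gives roughly 52 ms.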
However, I completely failed to reproduce this difference on my Intel
and AMD machines!
# AMD Ryzen 9 7950X3D 16-Core Processor:
*** v34:
./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --sleep 0.01 --sleep-exp 1.01
10 s: 199123 sent (19948/s), 199123 received (19948/s)
Notification Latency Distribution:
0.00-0.01ms 0 (0.0%) avg: 0.000ms
0.01-0.10ms ######### 192536 (96.7%) avg: 0.050ms
0.10-1.00ms # 4577 (2.3%) avg: 0.264ms
1.00-10.00ms # 1806 (0.9%) avg: 3.304ms
10.00-100.00ms # 204 (0.1%) avg: 12.579ms
>100.00ms 0 (0.0%) avg: 0.000ms
*** v34 with QUEUE_CLEANUP_DELAY patch:
./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --sleep 0.01 --sleep-exp 1.01
10 s: 185687 sent (19267/s), 185686 received (19266/s)
Notification Latency Distribution:
0.00-0.01ms 0 (0.0%) avg: 0.000ms
0.01-0.10ms ######### 168090 (90.5%) avg: 0.052ms
0.10-1.00ms # 6112 (3.3%) avg: 0.194ms
1.00-10.00ms # 5818 (3.1%) avg: 5.823ms
10.00-100.00ms # 5666 (3.1%) avg: 15.340ms
>100.00ms 0 (0.0%) avg: 0.000ms
# Intel(R) Core(TM) i9-14900K
*** v34:
./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --sleep 0.01 --sleep-exp 1.01
10 s: 107836 sent (11367/s), 107836 received (11370/s)
Notification Latency Distribution:
0.00-0.01ms 0 (0.0%) avg: 0.000ms
0.01-0.10ms # 130 (0.1%) avg: 0.060ms
0.10-1.00ms # 269 (0.2%) avg: 0.227ms
1.00-10.00ms ###### 66886 (62.0%) avg: 8.115ms
10.00-100.00ms ### 40551 (37.6%) avg: 15.211ms
>100.00ms 0 (0.0%) avg: 0.000ms
*** v34 with QUEUE_CLEANUP_DELAY patch:
./pg_async_notify_test --listeners 1 --notifiers 1 --channels 1000 --sleep 0.01 --sleep-exp 1.01
10 s: 97480 sent (10765/s), 97480 received (10764/s)
Notification Latency Distribution:
0.00-0.01ms 0 (0.0%) avg: 0.000ms
0.01-0.10ms # 9980 (10.2%) avg: 0.065ms
0.10-1.00ms # 15703 (16.1%) avg: 0.352ms
1.00-10.00ms #### 47123 (48.3%) avg: 5.831ms
10.00-100.00ms ## 24674 (25.3%) avg: 44.263ms
>100.00ms 0 (0.0%) avg: 0.000ms
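To put the three machines side by side, the relative change in send rate can be computed from the per-second figures above. A minimal sketch, with `relative_change` as a hypothetical helper:

```c
#include <assert.h>

/*
 * Relative change of the patched send rate against the v34 baseline.
 * A negative result means the patched build sent fewer notifications
 * per second.
 */
static double
relative_change(double baseline_per_s, double patched_per_s)
{
    assert(baseline_per_s > 0.0);
    return (patched_per_s - baseline_per_s) / baseline_per_s;
}
```

With the rates reported above, `relative_change(15705, 5017)` is about -0.68 on the M3 Max, `relative_change(19948, 19267)` is about -0.03 on the Ryzen, and `relative_change(11367, 10765)` is about -0.05 on the i9.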
I have no idea what could explain the difference on my M3 Max; I'm not
sure whether it's due to macOS or to the aarch64 CPU. Even so, it's
still much faster than master, so I think this is fine. We can always
come back to this in the future if there is evidence that it is not just
an edge case. Maybe I'll do some benchmarking on an aarch64 server for
fun, to see if it's due to macOS or aarch64, or something else entirely.
I therefore agree with your change of bringing back the "wake laggers"
logic, even though it could possibly cause a few listening backends to
receive their notifications a bit later than they otherwise would.
Hopefully the spared context switches will allow a bit more throughput,
to make up for the increased delivery time of a few notifications.
> So, here's a reworked v35, which also incorporates quite a lot
> of cosmetic modifications as well as some bug fixes (mostly
> to do with being sure we recover from foreseeable problems like
> OOM partway through commit). I think this is pretty close to
> committable, but if you see anything you don't like, let me know.
Nice improvements. I agree with your decision to simplify by getting
rid of advancingPos and simply not touching advancing backends. I
couldn't measure any significant regression from this simplification;
very nice.
Benchmarks on all three of my machines look really good. The benchmark
also tests correctness, verifying that all notifications sent are
delivered, in order, without gaps. All such tests pass.
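For readers unfamiliar with this kind of check, here is a minimal sketch of an "in order, without gaps" test. It assumes each notification carries a monotonically increasing sequence number; that payload format and the helper name are assumptions for illustration, not a description of how pg_async_notify_test is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if seq[0..n-1] is exactly first, first+1, first+2, ...:
 * every notification arrived once, in order, with no gaps.  On failure,
 * *bad_index (if not NULL) is set to the first offending position.
 */
static bool
delivered_in_order(const uint64_t *seq, size_t n, uint64_t first,
                   size_t *bad_index)
{
    assert(seq != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (seq[i] != first + i)
        {
            if (bad_index != NULL)
                *bad_index = i;
            return false;
        }
    }
    return true;
}
```

A missing, duplicated, or reordered notification makes the received value differ from the expected one at some index, so one comparison per message covers all three failure modes.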
/Joel
[1] https://www.postgresql.org/message-id/dab234b5-a10c-4fb5-a2b1-ce725d3e3020%40app.fastmail.com