Re: Optimize LISTEN/NOTIFY - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Optimize LISTEN/NOTIFY
Date
Msg-id 2790345f-03c3-4cae-8f14-886ed9079319@app.fastmail.com
Whole thread Raw
In response to Re: Optimize LISTEN/NOTIFY  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: Optimize LISTEN/NOTIFY
List pgsql-hackers
On Wed, Oct 8, 2025, at 05:43, Chao Li wrote:
> After several rounds of reviewing, the code is already very good. I
> just got a few small comments:

Thanks for feedback!

The below changes have been incorporated into the v12 version
sent in my previous email.

>> On Oct 8, 2025, at 03:26, Joel Jacobson <joel@compiler.org> wrote:
>>
>>
>> /Joel<optimize_listen_notify-v11.patch>
>
>
> 1
> ```
> + channels = GetPendingNotifyChannels();
> +
>  LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE);
> - for (ProcNumber i = QUEUE_FIRST_LISTENER; i != INVALID_PROC_NUMBER; i
> = QUEUE_NEXT_LISTENER(i))
> + foreach(lc, channels)
> ```
>
> I don’t see where “channels” is freed. GetPendingNotifyChannels()
> creates a list of Nodes, both the list and Nodes the list points to
> should be freed.

Per suggestion from Tom Lane I reverted back GetPendingNotifyChannels(),
so this comment is not applicable any longer.

> 2
> ```
> + foreach(lc, channels)
>  {
> - int32 pid = QUEUE_BACKEND_PID(i);
> - QueuePosition pos;
> + char   *channel = strVal(lfirst(lc));
> + ChannelEntry *entry;
> + ProcNumber *listeners;
> + ChannelHashKey key;
>
> - Assert(pid != InvalidPid);
> - pos = QUEUE_BACKEND_POS(i);
> - if (QUEUE_BACKEND_DBOID(i) == MyDatabaseId)
> + if (channel_hash == NULL)
> + entry = NULL;
> + else
> ```
>
> I wonder whether or not “channel_hash” can be NULL here? Maybe possible
> if a channel is un-listened while the event is pending?

Yes, I think channelHash can be NULL here if doing a NOTIFY
when there hasn't been a LISTEN yet.

> So, maybe add a comment here to explain the logic.

Not sure I think that's necessary.
What do you suggest that comment would say?

> 3
> The same piece of code as 2.
>
> I think the code can be optimized a little bit. First, we can
> initialize entry to NULL, then we don’t the if-else. Second, “key” is
> only used for dshash_find(), so it can defined where it is used.
>
> foreach(lc, channels)
> {
> char *channel = strVal(lfirst(lc));
> ChannelEntry *entry = NULL;
> ProcNumber *listeners;
> //ChannelHashKey key;
>
> if (channel_hash != NULL)
> {
> ChannelHashKey key;
> ChannelHashPrepareKey(&key, MyDatabaseId, channel);
> entry = dshash_find(channel_hash, &key, false);
> }
>
> if (entry == NULL)
> continue; /* No listeners registered for this channel */

Nice, I agree that's more readable, I changed it like that.

> 4
> ```
> + if (signaled[i] || QUEUE_BACKEND_WAKEUP_PENDING(i))
> + continue;
> ```
>
> I wonder if “signaled[i]” is a duplicate flag of
> "QUEUE_BACKEND_WAKEUP_PENDING(i)”?
>
> I understand signaled is local, and QUEUE_BACKEND_WAKEUP_PENDING is in
> shared memory and may be set by other processes, but in local, when
> signaled[I] is set, QUEUE_BACKEND_WAKEUP_PENDING(i) is also set. And
> because of NotifyQueueLock, other process should not be able to cleanup
> the flag.
>
> But if “signals” is really needed, maybe we can use Bitmapset
> (src/backend/nodes/bitmapset.c), that would use 1/8 of memories
> comparing to the bool array.

I agree, since we're holding an exclusive lock, the signaled array is reundant.
I've removed it, so that we rely only on the wakeupPending flag.

> 5
> ```
>  /*
> @@ -1865,6 +2087,7 @@ asyncQueueReadAllNotifications(void)
>  LWLockAcquire(NotifyQueueLock, LW_SHARED);
>  /* Assert checks that we have a valid state entry */
>  Assert(MyProcPid == QUEUE_BACKEND_PID(MyProcNumber));
> + QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) = false;
> ```
>
> This piece of code originally only read the shared memory, so it can
> use LW_SHARED lock mode, but now it writes to the shared memory, do we
> need to change the lock mode to “exclusive”?

No, LW_SHARED is sufficient here, since the backend only modifies its own state,
and no other backend could do that, without holding an exclusive lock.

> 6
> ```
> +static inline void
> +ChannelHashPrepareKey(ChannelHashKey *key, Oid dboid, const char *channel)
> +{
> + memset(key, 0, sizeof(ChannelHashKey));
> + key->dboid = dboid;
> + strlcpy(key->channel, channel, NAMEDATALEN);
> +}
> ```
>
> Do we really need the memset()? If “channel” is of length NAMEDATALEN,
> then it still results in a non-0 terminated key->channel; if channel is
> shorter than NAMEDATALEN, strlcpy will auto add a tailing ‘\0’. I think
> previous code should have ensured length of channel should be less than
> NAMEDATALEN.

Yes, I think we need memset, since I fear that when the hash table keys
are compared, every byte of the struct might be inspected, so without
zero-initializing it, there could be unused bytes after the null
terminator, that could then cause logically identical keys to be wrongly
considered different.

I haven't checked the implementation though, but my gut feeling says
it's better to be a bit paranoid here.

> 7
> ```
>   *
>   * Resist the temptation to make this really large.  While that would save
>   * work in some places, it would add cost in others.  In particular, this
> @@ -246,6 +280,7 @@ typedef struct QueueBackendStatus
>  Oid dboid; /* backend's database OID, or InvalidOid */
>  ProcNumber nextListener; /* id of next listener, or INVALID_PROC_NUMBER */
>  QueuePosition pos; /* backend has read queue up to here */
> + bool wakeup_pending; /* signal sent but not yet processed */
>  } QueueBackendStatus;
> ```
>
> In the same structure, rest of fields are all in camel case, I think
> it’s better to rename the new field to “wakeupPending”.
>
> 8
> ```
> @@ -288,11 +323,91 @@ typedef struct AsyncQueueControl
>  ProcNumber firstListener; /* id of first listener, or
>   * INVALID_PROC_NUMBER */
>  TimestampTz lastQueueFillWarn; /* time of last queue-full msg */
> + dsa_handle channel_hash_dsa;
> + dshash_table_handle channel_hash_dsh;
>  QueueBackendStatus backend[FLEXIBLE_ARRAY_MEMBER];
> ```
>
> Same as 7, but in this case, type names are not camel case, maybe okay
> for field names. I don’t have a strong opinion here.

I've did a major renaming of all new code, to better match the casing style.
It seems like helper functions and fields areNamedLikeThis, while
API-functions AreNamedLikeThis.

If we don't like this naming, I'm happy to change it again, please advise.

/Joel



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Eager aggregation, take 3
Next
From: Ants Aasma
Date:
Subject: Re: Should we update the random_page_cost default value?