On 08/08/2025 17:01, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> I noticed that there's a similar, existing case in getNotify(), where
>> libpq just hangs if an allocation fails. To simulate that, apply this
>> change and use LISTEN/NOTIFY:
>
>> --- a/src/interfaces/libpq/fe-protocol3.c
>> +++ b/src/interfaces/libpq/fe-protocol3.c
>> @@ -1587,7 +1587,7 @@ getNotify(PGconn *conn)
>> if (pqGets(&conn->workBuffer, conn))
>> return EOF;
>> /* must save name while getting extra string */
>> - svname = strdup(conn->workBuffer.data);
>> + svname = NULL;
>> if (!svname)
>> return EOF;
>> if (pqGets(&conn->workBuffer, conn))
>
>> Returning EOF means "not enough data", which is wrong here just like in
>> getBackendKeyData().
>
> The implication of "return EOF" is "try again later", which seems like
> about the best thing we can do.
We could:
a) report an error,
b) silently discard the async notification and move one, or
c) disconnect.
There's another malloc() call in the same getNotify() function, and if
that fails, we silently discard the notification and move on (i.e.
option b). So it's inconsistent.
The current behavior is "hang until more data is received from the
server, then try again". I think any of the other options would be
better. There's no guarantee that more data will ever arrive, the
connection might be used just to wait for the notification.
> If memory serves, other places that construct query results do
> likewise.
It's a mix. Most functions that are related to queries, e.g.
getRowDescriptions(), construct an error result object. If we run out of
space in allocating the input buffer in pqCheckInBufferSpace(), we
disconnect.
Another weird error handling: functions like pqGetInt() and pqGets()
don't check that you don't read past the end of the message. They do
check that they don't read past the input buffer, but the input buffer
might contain another message after the current one, and the functions
will merrily read past the message boundary. We do check for that later,
report the "message contents do not agree with length ..." message, and
resync. But it's weird that we don't check for that earlier already.
- Heikki