Re: Add a property to automatically suspend portals as they produce given number of bytes - Mailing list pgsql-hackers

From Kirill Reshke
Subject Re: Add a property to automatically suspend portals as they produce given number of bytes
Date
Msg-id CALdSSPiZwxCX0N4C9iwBze17OYzGWYApPTKnAozn3H6fQJdrjw@mail.gmail.com
In response to Add a property to automatically suspend portals as they produce given number of bytes  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
Responses Re: Add a property to automatically suspend portals as they produce given number of bytes
List pgsql-hackers

Hi, a client can use the CURSOR feature to process data in batches. In what case does the proposed feature solve a problem that a CURSOR does not?

https://www.postgresql.org/docs/current/plpgsql-cursors.html
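
For illustration, a minimal JDBC sketch of the cursor-based batching approach; the table and column names, connection details, and batch size are made up for this example:

    import java.sql.*;

    public class CursorBatchDemo {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/postgres", "postgres", "")) {
                conn.setAutoCommit(false);  // cursors live inside a transaction
                try (Statement st = conn.createStatement()) {
                    st.execute("DECLARE c NO SCROLL CURSOR FOR SELECT id FROM big_table");
                    while (true) {
                        int rows = 0;
                        try (ResultSet rs = st.executeQuery("FETCH 100 FROM c")) {
                            while (rs.next()) {
                                rows++;
                                // process rs.getLong(1) here
                            }
                        }
                        if (rows == 0) {
                            break;  // cursor exhausted
                        }
                    }
                    st.execute("CLOSE c");
                }
                conn.commit();
            }
        }
    }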

On Fri, 17 Jan 2025 at 16:08, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
Hi,

Applications often face an "out of memory" condition as they try to fetch "N rows" from the database.
If N is small, then the execution becomes inefficient due to many roundtrips.
If N is large, there's a risk that the fetched rows overflow the client's memory.

Note: the client can't stop reading the data at arbitrary times, as it might want to execute another query
while processing the result set, so "fetch all rows at once and read them from the socket as needed" is not really an option.

The problem is that the client can't tell how much memory it would take to buffer the results of "fetch 100 rows",
since row sizes vary.

Currently, the only case in which the database returns PortalSuspended seems to be when the max_rows limit is reached while executing a SELECT.
I wonder if the database could suspend portals earlier when it knows it has already produced a lot of data.

In other words:
1) If the user calls execute(fetch all rows), then keep the current behavior.
2) If the user provides a non-zero max_rows, then optionally suspend the portal once the result exceeds a pre-configured amount (a GUC?).
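
For reference, the row-count-based suspension above is what a JDBC client already relies on today: with autocommit off and a fetch size set, the PostgreSQL JDBC driver, for instance, limits each Execute message to that many rows and continues the portal whenever the server replies with PortalSuspended. A minimal sketch (connection details and the table name are assumptions):

    import java.sql.*;

    public class FetchSizeDemo {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/postgres", "postgres", "")) {
                conn.setAutoCommit(false);   // required for batched fetching
                try (Statement st = conn.createStatement()) {
                    st.setFetchSize(100);    // row limit per Execute message
                    try (ResultSet rs = st.executeQuery("SELECT id FROM big_table")) {
                        while (rs.next()) {
                            // at most 100 rows are buffered per batch, but
                            // 100 very wide rows may still exhaust client memory
                        }
                    }
                }
                conn.commit();
            }
        }
    }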

Then clients could configure the connection like "ok, please suspend portals if a single fetch exceeds 50MiB".
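
A purely hypothetical usage sketch of that idea, continuing the example above; the GUC name portal_suspend_limit is invented here for illustration and does not exist today:

    try (Statement st = conn.createStatement()) {
        // hypothetical GUC: suspend the portal once a single fetch
        // has produced roughly this much data
        st.execute("SET portal_suspend_limit = '50MB'");
        st.setFetchSize(10000);  // the pessimistic row limit can now be higher,
                                 // since the byte limit bounds memory anyway
        try (ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
            while (rs.next()) {
                // the server would suspend the portal after ~50 MiB
                // even if fewer than 10000 rows have been produced
            }
        }
    }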
To my understanding,
* it should be a backward-compatible change
* it would require no protocol changes
* it would make applications more robust when it comes to "out of memory while fetching many rows"
* it might improve overall performance, as applications could increase their pessimistic max_rows fetch size

Is it ok to contribute a patch like that?

Vladimir
