Streaming ResultSet - Mailing list pgsql-jdbc
From | Kevin Grittner |
---|---|
Subject | Streaming ResultSet |
Date | |
Msg-id | s2dd1ab9.027@gwmta.wicourts.gov Whole thread Raw |
Responses |
Re: Streaming ResultSet
Re: Streaming ResultSet Re: Streaming ResultSet Re: Streaming ResultSet Re: [JDBC] Streaming ResultSet Re: Streaming ResultSet |
List | pgsql-jdbc |
Hello all. Within the PostgreSQL JDBC driver a ResultSet is, by default, streamed back to the client as the rows are generated by the server, and are stored in a collection in RAM. An attempt to get the first row from the ResultSet blocks until every last row is stored in RAM. If various conditions are met, a cursor is used instead, which allows a large result set to be processed without running out of RAM. I'm curious about the reasons that this appoach was taken -- why not return the rows as they are retrieved from the server? I can think of some possible explanations: (1) The time locks are held on the server is minimized by the default behavior. (2) If an exception occurs, you do not receive that data up to the point of the problem. (For example, divide by zero, serialization (deadlock) problems, or broken connections.) (3) The connection can be used for other purposes without conflict. (Other drivers I've used, which normall stream the results fall back on reading the whole result set into RAM when a statement is issued which cannot be run while the result set is pending.) The disadvantages of the current default approach are: (1) The entire result set might not fit in the client RAM. (2) You might want to see partial results from a query which ultimately generates an exception. (3) You can't overlap retrieval of the result set with processing of the rows. The disadvantages of the cursored approach are: (1) It might not be easy to create all of the right conditions for the cursored approach to work. (2) You might think you've met all the conditions for the cursored approach and be wrong. The default behavior might kick in and cause an OutOfMemoryError. (3) There is overhead to creating a cursor for a select statement. Granted, in my tests it was only about 20 microseconds, but that boosted runtime for my simple test case by 3%. If there was a connection property which allowed result sets to return rows as they are retrieved, I would use it. I'm not lobbying very hard for it, however, since it is only 20 microseconds per SELECT statement to use cursors, and I can easily modify my code to use them, although it will have to be an "all or nothing" change in the framework, so the cost is paid on every SELECT statement. I'm just expressing interest, should the feature be under consideration. -Kevin
pgsql-jdbc by date: