Re: java.lang.OutOfMemoryError - Mailing list pgsql-jdbc
From:           Matt Mello
Subject:        Re: java.lang.OutOfMemoryError
Msg-id:         3E3F475A.1050302@spaceship.com
In response to: java.lang.OutOfMemoryError (Paul Cullum <pcullum@N-able.com>)
Responses:      Re: java.lang.OutOfMemoryError
                Re: java.lang.OutOfMemoryError
List:           pgsql-jdbc
Paul Cullum wrote:
> What is the proper method for using JDBC to query result sets which
> are potentially large?

I had a similar problem, but I never got an exception. I will bring that
up in another thread. I didn't see anyone else more knowledgeable respond
to this, so I will give it a shot.

From what I have read of the JDBC driver, it pulls the entire result set
across the socket at execution time, caching it locally in a vector of
rows [or something to that effect]. I, too, used a driver that fetched at
need [Informix], and not all at execution time. It was very nice. :)

From what I have seen on this mailing list, there are possibly some
developers working on enhancing the driver to allow fetch-at-need
[however, I'm not sure if the backend supports it].

Anyway, I believe the current method for handling this is to use the
LIMIT/OFFSET clause in your PG SQL command. See:

http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/sql-select.html#SQL-LIMIT

With that clause, you can control exactly which records you want from the
query. [A rough sketch of this approach is at the end of this message.]

Problems with this? Possibly. Since every "chunk" you ask for is actually
a different query, your results might not be properly contiguous between
queries. For example, someone could insert or delete a record that falls
in the middle of the data you are looking for. This could cause your next
"chunk" to contain a duplicate record from the first chunk (in the case
of an insert by someone else), or to skip a record (in the case of a
deletion). It may be possible to control this situation by doing all the
chunk queries in a single transaction, but I'm not sure. Perhaps someone
more knowledgeable can talk about that. [That idea is also sketched
below.]

For those of us converting from other JDBC drivers that DID fetch at
need, this is truly troublesome. We may have coded more lax systems that
allow users to execute queries that could crash our entire VM under
PG-JDBC! [I have a screen that allows the user to input several variables
for a search query. It is impossible at times for me to know how large
the result set might be.] If you give your users ways to query your
system, then with PG you have to be very careful how much power they are
given. Sometimes it is very hard to prevent large queries from squeaking
by. This makes a driver that reads the whole result set at execution time
a deadly weapon.

Anybody have any ideas about what we can do to speed the development of a
fetch-at-need JDBC driver? Can we patch / extend the driver to let us set
an absolute maximum result set length on a query, and alert us when it
would have been exceeded, WITHOUT causing our system to go haywire /
crash / get an OutOfMemoryError? [A possible stopgap using the standard
JDBC API is sketched below as well.]
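For concreteness, here is a minimal sketch of the LIMIT/OFFSET chunking
approach in plain JDBC. Everything specific in it is made up for
illustration (the connection URL, the credentials, and the "items" table
with its "id" and "name" columns); only the shape of the loop matters.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class ChunkedQuery {
        public static void main(String[] args) throws Exception {
            Class.forName("org.postgresql.Driver");

            // Hypothetical connection details; substitute your own.
            Connection con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb", "user", "password");

            final int CHUNK_SIZE = 500;
            int offset = 0;
            boolean more = true;

            // A stable ORDER BY is essential: without it, the rows behind
            // a given OFFSET can differ from one query to the next.
            PreparedStatement ps = con.prepareStatement(
                    "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?");

            while (more) {
                ps.setInt(1, CHUNK_SIZE);
                ps.setInt(2, offset);
                ResultSet rs = ps.executeQuery();

                int rows = 0;
                while (rs.next()) {
                    rows++;
                    // ... process rs.getInt("id"), rs.getString("name") ...
                }
                rs.close();

                more = (rows == CHUNK_SIZE); // a short chunk means we hit the end
                offset += rows;
            }

            ps.close();
            con.close();
        }
    }

Note that each iteration only ever holds CHUNK_SIZE rows in memory, which
is the whole point: the driver still buffers each chunk fully, but a
chunk is bounded.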
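And here is a fragment sketching the single-transaction idea. My
understanding is that PostgreSQL's SERIALIZABLE isolation gives the whole
transaction a single snapshot of the data, so concurrent inserts and
deletes should not show up between chunks, but I would want someone more
knowledgeable to confirm that. This is meant to wrap the loop from the
previous sketch:

    // Fragment: run all the chunk queries inside one transaction so
    // [if I understand the isolation rules correctly] they all read
    // from the same snapshot.
    con.setAutoCommit(false);
    con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
    try {
        // ... run the LIMIT/OFFSET loop from the previous sketch here ...
        con.commit();
    } catch (SQLException e) {
        con.rollback();
        throw e;
    }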
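As a possible stopgap for the hard-cap question, the standard JDBC API
already has Statement.setMaxRows(). Whether the current PostgreSQL driver
enforces it while reading from the socket, or only after it has buffered
the whole result set (in which case it would not prevent the OOME at
all), is exactly the open question, so test before relying on it. The
MAX_ROWS value and the query below are illustrative only; the trick is to
ask for one row more than the cap so an oversized result can be detected:

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class CappedQuery {
        // Illustrative cap; pick whatever your memory budget allows.
        static final int MAX_ROWS = 10000;

        static void runCapped(Connection con) throws SQLException {
            Statement st = con.createStatement();

            // Ask for one row more than the cap: if MAX_ROWS + 1 rows
            // actually arrive, the full result would have been larger,
            // and we can alert the user instead of blowing up.
            st.setMaxRows(MAX_ROWS + 1);

            ResultSet rs = st.executeQuery("SELECT id, name FROM items");
            int count = 0;
            while (rs.next()) {
                count++;
                if (count > MAX_ROWS) {
                    System.err.println("Query exceeded " + MAX_ROWS
                            + " rows; please narrow the search.");
                    break;
                }
                // ... process the row ...
            }
            rs.close();
            st.close();
        }
    }

If the driver turns out to apply setMaxRows() only after buffering, a
server-side "LIMIT MAX_ROWS + 1" in the SQL itself gives the same
overflow detection without depending on driver behavior.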
-- Matt Mello