Thread: experience sharing: select query returns more records than necessary
To share with you: I am using a select statement to retrieve the first 50 records found in a large table by reading the result set one by one (up to 50). However, the query execution (the call to executeQuery in JDBC) takes quite a lot of time, e.g., 10 seconds, and quite a lot of records are returned, as shown in the network packets captured. In contrast, if I issue the query on the DB server in the psql console, it returns records almost immediately. Finally I found that it is because the PostgreSQL JDBC driver is pre-fetching a lot of (all?) the records. To fix the problem, one can call setFetchSize(50) on the statement.

--
Kent Tong
Wicket tutorials freely available at http://www.agileskills2.org/EWDW
Axis2 tutorials freely available at http://www.agileskills2.org/DWSAA
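A minimal sketch of that fix (the connection URL, table and column names below are placeholders; note that, as far as I know, the pgjdbc driver only uses a cursor and honours the fetch size when autocommit is off and the result set is forward-only, which is the default for createStatement()):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchFirst50 {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "secret");
        // Cursor-based fetching in pgjdbc needs autocommit off and a
        // forward-only result set (the default for createStatement()).
        conn.setAutoCommit(false);
        Statement stmt = conn.createStatement();
        stmt.setFetchSize(50); // fetch 50 rows per round trip instead of everything
        ResultSet rs = stmt.executeQuery("SELECT id, name FROM big_table");
        int count = 0;
        while (rs.next() && count < 50) {
            System.out.println(rs.getLong("id") + " " + rs.getString("name"));
            count++;
        }
        rs.close();
        stmt.close();
        conn.commit();
        conn.close();
    }
}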
> In contrast, if I issue the query on the DB server in the psql console, it
> returns records almost immediately. Finally I found that it is because the
> PostgreSQL JDBC driver is pre-fetching a lot of (all?) the records. To fix
> the problem, one can call setFetchSize(50) on the statement.
psql doesn't do anything differently. It just doesn't have to create objects, and you are likely running it locally on the server.
And yes, setFetchSize is the correct way to handle this.
Dave
Dave Cramer wrote:
>> In contrast, if I issue the query on the DB server in the psql console, it
>> returns records almost immediately. Finally I found that it is because the
>> PostgreSQL JDBC driver is pre-fetching a lot of (all?) the records. To fix
>> the problem, one can call setFetchSize(50) on the statement.
>
> psql doesn't do anything differently. It just doesn't have to create
> objects, and you are likely running it locally on the server.

Well, not exactly. The difference is that psql begins outputting as soon as the first result arrives, while the driver (in non-cursor mode) gathers the entire resultset before returning anything to the application. So while the overall query execution time is going to be similar (excepting object creation, etc.), the query *latency* is higher with JDBC.

Though, I would suggest that if you really only care about the first 50 results, then put a LIMIT 50 in your query! The query planner may come up with a better plan if you do that, too.

If you want to "stream" the whole resultset, then fetchsize is certainly the way to do it.

-O
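A small sketch of the LIMIT suggestion, using hypothetical table and column names (big_table, id, name):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class LimitFifty {
    // Sketch only: table and column names are placeholders.
    static void printFirst50(Connection conn) throws SQLException {
        // With LIMIT 50 the server itself stops after 50 rows, so the driver
        // never has to buffer the rest of the table, whatever the fetch size is.
        PreparedStatement stmt = conn.prepareStatement(
                "SELECT id, name FROM big_table ORDER BY id LIMIT 50");
        ResultSet rs = stmt.executeQuery();
        while (rs.next()) {
            System.out.println(rs.getLong("id") + " " + rs.getString("name"));
        }
        rs.close();
        stmt.close();
    }
}

The ORDER BY is there only because "first 50" is well defined only under some ordering; drop it if any 50 rows will do.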
Oliver Jowett wrote:
> Though, I would suggest that if you really only care about the first 50
> results, then put a LIMIT 50 in your query! The query planner may come
> up with a better plan if you do that, too.
>
> If you want to "stream" the whole resultset, then fetchsize is certainly
> the way to do it.

Yeah, I was actually settling on "LIMIT 50" before coming across setFetchSize(). However, the support for LIMIT varies from one DBMS to another. To keep the code as portable as possible, I decided to go with setFetchSize().

--
Kent Tong
On Tue, Jan 20, 2009 at 9:43 AM, Kent Tong <kent@cpttm.org.mo> wrote:
> Oliver Jowett wrote:
>> Though, I would suggest that if you really only care about the first 50
>> results, then put a LIMIT 50 in your query! The query planner may come
>> up with a better plan if you do that, too.
>>
>> If you want to "stream" the whole resultset, then fetchsize is certainly
>> the way to do it.
>
> Yeah, I was actually settling on "LIMIT 50" before coming across
> setFetchSize(). However, the support for LIMIT varies from one DBMS to
> another. To keep the code as portable as possible, I decided to go with
> setFetchSize().
Kent,
Check the difference in the plans. Using LIMIT can invoke an index that might not otherwise be used. It can make an orders-of-magnitude difference in these kinds of queries.
Dave
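For example, a quick way to compare the two plans from JDBC (big_table and the queries below are placeholders; the same EXPLAIN statements can equally be run in psql):

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ComparePlans {
    // Print the planner's output for a query; EXPLAIN returns one text column per plan line.
    static void explain(Connection conn, String sql) throws SQLException {
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("EXPLAIN " + sql);
        System.out.println("Plan for: " + sql);
        while (rs.next()) {
            System.out.println("  " + rs.getString(1));
        }
        rs.close();
        stmt.close();
    }

    static void compare(Connection conn) throws SQLException {
        // With a suitable index on id, the LIMIT form may switch from a
        // sequential scan of the whole table to a short index scan.
        explain(conn, "SELECT id, name FROM big_table ORDER BY id");
        explain(conn, "SELECT id, name FROM big_table ORDER BY id LIMIT 50");
    }
}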