Thread: experience sharing: select query returns more records than necessary

experience sharing: select query returns more records than necessary

From
Kent Tong
Date:
To share with you:

I am using a select statement to try to retrieve the first 50 records found
in a large table by reading the result set one by one (up to 50). However,
the query execution (the call to executeQuery in JDBC) takes quite a lot of
time, eg, 10 seconds and quite a lot of records were returned as shown in
the network packets captured.

In contrast, if I issue the query on the DB server in the psql console, it
returns records almost immediately. Finally I found that it is because the
postgreSQL JDBC driver is pre-fetching a lot (all?) of the records. To fix
the problem, one can call setFetchSize(50) on the statement.


-----
--
Kent Tong
Wicket tutorials freely available at http://www.agileskills2.org/EWDW
Axis2 tutorials freely available at http://www.agileskills2.org/DWSAA
--
View this message in context:
http://www.nabble.com/experience-sharing%3A-select-query-returns-more-records-than-necessary-tp21556532p21556532.html
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.


Re: experience sharing: select query returns more records than necessary

From
"Dave Cramer"
Date:



In contrast, if I issue the query on the DB server in the psql console, it
returns records almost immediately. Finally I found that it is because the
postgreSQL JDBC driver is pre-fetching a lot (all?) of the records. To fix
the problem, one can call setFetchSize(50) on the statement.

psql doesnt do anything any differently. It just doesn't have to create objects, and you are likely local to the network.

And yes, setFetchSize is the correct way to handle this.

Dave

Re: experience sharing: select query returns more records than necessary

From
Oliver Jowett
Date:
Dave Cramer wrote:
>
>
>
>     In contrast, if I issue the query on the DB server in the psql
>     console, it
>     returns records almost immediately. Finally I found that it is
>     because the
>     postgreSQL JDBC driver is pre-fetching a lot (all?) of the records.
>     To fix
>     the problem, one can call setFetchSize(50) on the statement.
>
> psql doesnt do anything any differently. It just doesn't have to create
> objects, and you are likely local to the network.

Well, not exactly. The difference is that psql begins outputting as soon
as the first result arrives, while the driver (in non-cursor mode)
gathers the entire resultset before returning anything to the application.

So while the overall query execution time is going to be similar
(excepting object creation, etc), the query *latency* is higher with JDBC.

Though, I would suggest that if you really only care about the first 50
results, then put a LIMIT 50 in your query! The query planner may come
up with a better plan if you do that, too ..

If you want to "stream" the whole resultset, then fetchsize is certainly
the way to do it.

-O

Re: experience sharing: select query returns more records than necessary

From
Kent Tong
Date:


Oliver Jowett wrote:
>
> Dave Cramer wrote:
> Though, I would suggest that if you really only care about the first 50
> results, then put a LIMIT 50 in your query! The query planner may come
> up with a better plan if you do that, too ..
>
> If you want to "stream" the whole resultset, then fetchsize is certainly
> the way to do it.
>

Yeah, I was actually settling on "LIMIT 50" before coming across
setFetchSize().
However, the support for LIMIT varies from one DBMS to another. To keep
the code as portable as possible, I decided to go with setFetchSize().


-----
--
Kent Tong
Wicket tutorials freely available at http://www.agileskills2.org/EWDW
Axis2 tutorials freely available at http://www.agileskills2.org/DWSAA
--
View this message in context:
http://www.nabble.com/experience-sharing%3A-select-query-returns-more-records-than-necessary-tp21556532p21564410.html
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.


Re: experience sharing: select query returns more records than necessary

From
"Dave Cramer"
Date:


On Tue, Jan 20, 2009 at 9:43 AM, Kent Tong <kent@cpttm.org.mo> wrote:



Oliver Jowett wrote:
>
> Dave Cramer wrote:
> Though, I would suggest that if you really only care about the first 50
> results, then put a LIMIT 50 in your query! The query planner may come
> up with a better plan if you do that, too ..
>
> If you want to "stream" the whole resultset, then fetchsize is certainly
> the way to do it.
>

Yeah, I was actually settling on "LIMIT 50" before coming across
setFetchSize().
However, the support for LIMIT varies from one DBMS to another. To keep
the code as portable as possible, I decided to go with setFetchSize().

Kent,

Check the difference in the plans. Using limit can invoke an index that might not otherwise be used. It can make orders of magnitude differences in these kinds of queries.

Dave