Thread: large query by offset and limit
Hi, I am running a database behind a webserver and there is a table
which is huge. I need to pull data from this table and send it to the
user through HTTP. If I use

select * from huge_table where userid = 100

it will return millions of records, which exhausts my server's memory.
So I do this:

select * from huge_table where userid = 100 limit 1000 offset 0

and then send the results to the user, then

select * from huge_table where userid = 100 limit 1000 offset 1000

and then send the results to the user, then

select * from huge_table where userid = 100 limit 1000 offset 2000

and then send the results to the user, and continue until there are no
records left.

It runs fine but it is kind of slow. I think it is because, even though
I need only 1000 records, the query searches the whole table every time.
Is there a better way to do this?

Thank you.

ff
On May 2, 2008, at 2:01 PM, finecur wrote:

> Hi, I am running a database behind a webserver and there is a table
> which is huge. I need to pull data from this table and send it to the
> user through HTTP. If I use
>
> select * from huge_table where userid = 100
>
> it will return millions of records, which exhausts my server's memory.
> So I do this:
>
> select * from huge_table where userid = 100 limit 1000 offset 0
>
> and then send the results to the user, then
>
> select * from huge_table where userid = 100 limit 1000 offset 1000
>
> and then send the results to the user, then
>
> select * from huge_table where userid = 100 limit 1000 offset 2000
>
> and then send the results to the user, and continue until there are no
> records left.
>
> It runs fine but it is kind of slow. I think it is because, even though
> I need only 1000 records, the query searches the whole table every time.

Not quite - if you do a "limit 1000 offset 5000" it'll stop after
retrieving the first 6000 rows from the table. A bigger problem with
doing it this way is that the results aren't particularly well defined
unless there's an ORDER BY clause in the query.

> Is there a better way to do this?

You want a cursor. See

http://www.postgresql.org/docs/8.3/static/sql-declare.html

Cheers,
  Steve
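P.S. If you do stay with LIMIT/OFFSET, give the query a deterministic
ordering. As a sketch, assuming huge_table has a unique id column
(substitute whatever key your table actually has):

    SELECT *
    FROM huge_table
    WHERE userid = 100
    ORDER BY id
    LIMIT 1000 OFFSET 2000;

Each page is then stable between requests, but the server still has to
read past the first OFFSET + LIMIT matching rows, which is why the
cursor approach tends to be faster for deep pages.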
finecur wrote:
> Hi, I am running a database behind a webserver and there is a table
> which is huge. I need to pull data from this table and send it to the
> user through HTTP. If I use
>
> select * from huge_table where userid = 100
>
> it will return millions of records, which exhausts my server's memory.
>
> Is there a better way to do this?

Consider using database cursors. Set up by beginning a transaction and
issuing the statement:

DECLARE huge_table_curs CURSOR FOR
    SELECT * FROM huge_table WHERE userid = 100;

... then to get results, just execute:

FETCH 1000 FROM huge_table_curs;

--
Craig Ringer
Thank you very much. Could you show me how to do it in JDBC?

Craig Ringer wrote:
> finecur wrote:
> > Hi, I am running a database behind a webserver and there is a table
> > which is huge. I need to pull data from this table and send it to the
> > user through HTTP. If I use
> >
> > select * from huge_table where userid = 100
> >
> > it will return millions of records, which exhausts my server's memory.
> >
> > Is there a better way to do this?
>
> Consider using database cursors. Set up by beginning a transaction and
> issuing the statement:
>
> DECLARE huge_table_curs CURSOR FOR
>     SELECT * FROM huge_table WHERE userid = 100;
>
> ... then to get results, just execute:
>
> FETCH 1000 FROM huge_table_curs;
>
> --
> Craig Ringer
Ge Cong wrote:
> Thank you very much. Could you show me how to do it in JDBC?

Here's one example. As I haven't been using JDBC directly, it's probably
horrible, but it'll do the job. Any exception will terminate this
example, but in practice you'd want to catch and handle exceptions
appropriately. Sorry about the ugly formatting - mail client line
wrapping and all.

The example uses a dummy "customer" table, scrolling through it in
chunks of 1000 records and printing the primary key `id' for each
record.

----
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {

    private static final int BATCH_SIZE = 1000;

    public static void main(String[] args)
            throws ClassNotFoundException, SQLException {

        // Load the JDBC driver
        Class.forName("org.postgresql.Driver");

        // Initialize a read-only connection. Auto-commit must be off so
        // the cursor stays open across the FETCH calls; it only lives
        // inside a single transaction.
        Connection c = DriverManager.getConnection(
                "jdbc:postgresql:DBNAME", "USERNAME", "PASSWORD");
        c.setReadOnly(true);
        c.setAutoCommit(false);

        // Declare an open cursor attached to a query for the
        // desired information
        Statement s = c.createStatement();
        s.execute("DECLARE customer_curs CURSOR FOR"
                + " SELECT id FROM customer");

        // ... and fetch BATCH_SIZE records from the cursor until fewer
        // than the requested number of records are returned (i.e. until
        // we've run out of results).
        int nresults;
        do {
            nresults = 0;
            s.execute("FETCH " + BATCH_SIZE + " FROM customer_curs");
            ResultSet rs = s.getResultSet();
            while (rs.next()) {
                nresults++;
                // Do something with the current record at `rs'
                System.out.println("CustomerID: " + rs.getString(1));
            }
        } while (nresults == BATCH_SIZE);

        // Clean up. Closing the connection ends the transaction and
        // releases the cursor.
        c.close();
    }
}
----
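In case it helps to see it without the JDBC plumbing, the SQL the
example boils down to is roughly this (a sketch only; closing the
connection takes care of the last two statements for you):

BEGIN;  -- implicit here, since autocommit is off

DECLARE customer_curs CURSOR FOR
    SELECT id FROM customer;

FETCH 1000 FROM customer_curs;
FETCH 1000 FROM customer_curs;
-- ... repeat until a FETCH returns fewer than 1000 rows ...

CLOSE customer_curs;
COMMIT;

The cursor only exists for the life of that transaction (unless you
declare it WITH HOLD), so keep the transaction open while you're still
fetching.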