Thread: Cursor-based results: bafflingly slow
Dear list, I'm using setFetchSize(100) to stream results from the server, and I'm really puzzled by how slow some queries are to execute. I don't think I'm doing anything obviously stupid: I've setAutoCommit(false) and so forth. For example, here's part of the output from loglevel=2: 13:03:08.871 (2) FE=> Parse(stmt=S_9,query="select polypeptide.feature_id , featureprop_type_cv.name as type_cv , featureprop_type.name as type , featureprop.value from feature polypeptide join featureprop on featureprop.feature_id = polypeptide.feature_id join cvterm featureprop_type on featureprop.type_id = featureprop_type.cvterm_id join cv featureprop_type_cv on featureprop_type.cv_id = featureprop_type.cv_id where polypeptide.type_id in ( 191 ) and polypeptide.organism_id = ( select organism_id from organism where common_name = $1 ) order by polypeptide.feature_id",oids={1043}) 13:03:08.877 (2) FE=> Bind(stmt=S_9,portal=C_10,$1=<Etenella>) 13:03:08.878 (2) FE=> Describe(portal=C_10) 13:03:08.878 (2) FE=> Execute(portal=C_10,limit=100) 13:03:08.878 (2) FE=> Sync (The database schema is Chado – http://gmod.org/wiki/Schema) It has been running for more than half an hour with no apparent response yet. If I try to do something equivalent in psql, I can't find any way to make it take more than a few seconds. For example, if I use "limit 100" to simulate the effect of fetching the first 100 results: prepare query as select polypeptide.feature_id , featureprop_type_cv.name as type_cv , featureprop_type.name as type , featureprop.value from feature polypeptide join featureprop on featureprop.feature_id = polypeptide.feature_id join cvterm featureprop_type on featureprop.type_id = featureprop_type.cvterm_id join cv featureprop_type_cv on featureprop_type.cv_id = featureprop_type.cv_id where polypeptide.type_id in ( 191 ) and polypeptide.organism_id = ( select organism_id from organism where common_name = $1 ) order by polypeptide.feature_id limit 100 ; explain analyze execute query('Etenella'); then I get: Limit (cost=1.95..42662.18 rows=100 width=612) (actual time=219.367..253.029 rows=100 loops=1) InitPlan -> Seq Scan on organism (cost=0.00..1.95 rows=1 width=4) (actual time=8.175..8.196 rows=1 loops=1) Filter: ((common_name)::text = $1) -> Nested Loop (cost=0.00..2245634.37 rows=5264 width=612) (actual time=219.358..252.312 rows=100 loops=1) -> Nested Loop (cost=0.00..2245341.09 rows=188 width=96) (actual time=186.960..218.784 rows=4 loops=1) -> Nested Loop (cost=0.00..1948637.05 rows=37696 width=56) (actual time=166.944..175.615 rows=4 loops=1) -> Index Scan using feature_genedb_idx1 on feature polypeptide (cost=0.00..108669.87 rows=32192 width=4) (actual time=66.732..66.732 rows=1 loops=1) Index Cond: ((organism_id = $0) AND (type_id = 191)) -> Index Scan using featureprop_idx1 on featureprop (cost=0.00..55.64 rows=121 width=56) (actual time=100.187..108.833 rows=4 loops=1) Index Cond: (featureprop.feature_id = polypeptide.feature_id) -> Index Scan using cvterm_pkey on cvterm featureprop_type (cost=0.00..7.86 rows=1 width=52) (actual time=10.766..10.771 rows=1 loops=4) Index Cond: (featureprop_type.cvterm_id = featureprop.type_id) Filter: (featureprop_type.cv_id = featureprop_type.cv_id) -> Seq Scan on cv featureprop_type_cv (cost=0.00..1.28 rows=28 width=516) (actual time=8.101..8.193 rows=25 loops=4) Total runtime: 270.637 ms Any suggestions? Thanks, Robin
Robin Houston wrote: > For example, if I use "limit 100" to simulate the effect of fetching > the first 100 results: They're not really equivalent. A portal with a limited fetchsize is more like DECLARE CURSOR, not LIMIT. What does the plan look like if you plan it without the LIMIT? -O
2009/7/3 Oliver Jowett <oliver@opencloud.com>:
> They're not really equivalent. A portal with a limited fetchsize is more
> like DECLARE CURSOR, not LIMIT.
>
> What does the plan look like if you plan it without the LIMIT?
Hmm, well... If I do the same as before, but without the limit, then the plan is:
> They're not really equivalent. A portal with a limited fetchsize is more
> like DECLARE CURSOR, not LIMIT.
>
> What does the plan look like if you plan it without the LIMIT?
Sort (cost=353822.56..353835.72 rows=5264 width=612)
Sort Key: polypeptide.feature_id
InitPlan
-> Seq Scan on organism (cost=0.00..1.95 rows=1 width=4)
-> Hash Join (cost=88663.00..351946.66 rows=188 width=96)
Hash Cond: (featureprop.feature_id = polypeptide.feature_id)
-> Hash Join (cost=2790.36..264059.63 rows=51268 width=96)
Hash Cond: (featureprop.type_id = featureprop_type.cvterm_id)
-> Seq Scan on featureprop (cost=0.00..222284.97 rows=10259097 width=56)
-> Hash (cost=2785.51..2785.51 rows=388 width=52)
-> Seq Scan on cvterm featureprop_type (cost=0.00..2785.51 rows=388 width=52)
Filter: (cv_id = cv_id)
-> Hash (cost=85344.24..85344.24 rows=32192 width=4)
-> Bitmap Heap Scan on feature polypeptide (cost=827.57..85344.24 rows=32192 width=4)
Recheck Cond: ((organism_id = $0) AND (type_id = 191))
-> Bitmap Index Scan on feature_genedb_idx1 (cost=0.00..819.52 rows=32192 width=0)
Index Cond: ((organism_id = $0) AND (type_id = 191))
-> Materialize (cost=1.31..1.59 rows=28 width=516)
-> Seq Scan on cv featureprop_type_cv (cost=0.00..1.28 rows=28 width=516)
Sort Key: polypeptide.feature_id
InitPlan
-> Seq Scan on organism (cost=0.00..1.95 rows=1 width=4)
Filter: ((common_name)::text = $1)
-> Nested Loop (cost=88664.31..352053.24 rows=5264 width=612)-> Hash Join (cost=88663.00..351946.66 rows=188 width=96)
Hash Cond: (featureprop.feature_id = polypeptide.feature_id)
-> Hash Join (cost=2790.36..264059.63 rows=51268 width=96)
Hash Cond: (featureprop.type_id = featureprop_type.cvterm_id)
-> Seq Scan on featureprop (cost=0.00..222284.97 rows=10259097 width=56)
-> Hash (cost=2785.51..2785.51 rows=388 width=52)
-> Seq Scan on cvterm featureprop_type (cost=0.00..2785.51 rows=388 width=52)
Filter: (cv_id = cv_id)
-> Hash (cost=85344.24..85344.24 rows=32192 width=4)
-> Bitmap Heap Scan on feature polypeptide (cost=827.57..85344.24 rows=32192 width=4)
Recheck Cond: ((organism_id = $0) AND (type_id = 191))
-> Bitmap Index Scan on feature_genedb_idx1 (cost=0.00..819.52 rows=32192 width=0)
Index Cond: ((organism_id = $0) AND (type_id = 191))
-> Seq Scan on cv featureprop_type_cv (cost=0.00..1.28 rows=28 width=516)
Surely the server is not so stupid as to use an unnecessary server-side sort, if I've explicitly asked to fetch the rows 100 at a time?
Is there any easy way to find out what execution plan the server is actually using? I suppose I could ask the DBAs to enable auto_explain on the dev database.
If this *is* the problem, do you know any way round it. Obviously I want the plan that *doesn't* involve sorting everything before returning anything.
If I explicitly create a cursor, with no limit, then it does return the first 100 rows very quickly. On the other hand, I have to hard-code the parameter this way (because I don't know any way, in psql, to use a bind variable with an explicit cursor.)
begin;
declare polypeptide_props_c no scroll cursor for [...query...]
fetch forward 100 from polypeptide_props_c;
rollback;
This fetch completes in 31.313 ms.
Thanks for your help so far!
Robin
Robin Houston wrote: > 2009/7/3 Oliver Jowett <oliver@opencloud.com <mailto:oliver@opencloud.com>>: >> They're not really equivalent. A portal with a limited fetchsize is more >> like DECLARE CURSOR, not LIMIT. >> >> What does the plan look like if you plan it without the LIMIT? > > Hmm, well... If I do the same as before, but without the limit, then the > plan is: [.. different plan with an outermost Sort step ..] > Surely the server is not so stupid as to use an unnecessary server-side > sort, if I've explicitly asked to fetch the rows 100 at a time? I'm not too familiar with the planner any more, but as I understand it this is what's happening: Think about the difference between (1) SELECT * FROM hugetable ORDER BY some_key; (2) SELECT * FROM hugetable ORDER BY some_key LIMIT 100; where there is a suitable index on some_key. For (1) a seqscan + local sort is likely to be faster (because a sequential scan is faster than doing a random-access index scan over the entire table) For (2) an index scan is likely to be faster, because you'll stop after at most 100 random-access lookups, which is cheaper than scanning the entire table. If you take case (1) and set the fetchsize to 100, that doesn't actually change things at all - it doesn't turn into case (2)! Portals let you grab results a bit at a time (and let you give up on the query halfway through easily), but they don't change the underlying plan at all; the server plans on the assumption that the query will run to completion. In fact, it doesn't even know the fetchsize that will eventually be used during execution when it is doing the planning step. (If you look back at your connection tracing, you'll see that the fetchsize is only provided in the Execute message; query planning happens back when the Parse message is sent) > Is there any easy way to find out what execution plan the server is > actually using? I suppose I could ask the DBAs to enable auto_explain on > the dev database. You should be able to issue an EXPLAIN ANALYZE via JDBC; I can't remember exactly how the results get back to you though (might be SQLWarnings, or a separate resultset) > If this *is* the problem, do you know any way round it. Obviously I want > the plan that *doesn't* involve sorting everything before returning > anything. If you want to stream results promptly, you may be stuck with removing that ORDER BY. I'm not sure if there's any way to say to the planner "give me a plan with a low startup cost, even if it has a higher total cost" which seems to be what you want here. Or if you really do only care about the first N rows, then put a LIMIT in your query. > If I explicitly create a cursor, with no limit, then it does return the > first 100 rows very quickly. On the other hand, I have to hard-code the > parameter this way (because I don't know any way, in psql, to use a bind > variable with an explicit cursor.) I came across some (very old) posts in the archives that suggested that DECLARE CURSOR has a fudge where it ranks plan costs assuming that only some rows (10%?) will actually be fetched. I'm not sure if that's true of the current planner. -O
Oliver Jowett <oliver@opencloud.com> writes: > If you want to stream results promptly, you may be stuck with removing > that ORDER BY. I'm not sure if there's any way to say to the planner > "give me a plan with a low startup cost, even if it has a higher total > cost" which seems to be what you want here. You need to use a cursor for that. The problem with the fetchsize mechanism is that it doesn't give the planner any clue that the user might not intend to fetch the whole result, or might prefer low startup cost to low total cost anyway. > I came across some (very old) posts in the archives that suggested that > DECLARE CURSOR has a fudge where it ranks plan costs assuming that only > some rows (10%?) will actually be fetched. I'm not sure if that's true > of the current planner. Exactly. As of 8.4 you can control the percentage, too. regards, tom lane
2009/7/3 Tom Lane <tgl@sss.pgh.pa.us>
You need to use a cursor for that. The problem with the fetchsizemechanism is that it doesn't give the planner any clue that the user
might not intend to fetch the whole result, or might prefer low startup
cost to low total cost anyway.
Thank you very much (and Oliver too). That's what I needed to know. I'll change my code to use explicit cursors.
Robin