Thread: How can I make PostgreSQL to select another quey plan?
HI, dear all I'm trying to force the database to select another query plan for a query, rather than the optimal one. I changed query tuning related parameters, but the query plan stays the same. Although the costs for nodes changed correspondingly, What parameters can I modify to force the DB to select another plan? Any hints will be helpful. Thanks! -Yangyang
On Wed, Nov 30, 2011 at 10:51 AM, Yangyang <yangyangbm@gmail.com> wrote:
What is the database structure, query, and plan? What do you want to have changed?
HI, dear all
I'm trying to force the database to select another query plan for a
query, rather than the optimal one.
I changed query tuning related parameters, but the query plan stays
the same. Although the costs for nodes changed correspondingly,
What parameters can I modify to force the DB to select another plan?
Any hints will be helpful.
Sean
Thank you so much for the reply, Sean. 1. I created tables and populated data with DBT-3. 2. I'm using TPC-H Q4. I may also test more TPC-H queries. 3. The plan is like: Sort (cost=1219522013.16..1219522013.17 rows=1 width=16) Sort Key: orders.o_orderpriority -> HashAggregate (cost=1219521913.15..1219522013.15 rows=1 width=16) -> Seq Scan on orders (cost=0.00..1219521627.24 rows=57181 width=16) Filter: ((o_orderdate >= '1993-07-01'::date) AND (o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone) AND (subplan)) SubPlan -> Index Scan using lineitem_pkey on lineitem (cost=0.00..28502.87 rows=93 width=112) Index Cond: (l_orderkey = $0) Filter: (l_commitdate < l_receiptdate) (9 rows) 4. My question is, is it possible for me to modify some parameters, therefore the query plan is changed? I have tried some in the postgres.conf file, but no matter which one I change, the query plan is the same. 5. My assumption is, if one type of resource is treated more valuable, it would affect how the DB evaluate costs and thus may choose a different query plan. Thank you so much for your time and help. -Yangyang On Wed, Nov 30, 2011 at 10:57 AM, Sean Davis <sdavis2@mail.nih.gov> wrote: > > > On Wed, Nov 30, 2011 at 10:51 AM, Yangyang <yangyangbm@gmail.com> wrote: >> >> HI, dear all >> I'm trying to force the database to select another query plan for a >> query, rather than the optimal one. >> I changed query tuning related parameters, but the query plan stays >> the same. Although the costs for nodes changed correspondingly, >> What parameters can I modify to force the DB to select another plan? >> Any hints will be helpful. > > > What is the database structure, query, and plan? What do you want to have > changed? > > Sean
Yangyang <yangyangbm@gmail.com> writes: > I'm trying to force the database to select another query plan for a > query, rather than the optimal one. > I changed query tuning related parameters, but the query plan stays > the same. Although the costs for nodes changed correspondingly, > What parameters can I modify to force the DB to select another plan? There are the enable_xxx flags, or if the problem is a bad join order you could reduce the join_collapse_limit setting to 1 and use JOIN syntax to specify the join order. If you'd like it to not use a particular index you could transiently drop that index: begin; drop index ...; explain ...; rollback; I recently rewrote the manual's discussion of using explain, so you might care to read this: http://developer.postgresql.org/pgdocs/postgres/using-explain.html although it does refer to a couple of features that don't exist in production releases yet. regards, tom lane
Thanks, Tom. This helps. If I don't change the database structures (like index), and only modify "Planner cost constants", is it possible to make DB change the query plan? I've tried, but looks like only chaning these constants don't affect the query plan. Or maybe I'm wrong. I appreciate your time and help. -Yangyang On Wed, Nov 30, 2011 at 11:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yangyang <yangyangbm@gmail.com> writes: >> I'm trying to force the database to select another query plan for a >> query, rather than the optimal one. >> I changed query tuning related parameters, but the query plan stays >> the same. Although the costs for nodes changed correspondingly, >> What parameters can I modify to force the DB to select another plan? > > There are the enable_xxx flags, or if the problem is a bad join order > you could reduce the join_collapse_limit setting to 1 and use JOIN > syntax to specify the join order. If you'd like it to not use a > particular index you could transiently drop that index: > > begin; > drop index ...; > explain ...; > rollback; > > I recently rewrote the manual's discussion of using explain, so > you might care to read this: > http://developer.postgresql.org/pgdocs/postgres/using-explain.html > although it does refer to a couple of features that don't exist in > production releases yet. > > regards, tom lane
Yangyang <yangyangbm@gmail.com> writes: > If I don't change the database structures (like index), and only > modify "Planner cost constants", is it possible to make DB change the > query plan? Yes, but usually these don't have huge impacts on estimated costs, so you will only be able to get it to switch to plans that (it thinks) are relatively close in cost anyway. A bigger question is what sort of plan you think would be better, and whether the planner is even capable of building that plan from the given query. You didn't show the particular query you're concerned about, but I gather from the EXPLAIN output that it involves sub-selects, and those can act as optimization fences ... regards, tom lane
Thank you for the advice, Tom. My proposal is like this: Suppose for the same query, Plan A consumes 10 units of I/O and 50 units of CPU. Plan B consumes 20 units of I/O and 40 units of CPU. If A has less total cost than B, A will be selected as best plan. If the database has less CPU available, I would prefer it to select Plan B, which consumes less units of CPU. I tried to limit the CPU to DB and modify cpu_tupe_cost, but no good. I'm wondering if I need a particular query to do so. The query I used is : select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date('1998-12-01') - interval '90 days' group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus; Hope this time I made it clear. I apologize for any inconvenience caused by my post. On Wed, Nov 30, 2011 at 11:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yangyang <yangyangbm@gmail.com> writes: >> If I don't change the database structures (like index), and only >> modify "Planner cost constants", is it possible to make DB change the >> query plan? > > Yes, but usually these don't have huge impacts on estimated costs, > so you will only be able to get it to switch to plans that (it thinks) > are relatively close in cost anyway. > > A bigger question is what sort of plan you think would be better, and > whether the planner is even capable of building that plan from the > given query. You didn't show the particular query you're concerned > about, but I gather from the EXPLAIN output that it involves > sub-selects, and those can act as optimization fences ... > > regards, tom lane
Yangyang <yangyangbm@gmail.com> writes: > My proposal is like this: > Suppose for the same query, Plan A consumes 10 units of I/O and 50 > units of CPU. Plan B consumes 20 units of I/O and 40 units of CPU. > If A has less total cost than B, A will be selected as best plan. > If the database has less CPU available, I would prefer it to select > Plan B, which consumes less units of CPU. No doubt, but the question I was asking is whether you have a clear idea of what Plan B is and whether the Postgres query planner can even generate that plan from your query. > The query I used is : > select > l_returnflag, > l_linestatus, > sum(l_quantity) as sum_qty, > sum(l_extendedprice) as sum_base_price, > sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, > sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, > avg(l_quantity) as avg_qty, > avg(l_extendedprice) as avg_price, > avg(l_discount) as avg_disc, > count(*) as count_order > from > lineitem > where > l_shipdate <= date('1998-12-01') - interval '90 days' > group by > l_returnflag, > l_linestatus > order by > l_returnflag, > l_linestatus; Hmm ... after looking up the TPC-H spec, that seems to correspond to Q1, whereas earlier you said you were using Q4. Since your plan shows a sub-select, and Q1 contains no sub-select but Q4 does, I'm going to suppose you meant Q4. Now the next question is why you're getting a subplan at all. I'd expect Postgres 8.4 and up to recognize Q4's WHERE EXISTS(SELECT ...) as a semi-join, rather than implementing it as an unoptimized sub-select which is what your plan is showing. If I'm right in deducing that you are testing an ancient version of Postgres, then the first step to improvement is to get onto a modern version. The unoptimized sub-select framework does not provide very much room at all for alternative plans --- it's basically always going to amount to a nestloop join. Which is probably why you're not getting anywhere by fooling with the planner cost settings. regards, tom lane