Re: [HACKERS] Perfomance bug in v10 - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] Perfomance bug in v10 |
Date | |
Msg-id | 14902.1496365989@sss.pgh.pa.us Whole thread Raw |
In response to | Re: [HACKERS] Perfomance bug in v10 (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: [HACKERS] Perfomance bug in v10
|
List | pgsql-hackers |
David Rowley <david.rowley@2ndquadrant.com> writes: > On 1 June 2017 at 04:16, Teodor Sigaev <teodor@postgrespro.ru> wrote: >> I found an example where v10 chooses extremely non-optimal plan: >> ... > This is all caused by get_variable_numdistinct() deciding that all > values are distinct because ntuples < DEFAULT_NUM_DISTINCT. Uh, no. I traced through this and found that with your hack in place, final_cost_nestloop was costing the desired nestloop paths at less than they were costed in HEAD. That makes no sense: surely, accounting for the fact that the executor might stop early should never result in a higher cost estimate than ignoring that possibility does. After some navel-gazing I realized that there is an ancient oversight in final_cost_nestloop's cost estimate for semi/anti-join cases. To wit, in its attempt to ensure that it always charges inner_run_cost at least once, it may end up charging that twice. Specifically, what we get in this case is outer_path_rows = 1, outer_matched_rows = 0 (implying one unmatched outer row) which will cause the existing logic to add both inner_run_cost and inner_rescan_run_cost to the cost estimate, as if we needed to run the inner plan twice not once. Correcting that, as in the attached draft patch, fixes Teodor's example. Now, this patch also causes some changes in the regression test outputs that are a bit like your patch's side-effects, but on close inspection I am not at all convinced that these changes are wrong. As an example, looking at the first change without "costs off": regression=# explain (verbose) select * from j1 inner join (select distinct id from j3) j3 on j1.id = j3.id; QUERY PLAN --------------------------------------------------------------------------- Nested Loop (cost=1.03..2.12 rows=1 width=8) Output: j1.id, j3.id Inner Unique: true Join Filter: (j1.id = j3.id) -> Unique (cost=1.03..1.04 rows=1 width=4) Output: j3.id -> Sort (cost=1.03..1.03 rows=2 width=4) Output: j3.id Sort Key: j3.id -> Seq Scan on public.j3 (cost=0.00..1.02 rows=2 width=4) Output: j3.id -> Seq Scan on public.j1 (cost=0.00..1.03 rows=3 width=4) Output: j1.id (13 rows) Note that both sides of this join are known unique, so that we'd produce an inner-unique-true join from either direction of joining. I don't think it's so insane to put the larger rel on the inside, because with a size-1 rel on the inside, there is nothing to be gained from stop-early behavior. Moreover, this way we don't need a Materialize node (since we're not predicting the inner to be scanned more than once anyway). So the fact that this way is estimated to be cheaper than the other way is not that surprising after all. Yeah, it's a bit brittle in the face of the outer rel producing more rows than we expect, but that's true of every nestloop join plan we ever make. Fixing that is not a task for v10. Teodor, could you check if this patch fixes your real-world problem? regards, tom lane diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index cdb18d9..324c8c1 100644 *** a/src/backend/optimizer/path/costsize.c --- b/src/backend/optimizer/path/costsize.c *************** final_cost_nestloop(PlannerInfo *root, N *** 2287,2306 **** * difficult to estimate whether that will happen (and it could * not happen if there are any unmatched outer rows!), so be * conservative and always charge the whole first-scan cost once. */ run_cost += inner_run_cost; /* Add inner run cost for additional outer tuples having matches */ ! if (outer_matched_rows > 1) ! run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac; ! ! /* Add inner run cost for unmatched outer tuples */ ! run_cost += (outer_path_rows - outer_matched_rows) * ! inner_rescan_run_cost; ! /* And count the unmatched join tuples as being processed */ ! ntuples += (outer_path_rows - outer_matched_rows) * ! inner_path_rows; } } else --- 2287,2317 ---- * difficult to estimate whether that will happen (and it could * not happen if there are any unmatched outer rows!), so be * conservative and always charge the whole first-scan cost once. + * We consider this charge to correspond to the first unmatched + * outer row, unless there isn't one in our estimate, in which + * case blame it on the first matched row. */ + double outer_unmatched_rows; + + outer_unmatched_rows = outer_path_rows - outer_matched_rows; + + /* While at it, count unmatched join tuples as being processed */ + ntuples += outer_unmatched_rows * inner_path_rows; + + /* Now add the forced full scan, and decrement appropriate count */ run_cost += inner_run_cost; + if (outer_unmatched_rows >= 1) + outer_unmatched_rows -= 1; + else + outer_matched_rows -= 1; /* Add inner run cost for additional outer tuples having matches */ ! if (outer_matched_rows > 0) ! run_cost += outer_matched_rows * inner_rescan_run_cost * inner_scan_frac; ! /* Add inner run cost for additional unmatched outer tuples */ ! if (outer_unmatched_rows > 0) ! run_cost += outer_unmatched_rows * inner_rescan_run_cost; } } else diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index d08b1e1..9cad4e6 100644 *** a/src/test/regress/expected/join.out --- b/src/test/regress/expected/join.out *************** select * from j1 natural join j2; *** 5476,5523 **** explain (verbose, costs off) select * from j1 inner join (select distinct id from j3) j3 on j1.id = j3.id; ! QUERY PLAN ! ----------------------------------------------- Nested Loop Output: j1.id, j3.id Inner Unique: true Join Filter: (j1.id = j3.id) ! -> Seq Scan on public.j1 ! Output: j1.id ! -> Materialize Output: j3.id ! -> Unique Output: j3.id ! -> Sort Output: j3.id ! Sort Key: j3.id ! -> Seq Scan on public.j3 ! Output: j3.id ! (15 rows) -- ensure group by clause allows the inner to become unique explain (verbose, costs off) select * from j1 inner join (select id from j3 group by id) j3 on j1.id = j3.id; ! QUERY PLAN ! ----------------------------------------------- Nested Loop Output: j1.id, j3.id Inner Unique: true Join Filter: (j1.id = j3.id) ! -> Seq Scan on public.j1 ! Output: j1.id ! -> Materialize Output: j3.id ! -> Group Output: j3.id ! Group Key: j3.id ! -> Sort Output: j3.id ! Sort Key: j3.id ! -> Seq Scan on public.j3 ! Output: j3.id ! (16 rows) drop table j1; drop table j2; --- 5476,5519 ---- explain (verbose, costs off) select * from j1 inner join (select distinct id from j3) j3 on j1.id = j3.id; ! QUERY PLAN ! ----------------------------------------- Nested Loop Output: j1.id, j3.id Inner Unique: true Join Filter: (j1.id = j3.id) ! -> Unique Output: j3.id ! -> Sort Output: j3.id ! Sort Key: j3.id ! -> Seq Scan on public.j3 Output: j3.id ! -> Seq Scan on public.j1 ! Output: j1.id ! (13 rows) -- ensure group by clause allows the inner to become unique explain (verbose, costs off) select * from j1 inner join (select id from j3 group by id) j3 on j1.id = j3.id; ! QUERY PLAN ! ----------------------------------------- Nested Loop Output: j1.id, j3.id Inner Unique: true Join Filter: (j1.id = j3.id) ! -> Group Output: j3.id ! Group Key: j3.id ! -> Sort Output: j3.id ! Sort Key: j3.id ! -> Seq Scan on public.j3 Output: j3.id ! -> Seq Scan on public.j1 ! Output: j1.id ! (14 rows) drop table j1; drop table j2; *************** inner join j2 on j1.id1 = j2.id1 and j1. *** 5558,5570 **** Output: j1.id1, j1.id2, j2.id1, j2.id2 Inner Unique: true Join Filter: ((j1.id1 = j2.id1) AND (j1.id2 = j2.id2)) -> Seq Scan on public.j1 Output: j1.id1, j1.id2 ! -> Materialize ! Output: j2.id1, j2.id2 ! -> Seq Scan on public.j2 ! Output: j2.id1, j2.id2 ! (10 rows) -- ensure we don't detect the join to be unique when quals are not part of the -- join condition --- 5554,5564 ---- Output: j1.id1, j1.id2, j2.id1, j2.id2 Inner Unique: true Join Filter: ((j1.id1 = j2.id1) AND (j1.id2 = j2.id2)) + -> Seq Scan on public.j2 + Output: j2.id1, j2.id2 -> Seq Scan on public.j1 Output: j1.id1, j1.id2 ! (8 rows) -- ensure we don't detect the join to be unique when quals are not part of the -- join condition -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: