Thread: Why is outer Join way quicker?
Hi. We have found that by using an Outer Join it speeds up this query considerably. Why is that? According to the explain plan the hold up has to do with a single nexted loop and a "Materilize"? This makes no sense to me. Can some one explain. Thanks. -David The difference between these two queries is the first uses an outer join between u and t, while the second (the slower) uses a straight forward join). -- Query 1: SELECT u.upc, t.title, tot.ytd, tot.rtd FROM upc u LEFT OUTER JOIN title t ON t.tcode = u.tcode LEFT OUTER JOIN total tot ON tot.tcode = t.tcode AND tot.week = 200210 AND tot.region='TOTAL' WHERE u.upc LIKE '%0085392227%' ORDER BY title ASC LIMIT 500 OFFSET 0; -- Query 1: Actual Time: 1 second 0.03user 0.00system 0:00.96elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (576major+168minor)pagefaults 0swaps -- Query 1: QUERY PLAN: Limit (cost=2801.10..2801.10 rows=1 width=108) -> Sort (cost=2801.10..2801.10 rows=1 width=108) -> Nested Loop (cost=0.00..2801.09 rows=1 width=108) -> Nested Loop (cost=0.00..2796.06 rows=1 width=48) -> Seq Scan on upc u (cost=0.00..2791.43 rows=1 width=24) -> Index Scan using title_pkey on title t (cost=0.00..4.63 rows=1 width=24) -> Index Scan using total_week_tcode_ind on total tot (cost=0.00..5.01 rows=1 width=60) -- ======================================= -- Query 2 explain SELECT u.upc, t.title, tot.ytd, tot.rtd FROM upc u, title t left outer join total tot ON tot.tcode = t.tcode AND tot.week = 200210 AND tot.region='TOTAL' WHERE t.tcode = u.tcode and u.upc LIKE '%0085392227%' ORDER BY title ASC LIMIT 500 OFFSET 0; -- Query 2: Actual Time: 36 SECONDS!!! 0.01user 0.02system 0:35.33elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (576major+168minor)pagefaults 0swaps Limit (cost=541352.96..541352.96 rows=1 width=108) -> Sort (cost=541352.96..541352.96 rows=1 width=108) -> Nested Loop (cost=0.00..541352.95 rows=1 width=108) -> Seq Scan on upc u (cost=0.00..2791.43 rows=1 width=24) -> Materialize (cost=537241.84..537241.84 rows=105575 width=84) -> Nested Loop (cost=0.00..537241.84 rows=105575 width=84) -> Seq Scan on title t (cost=0.00..6228.75 rows=105575 width=24) -> Index Scan using total_week_tcode_ind on total tot (cost=0.00..5.01 rows=1 width=60) -- Output (from both Query 1 and Query 2): upc | title | ytd | rtd --------------+--------------------------------+--------+--------- 008539222773 | CATS AND DOGS | 65240 | 1080103 008539222772 | CATS AND DOGS-2001-PS | 47683 | 480374 008539222793 | CATS & DOGS | | 008539222753 | MATRIX/MATRIX REVISITED 2-PACK | 299 | 1395 008539222783 | SWORDFISH | 27992 | 234049 008539222782 | SWORDFISH | 136727 | 987219 (6 rows)
On Tue, 23 Apr 2002, David Link wrote: > Hi. > > We have found that by using an Outer Join it speeds up this query > considerably. > > Why is that? > > According to the explain plan the hold up has to do with a single nexted > loop and a "Materilize"? This makes no sense to me. Can some one > explain. > > Thanks. -David > > The difference between these two queries is the first uses an outer join > between u and t, while the second (the slower) uses a straight forward > join). See: http://www.postgresql.org/idocs/index.php?explicit-joins.html You can probably use an inner join as well, you just need to use the explicit join syntax.
Stephan Szabo wrote: > > On Tue, 23 Apr 2002, David Link wrote: > > > Hi. > > > > We have found that by using an Outer Join it speeds up this query > > considerably. > > > > Why is that? > > > > According to the explain plan the hold up has to do with a single nexted > > loop and a "Materilize"? This makes no sense to me. Can some one > > explain. > > > > Thanks. -David > > > > The difference between these two queries is the first uses an outer join > > between u and t, while the second (the slower) uses a straight forward > > join). > > See: > http://www.postgresql.org/idocs/index.php?explicit-joins.html > > You can probably use an inner join as well, you just need to use > the explicit join syntax. OK. I read that and it helps explain this behavior (new to 7.1) somewhat. Thank you. However, it also says that the planner spends some time figuring out which method of joining is best (a join (b join c)) vs. ((a join b) join c), etc. It says the user should not have to worry about explicitly mapping the join order when few tables are involved. In this case I am using three tables, though fairly large. See explain plans. You'll notice the explain plans show a difference of magnatude in cost. If the planner is checking those kind of things, it should pick up on this and choose the better join order. Thanks. David
On Wed, 24 Apr 2002, David Link wrote: > Stephan Szabo wrote: > > > > See: > > http://www.postgresql.org/idocs/index.php?explicit-joins.html > > > > You can probably use an inner join as well, you just need to use > > the explicit join syntax. > > OK. I read that and it helps explain this behavior (new to 7.1) > somewhat. Thank you. > > However, it also says that the planner spends some time figuring out > which method of joining is best (a join (b join c)) vs. ((a join b) join > c), etc. It says the user should not have to worry about explicitly > mapping the join order when few tables are involved. In this case I am > using three tables, though fairly large. See explain plans. > > You'll notice the explain plans show a difference of magnatude in cost. > If the planner is checking those kind of things, it should pick up on > this and choose the better join order. You are explicitly mapping the join order in both queries. I'm not 100% sure the page mentions, but I think it does... If you do a,b OUTER JOIN c, postgres takes that as a request to explicitly join b and c before any other joins with those tables, thus any plan that joins a to b or c before the outer join is not considered. The thing for explicitly mapping join order is because with a large number of tables the decision making process can be complicated and take a fair amount of time, so you can prune the number of possible join orders with explicitly using the sql92 join syntax.
Very Clear. It makes sense. Thanks Stephan. Stephan Szabo wrote: > > On Wed, 24 Apr 2002, David Link wrote: > > > Stephan Szabo wrote: > > > > > > See: > > > http://www.postgresql.org/idocs/index.php?explicit-joins.html > > > > > > You can probably use an inner join as well, you just need to use > > > the explicit join syntax. > > > > OK. I read that and it helps explain this behavior (new to 7.1) > > somewhat. Thank you. > > > > However, it also says that the planner spends some time figuring out > > which method of joining is best (a join (b join c)) vs. ((a join b) join > > c), etc. It says the user should not have to worry about explicitly > > mapping the join order when few tables are involved. In this case I am > > using three tables, though fairly large. See explain plans. > > > > You'll notice the explain plans show a difference of magnatude in cost. > > If the planner is checking those kind of things, it should pick up on > > this and choose the better join order. > > You are explicitly mapping the join order in both queries. I'm not 100% > sure the page mentions, but I think it does... If you do > a,b OUTER JOIN c, postgres takes that as a request to explicitly join > b and c before any other joins with those tables, thus any plan that > joins a to b or c before the outer join is not considered. The thing for > explicitly mapping join order is because with a large number of tables > the decision making process can be complicated and take a fair amount of > time, so you can prune the number of possible join orders with explicitly > using the sql92 join syntax.