The rewritting of join conditions caused a very slow query plan. - Mailing list pgsql-hackers
From | chang chao |
---|---|
Subject | The rewritting of join conditions caused a very slow query plan. |
Date | |
Msg-id | SG2PR06MB1149010B3687EB4743783DDF84770@SG2PR06MB1149.apcprd06.prod.outlook.com Whole thread Raw |
Responses |
Re: The rewritting of join conditions caused a very slow query plan.
|
List | pgsql-hackers |
<div id="divtagdefaultwrapper" style="font-size:12pt; color:#000000; background-color:#FFFFFF; font-family:Calibri,Arial,Helvetica,sans-serif"><p>Hi,all<p><br/> I have a query that is very slow,and the reason may bein the rewritting of join conditions.<br /><br /> this is the simplied version table and the key part of the sql.<br /><br/> level1_table and level2_table hold the tree data nodes,<br /> and all_level_status table holds the current statusall all nodes of all levels.<br /> (I know that there would be much less trouble in performance if all_level_statuswas divided into two tables,namely,level1_status and level2_status tables.)<br /><br /> table1: level1_table<br/> level1_no PK:serial <br /> level1_node_name :varchar<br /><br /> table2:level2_table<br /> level2_no PK:serial <br /> parent_no FK to level1_table.level1_no<br /> level2_node_name :varchar<br /><br /> table3:all_level_status<br /> level:1 OR 2 PK1<br /> node_no:level1_table.level1_no or level2_table.level2_no PK2<br/> status:0 OR 1(normal or abnormal)<br /><br /><br /> The sql to find all level2 nodes whose parent level nodesare in normal status.<br /><br /> explain analyze<br /> select * from level2_table l2<br /> join (<br /> select l1.*from level1_table l1<br /> join all_level_status als on (als.level=1 and als.node_no=l1.level1_no)<br /> where als.status=0<br/> ) normal_l1 on l2.parent_no=normal_l1.level1_no;<br /><br /><br /> this is the query plan .<p><br /> "MergeJoin (cost=3.38..5.13 rows=3 width=158) (actual time=0.087..0.179 rows=21 loops=1)"<br /> " Merge Cond: (als.node_no= l2.parent_no)"<br /> " -> Merge Join (cost=1.63..7.66 rows=19 width=80) (actual time=0.067..0.126 rows=18loops=1)"<br /> " Merge Cond: (als.node_no = l1.level1_no)"<br /> " -> Index Scan using all_level_status_pkeyon all_level_status als (cost=0.00..21.74 rows=19 width=4) (actual time=0.037..0.079 rows=18 loops=1)"<br/> " Index Cond: (level = 1)"<br /> " Filter: (status = 0)"<br /> " -> Sort (cost=1.63..1.68 rows=20 width=76) (actual time=0.026..0.026 rows=20 loops=1)"<br /> " Sort Key: l1.level1_no"<br/> " Sort Method: quicksort Memory: 27kB"<br /> " -> Seq Scan on level1_tablel1 (cost=0.00..1.20 rows=20 width=76) (actual time=0.005..0.009 rows=20 loops=1)"<br /> " -> Sort (cost=1.75..1.81rows=23 width=82) (actual time=0.016..0.024 rows=23 loops=1)"<br /> " Sort Key: l2.parent_no"<br />" Sort Method: quicksort Memory: 28kB"<br /> " -> Seq Scan on level2_table l2 (cost=0.00..1.23 rows=23width=82) (actual time=0.003..0.005 rows=23 loops=1)"<br /> "Total runtime: 0.307 ms"<br /><br /><br /> Please notethat,join condition of query plan line 2 is rewritten to "als.node_no = l2.parent_no"<br /> level1 and level2 nodes areof the 1:n relationship,and because all_level_status.node_no represents different things(level1_table.level1_no and level2_table.level2_nouses separate serials),so when this rewriting is applied,the statistics of mcvs of all_level_status.node_noand level2.parent_no will be used to do the row selectivity,as can be anticipated,a large gap occurredbetween actual rows and estimated rows.<br /><br /> the above sql is one simplified part of a long sql,because ofthis gap,the estimated row count becomes 1 in the outer sub-query,which in actual has large number of values,<br /> thevery slow nested-loop join is selected.<br /><br /> Had the rewriting of the join condition not be done,maybe a much fastquery plan would be selected.<br /><br /> So I'm wondering what is the reason behind the join condition rewriting,<br/> Is it just because that join conditions that both left and right side have mcvs are preferable to those inwhich there are no mcvs on both sides?<br /><p><br /><p>Chao.<br /><br /></div>
pgsql-hackers by date: