Safe outer-join orderings - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Safe outer-join orderings |
Date | |
Msg-id | 13016.1171313026@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: Safe outer-join orderings
|
List | pgsql-hackers |
I looked into Ian Harding's problem reported here: http://archives.postgresql.org/pgsql-general/2007-02/msg00530.php It's possible to reproduce the problem in a simplified form in the regression database, using this query: explain select * from tenk1 a left join (tenk1 b inner join (tenk1 c left join tenk1 d using (unique1)) using (unique1))using (unique1); It works OK with default settings, but after setting join_collapse_limit = 2 it fails with "failed to build any 2-way joins". The reason is that 8.2's new logic for rearranging outer joins is getting confused. In this scenario we compute min_lefthand = (A) and min_righthand = (B C) for the upper outer join, while the lower outer join has min_lefthand = (C) and min_righthand = (D). Normally the upper's min_righthand would be (B C D) which would force joining in the syntactic order, but here we have been able to prove we can commute the two outer joins, so we exclude D from the upper min_righthand. This situation should allow several different join orders, including the syntactic ordering of (A (B (C D))). The problem is that make_join_rel() is being too strict. As explained in the planner README: : So the restriction enforced by make_join_rel is that a proposed join : can't join across a RHS boundary (ie, join anything inside the RHS : to anything else) unless the join validly implements some outer join. So we can form (C D) because it's legal according to the lower OJ, but when we consider forming (B (C D)) we conclude that it crosses the upper one's RHS and yet is not an outer join, and reject it. Normally this mistake is masked because the planner also considers ((B C) D) and decides that's legal; but if join_collapse_limit is too small then that path isn't explored and we end up not finding any legal join path. Ian's original problem query is complicated enough to expose the problem even at the default join_collapse_limit of 8. Quite aside from the possibility of failure, this is bad because we might be excluding the most efficient join order. As best I can see, it would be allowable for make_join_rel to accept an inner join that overlaps but isn't contained in an OJ RHS so long as (1) it doesn't overlap the LHS (otherwise it's a premature attempt to perform the outer join; and (2) *both* of the input relations overlap the OJ's RHS. Since each input relation must have been previously found valid, that means that whatever is "sticking out" of the RHS must have been legally joined to something inside the RHS by a previous outer join. I haven't quite convinced myself of it, but I think that we might not even need to test (1) separately, as legal inputs overlapping the RHS couldn't overlap the LHS. Comments? Is there still a hole here? regards, tom lane
pgsql-hackers by date: