Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware) - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware) |
Date | |
Msg-id | 524303.1677538463@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware) (Andrey Lepikhov <a.lepikhov@postgrespro.ru>) |
Responses |
Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware)
|
List | pgsql-bugs |
Andrey Lepikhov <a.lepikhov@postgrespro.ru> writes: > Ok, maybe my language still not so fluent, as I think sometimes. In > accordance to the example above: > 1. varnullingrels contains relids of entries which can null the var. In > our case it (correctly) contains t3 and OJ(t3,t4) > 2. Syntactic scope of the clause correctly contains all relations and OJs > 3. In the distribute_qual_to_rels I don't see a code which should > disallow pushing down a clause from higher syntactic level into nullable > side of OJ. Where is the logic which should limit the lowest level of > the clause push down? It's the same logic that prevents push-down to joins that lack a relation you're trying to reference, namely the checks of RestrictInfo.required_relids. If a clause should be referring to the possibly-nulled version of some variable, then its required_relids now include the relid of the outer join that nulls that variable, so we can't push it to below that outer join. This is straightforward enough until you start applying outer join identity 3: 3. (A leftjoin B on (Pab)) leftjoin C on (Pbc) = A leftjoin (B leftjoin C on (Pbc)) on (Pab) granting that Pbc is strict for B Suppose that we are given a FROM clause in the first form and that there's a non-strict WHERE clause referencing a column of C. That clause should certainly see the nulled version of C, so if we implement the join as written then we can apply the WHERE clause as a filter condition (not join condition) on the join of A/B to C. However, if we decide to implement the join according to the second form of the identity, then the current code believes it can push the WHERE clause down so that it's a filter on the join of B to C. It's not quite obvious that that's wrong, but it is: we might see C values that are non-null at this point but will become null after application of the A/B join. Then the WHERE clause gives misleading answers, which is exactly what's going wrong in this example. The way I'm proposing to fix this is to say that if the B/C join is done first, then its output C columns do not yet satisfy the nulling action of the syntactically-upper outer join, so we can't yet apply clauses that need to see those outputs. We can implement that by not adding the outer join's relid to the joinrelids produced by the B/C join: then clauses containing such Vars won't look like they can be pushed down to that join. Instead they'll get pushed to the AB/C join, and to make that happen we'll have to add both outer-join relids to the identifying relids of that join level. BTW, there's no problem if we start from the second form of the identity and transform to the first, because then any C references above the join nest are already marked as requiring the nulling actions of both of the joins, so they won't fall below the upper join no matter which order we do the joins in. So another way we could fix this is to run around and forcibly mark all such Vars with the relids of both outer joins, thus producing a parse tree that looks more like the second form of the identity. However, we're pretty much committed already to keeping all Vars marked according to the syntactic tree structure, so that solution would require revisiting a lot of code (not to mention the cost of mutating the parse tree like that). I don't want to go that way unless really forced into it. I have a WIP patch that does it like this, and it fixes the presented case; but it's not complete so it's still failing some existing regression tests. More later. regards, tom lane
pgsql-bugs by date: