My fuzzer finds a logic bug in Postgres, which makes Postgres return inconsistent results.
Thank you for the report. This is surely a wrong result issue. Great catch!
I've looked into it a bit. The problem lies in how the SJE code handles the transfer of qual clauses from the removed relation to the remaining one. The code replaces the Vars of the removed relation with the Vars of the remaining relation for each clause. It then reintegrates these clauses into the appropriate restriction or join clause lists, while attempting to avoid duplicates. So far so good.
However, the code compares RestrictInfo->clause to determine if two clauses are duplicates. This is just flat wrong. Two RestrictInfos with the same clause can have different required_relids, incompatible_relids, is_pushed_down and so on.
Here is a simple example to illustrate this issue.
create table t (a int primary key, b int); insert into t select 1,1;
-- wrong plan. the full join should be dummy explain (costs off) select 1 from t full join (select * from t t1 join t t2 join t t3 on t2.a = t3.a on true where false) s on true where false; QUERY PLAN -------------------------------------- Merge Full Join -> Seq Scan on t -> Materialize -> Result One-Time Filter: false (5 rows)
-- wrong result select 1 from t full join (select * from t t1 join t t2 join t t3 on t2.a = t3.a on true where false) s on true where false; ?column? ---------- 1 (1 row)
In this query there are two RestrictInfos whose clause are both const-false: one is supposed to be evaluated at join t1/t3, and the other is supposed to be evaluated above the full join. But the SJE code mistakenly thinks that they are duplicates, so the one above the full join is just abandoned.
Do we really need to avoid duplicates here? We do not do that before. For instance,
explain (costs off) select * from t where b > 1 and b > 1; QUERY PLAN --------------------------------- Seq Scan on t Filter: ((b > 1) AND (b > 1)) (2 rows)
So I will not be surprised if I see redundant 'b > 1' after the join removal in the following plan.
explain (costs off) select * from t t1 join t t2 on t1.a = t2.a where t1.b > 1 and t2.b > 1; QUERY PLAN ----------------------------------------- Seq Scan on t t2 Filter: ((a IS NOT NULL) AND (b > 1)) (2 rows)
If we determine that avoiding duplicates is necessary, I think at least we should compare the entire RestrictInfos not just their clauses. One challenge with this approach is that the 'rinfo_serial' usually differs, making direct comparison problematic. I'm wondering if we can make 'rinfo_serial' equal_ignore. Not too sure about that.