On Mon, Sep 23, 2024 at 12:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I'm
> thinking about is to test against joinrelids here instead of
> inputrelids. As the adjacent comment says, that would get the
> wrong answer for "pushed down" conditions, but we could revert
> to the pre-v16 hack of checking relevant join conditions in
> the loop further down, ie put back this kluge:
>
> if (RINFO_IS_PUSHED_DOWN(restrictinfo, joinrelids))
> {
> /*
> * If such a clause actually references the inner rel then join
> * removal has to be disallowed. We have to check this despite
> * the previous attr_needed checks because of the possibility of
> * pushed-down clauses referencing the rel.
> */
> if (bms_is_member(innerrelid, restrictinfo->clause_relids))
> return false;
> continue; /* else, ignore; not useful here */
> }
>
> That's ugly for sure, and it would only revert to the status
> quo ante. But I'm not seeing a better way right now.
It seems to me that this approach basically reverts the changes in
8538519db. I'm afraid this would re-introduce the bug fixed by that
commit. For instance, this approach would incorrectly remove the join
in the query below:
create table t (a int unique, b int);
explain (costs off)
select 1 from t t1 left join t t2 on t1.a = t2.a where t1.b = coalesce(t2.b, 0);
QUERY PLAN
------------------
Seq Scan on t t1
(1 row)
... because it does not realize that there might be references to the
innerrel in ECs. (Pre-v16 this join removal is prevented by the
delay_upper_joins flag.)
Besides, we have the logic that a PHV needn't prevent join removal if
it mentions the innerrel in ph_eval_at but its contained expression
doesn't actually reference the innerrel (see cad569205). I think the
proposed approach would also break this logic. For instance, in the
query below, the t2/t3 join should be removed but is prevented by the
PHV.
explain (costs off)
select 1 from t t1 left join
(select 1 as x from t t2 left join t t3 on t2.a = t3.a) s on true
where s.x = 1;
QUERY PLAN
------------------------------------------
Nested Loop Left Join
Filter: ((1) = 1)
-> Seq Scan on t t1
-> Materialize
-> Hash Left Join
Hash Cond: (t2.a = t3.a)
-> Seq Scan on t t2
-> Hash
-> Seq Scan on t t3
(9 rows)
Thanks
Richard