Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN |
Date | |
Msg-id | 3153450.1687208355@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN
|
List | pgsql-bugs |
I wrote: > Anyway, what I'm inclined to do is flesh out the attached by updating > the comments for remove_useless_results() and then push it. Later > on we can look for why it's not finding the better join order; that's > a separable issue, and if it is about avoid-clauseless-joins then we > might choose to live with it rather than incur a lot of planner cost > to fix it. I couldn't resist poking into that, and it seems there's less here than meets the eye. I'd been guessing that the presence or absence of a t2 reference in the WHERE clause was affecting this, but no: the SpecialJoinInfos look exactly the same for both queries, and the set of joins considered is the same in both. What is causing the different plan shape is that the selectivity estimates for these WHERE clauses are a lot different: regression=# explain regression-# select t2.f1 from int4_tbl t2 regression-# left join int4_tbl t3 on t2.f1 > 0 regression-# where t3.f1 is null; QUERY PLAN ----------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..2.49 rows=1 width=4) Join Filter: (t2.f1 > 0) Filter: (t3.f1 IS NULL) -> Seq Scan on int4_tbl t2 (cost=0.00..1.05 rows=5 width=4) -> Materialize (cost=0.00..1.07 rows=5 width=4) -> Seq Scan on int4_tbl t3 (cost=0.00..1.05 rows=5 width=4) (6 rows) regression=# explain regression-# select t2.f1 from int4_tbl t2 regression-# left join int4_tbl t3 on t2.f1 > 0 regression-# where t2.f1 <> coalesce(t3.f1, -1); QUERY PLAN ----------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..2.55 rows=8 width=4) Join Filter: (t2.f1 > 0) Filter: (t2.f1 <> COALESCE(t3.f1, '-1'::integer)) -> Seq Scan on int4_tbl t2 (cost=0.00..1.05 rows=5 width=4) -> Materialize (cost=0.00..1.07 rows=5 width=4) -> Seq Scan on int4_tbl t3 (cost=0.00..1.05 rows=5 width=4) and that ends up with the other join order looking better. We can synthesize a different non-strict, t3-only qual with a similar selectivity estimate: regression=# explain regression-# select t2.f1 from int4_tbl t2 regression-# left join int4_tbl t3 on t2.f1 > 0 regression-# where -1 <> coalesce(t3.f1, -1); QUERY PLAN ----------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..2.55 rows=8 width=4) Join Filter: (t2.f1 > 0) Filter: ('-1'::integer <> COALESCE(t3.f1, '-1'::integer)) -> Seq Scan on int4_tbl t2 (cost=0.00..1.05 rows=5 width=4) -> Materialize (cost=0.00..1.07 rows=5 width=4) -> Seq Scan on int4_tbl t3 (cost=0.00..1.05 rows=5 width=4) and then if you plug that into the whole query you get a t4-last plan: regression=# explain (costs off) regression-# select * from int4_tbl t1 regression-# left join ((select t2.f1 from int4_tbl t2 regression(# left join int4_tbl t3 on t2.f1 > 0 regression(# where -1 <> coalesce(t3.f1, -1) ) s regression(# left join tenk1 t4 on s.f1 > 1) regression-# on s.f1 = t1.f1; QUERY PLAN ------------------------------------------------------------------------- Nested Loop Left Join Join Filter: (t2.f1 > 1) -> Hash Right Join Hash Cond: (t2.f1 = t1.f1) -> Nested Loop Left Join Join Filter: (t2.f1 > 0) Filter: ('-1'::integer <> COALESCE(t3.f1, '-1'::integer)) -> Seq Scan on int4_tbl t2 -> Materialize -> Seq Scan on int4_tbl t3 -> Hash -> Seq Scan on int4_tbl t1 -> Materialize -> Seq Scan on tenk1 t4 (14 rows) So, nothing to see here after all. regards, tom lane
pgsql-bugs by date: