Home > mailing lists

Re: BUG #18261: Inconsistent results of SELECT affected by joined subqueries - Mailing list pgsql-bugs

From	Richard Guo
Subject	Re: BUG #18261: Inconsistent results of SELECT affected by joined subqueries
Date	December 28, 2023 11:01:47
Msg-id	CAMbWs48_EAPrc_C5qvK2WGvydHvht1JUnXwer9FVeU7t_zA+MQ@mail.gmail.com Whole thread Raw
In response to	BUG #18261: Inconsistent results of SELECT affected by joined subqueries (PG Bug reporting form <noreply@postgresql.org>)
Responses	Re: BUG #18261: Inconsistent results of SELECT affected by joined subqueries
List	pgsql-bugs

Tree view

On Wed, Dec 27, 2023 at 7:36 PM PG Bug reporting form <noreply@postgresql.org> wrote:

My fuzzer finds a logic bug in Postgres, which makes Postgres return
inconsistent results.

Thank you for the report. This is surely a wrong result issue. Great
catch!

I've looked into it a bit. The problem lies in how the SJE code handles
the transfer of qual clauses from the removed relation to the remaining
one. The code replaces the Vars of the removed relation with the Vars
of the remaining relation for each clause. It then reintegrates these
clauses into the appropriate restriction or join clause lists, while
attempting to avoid duplicates. So far so good.

However, the code compares RestrictInfo->clause to determine if two
clauses are duplicates. This is just flat wrong. Two RestrictInfos
with the same clause can have different required_relids,
incompatible_relids, is_pushed_down and so on.

Here is a simple example to illustrate this issue.

create table t (a int primary key, b int);
insert into t select 1,1;

-- wrong plan. the full join should be dummy
explain (costs off)
select 1 from t full join
(select * from t t1 join
t t2 join t t3 on t2.a = t3.a
on true
where false) s on true
where false;
QUERY PLAN
--------------------------------------
Merge Full Join
-> Seq Scan on t
-> Materialize
-> Result
One-Time Filter: false
(5 rows)

-- wrong result
select 1 from t full join
(select * from t t1 join
t t2 join t t3 on t2.a = t3.a
on true
where false) s on true
where false;
?column?
----------
1
(1 row)

In this query there are two RestrictInfos whose clause are both
const-false: one is supposed to be evaluated at join t1/t3, and the
other is supposed to be evaluated above the full join. But the SJE
code mistakenly thinks that they are duplicates, so the one above the
full join is just abandoned.

Do we really need to avoid duplicates here? We do not do that before.
For instance,

explain (costs off)
select * from t where b > 1 and b > 1;
QUERY PLAN
---------------------------------
Seq Scan on t
Filter: ((b > 1) AND (b > 1))
(2 rows)

So I will not be surprised if I see redundant 'b > 1' after the join
removal in the following plan.

explain (costs off)
select * from t t1 join t t2 on t1.a = t2.a where t1.b > 1 and t2.b > 1;
QUERY PLAN
-----------------------------------------
Seq Scan on t t2
Filter: ((a IS NOT NULL) AND (b > 1))
(2 rows)

If we determine that avoiding duplicates is necessary, I think at least
we should compare the entire RestrictInfos not just their clauses. One
challenge with this approach is that the 'rinfo_serial' usually differs,
making direct comparison problematic. I'm wondering if we can make
'rinfo_serial' equal_ignore. Not too sure about that.

Attached is a patch to show my thoughts.

Thanks
Richard

Attachment

v1-0001-Fix-the-issue-that-SJE-mistakenly-omits-qual-clauses.patch

pgsql-bugs by date:

From: Richard Guo
Date: 28 December 2023, 06:36:46
Subject: Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition

From: aa
Date: 28 December 2023, 15:15:07
Subject: Out of the box, full text search feature suggestion for postgresql

Re: BUG #18261: Inconsistent results of SELECT affected by joined subqueries - Mailing list pgsql-bugs

Attachment

Previous

Next