BUG #18442: Unnecessary Sort operator in indexScan Plan - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18442: Unnecessary Sort operator in indexScan Plan
Date
Msg-id 18442-aca4c9134416990e@postgresql.org
Whole thread Raw
Responses Re: BUG #18442: Unnecessary Sort operator in indexScan Plan
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18442
Logged by:          yajun Hu
Email address:      hu_yajun@qq.com
PostgreSQL version: 14.11
Operating system:   CentOS7 with kernel version 5.10
Description:

I have reproduced this problem in REL_14_11 and the latest master branch
(84db9a0eb10dd1dbee6db509c0e427fa237177dc).
The steps to reproduce are as follows.
1. ./configure  --enable-debug --enable-depend --enable-cassert CFLAGS=-O0
2. make -j; make install -j; initdb -D ./primary; pg_ctl -D ../primary -l
logfile start
3. run SQL:
```
create table t( a int, b int);
insert into t select null,i from generate_series(1,100)i;
insert into t select i,i from generate_series(1,100000)i;
analyze t;
create index on t(a,b);
postgres=# explain select * from t where a is null order by b; -- need
sort
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=9.54..9.80 rows=103 width=8)
   Sort Key: b
   ->  Index Only Scan using t_a_b_idx on t  (cost=0.29..6.10 rows=103
width=8)
         Index Cond: (a IS NULL)
(4 rows)

postgres=# explain select * from t where a is null order by a, b; -- no need
sort
                                QUERY PLAN
--------------------------------------------------------------------------
 Index Only Scan using t_a_b_idx on t  (cost=0.29..6.10 rows=103 width=8)
   Index Cond: (a IS NULL)
(2 rows)

postgres=# explain select * from t where a = 1 order by b; -- no need sort
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using t_a_b_idx on t  (cost=0.29..4.31 rows=1 width=8)
   Index Cond: (a = 1)
(2 rows)
```

In my understanding, in the first SELECT, because a is always NULL, the
scanned
data access by IndexOnlyScan is sorted according to b, which means that the
upper
Sort operator is unnecessary overhead.The second and third SELECT are both
as
expected. 

I tried to analyze the code and found that the EquivalenceClass of column a
and NULL
was missing, which caused build_index_pathkeys to return NIL. No pathkeys
makes the
optimizer decide that the upper layer needed Sort to ensure that the data
was in order.
I roughly know that it may be because NullTest in the check_mergejoinable
function is
not OpExpr. Is it possible here to generate special EquivalenceClass for
column a and
NULL to solve this problem?

I’m looking forward to someone answering my confusion, thank you very much!


pgsql-bugs by date:

Previous
From: Jerry Sievert
Date:
Subject: Re: CVE-2024-28849
Next
From: Shlok Kyal
Date:
Subject: Re: BUG #18433: Logical replication timeout