Re: using index to speedup add not null constraints to a table - Mailing list pgsql-hackers

From jian he
Subject Re: using index to speedup add not null constraints to a table
Date
Msg-id CACJufxFae2f=_egVwGGvGFHMTU=Pu7Akb-e_JMEsUCsHRT75Jw@mail.gmail.com
Whole thread Raw
In response to Re: using index to speedup add not null constraints to a table  (Álvaro Herrera <alvherre@kurilemu.de>)
List pgsql-hackers
hi.

---------------------
create unlogged table t2_copy(col1 int, col2 int, col3 int, col4 int,
col5 int) with (autovacuum_enabled = off, vacuum_index_cleanup=off);
insert into t2_copy select g, g, g, g,g from generate_series(1, 10_000_000) g;
set enable_seqscan to off;
set enable_bitmapscan to off;
set max_parallel_workers=0;
set max_parallel_workers_per_gather=0;

create index t2_copy_idx on t2_copy(col1, col2, col3, col4, col5);
explain (costs off, analyze) select from t2_copy where col1 is NULL or
col2 is NULL or col3 is NULL or col4 is null or col5 is null \watch
i=0.1 c=10

drop index t2_copy_idx;
create index t2_copy_idx1 on t2_copy(col1);
create index t2_copy_idx2 on t2_copy(col2);
create index t2_copy_idx3 on t2_copy(col3);
create index t2_copy_idx4 on t2_copy(col4);
create index t2_copy_idx5 on t2_copy(col5);
explain (costs off, analyze) select from t2_copy where col1 is NULL or
col2 is NULL or col3 is NULL or col4 is null or col5 is null \watch
i=0.1 c=10
------------------------------------------
By comparing the above two EXPLAIN, I found out that there will be regression
for using one multiple column indexes fast verify mutiple not-null constraints.

create unlogged table t3_copy(col1 int, col2 int, col3 int, col4 int);
create index t3_copy_idx on t3_copy(col1, col2, col3, col4);
alter table t3_copy add not null col1, add not null col2, add not null
col3, add not null col4;

In this case, scanning the t3_copy_idx index to check these four NOT NULL
constraints is actually slower than a full table scan.

Therefore, I further restricted the use of the index-scan mechanism for fast NOT
NULL verification to scenarios where the attribute being checked is the leading
column of the index.

So, for the above example, to let

alter table t3_copy add not null col1, add not null col2, add not null
col3, add not null col4;

utilize indexscan mechanism, we require four indexes, each index leading column
is corresponding to the not-null constraint attribute.

I have not yet addressed all of your other comments, as this
represents a more of a
major change. Therefore, I am submitting the attached patch first.

index_check_notnull function, in v8 it is:

+ /* collect index attnums while loop */
+ for (int i = 0; i < index->indnkeyatts; i++)
+ {
+ attr = TupleDescAttr(tupdesc, (index->indkey.values[i] - 1));
+
+
+ }

in v9, it's:
+ /* collect index attnums while loop */
+ attr = TupleDescAttr(tupdesc, (index->indkey.values[0] - 1));
+

I am not sure that regression tests would be helpful, since the test hinges on
whether ereport(DEBUG1, ...) is reached.

I am not sure about the concurrency implications; therefore, isolation tests are
attached.  maybe it's not necessary.



--
jian
https://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Oleg Tselebrovskiy
Date:
Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
Next
From: 洪伊
Date:
Subject: Re: [PATCH] pl: fix can not build free-thread for plpython extension like 3.14t