Thread: Relaxing constraints on BitmapAnd eligibility?
Hi!
I've been investigating why postgres does not do BitmapAnd of two well-suited indexes, and reading indxpath.c
In my case, there is a table (d date, col1 int, col2 int) -- types not really important -- and there are two indices on (d,col1) and (d, col2).
For queries that do WHERE d>=X AND col1=Y AND col2=Z postgres will never BitmapAnd those two indices because both indexes include (d) and we have a condition on (d). Here is a full example, which could also be seen here: https://www.db-fiddle.com/f/uPLx5bRtDEoZw3Dx4kkwKh/0:
begin;
CREATE TABLE test_table (
d date,
col1 int,
col2 int
);
INSERT INTO test_table (d, col1, col2)
SELECT
d.date,
c1.val as col1,
c2.val as col2
FROM
generate_series(
'2023-01-01'::date,
'2023-12-31'::date,
'1 day'::interval
) as d(date),
generate_series(1, 1000) as c1(val),
generate_series(1, 1000) as c2(val)
WHERE
random() < 0.001;
create index on test_table(col1,d);
create index on test_table(col2,d);
CREATE TABLE test_table (
d date,
col1 int,
col2 int
);
INSERT INTO test_table (d, col1, col2)
SELECT
d.date,
c1.val as col1,
c2.val as col2
FROM
generate_series(
'2023-01-01'::date,
'2023-12-31'::date,
'1 day'::interval
) as d(date),
generate_series(1, 1000) as c1(val),
generate_series(1, 1000) as c2(val)
WHERE
random() < 0.001;
create index on test_table(col1,d);
create index on test_table(col2,d);
-- This uses BitmapAnd
explain select * from test_table where col1=123 and col2=321;
-- This does not use BitmapAnd, even though it could!
explain select * from test_table where col1=123 and col2=321 and d >= '2023-05-05';
-- This does not use BitmapAnd, even though it could!
explain select * from test_table where col1=123 and col2=321 and d >= '2023-05-05';
I checked that BitmapAnd is rejected by this line in choose_bitmap_and:
if (bms_overlap(pathinfo->clauseids, clauseidsofar))
continue; /* consider it redundant */
continue; /* consider it redundant */
There is a comment on choose_bitmap_and that explains the rationale of this check, but reading it I can't help but feel that what the comment describes is this condition:
if (bms_is_subset(pathinfo->clauseids, clauseidsofar))
continue; /* consider it redundant */
continue; /* consider it redundant */
And indeed, in my (admittedly not super-extensive) testing changing bms_overlap to bms_is_subset leads to better faster execution plans.
Is it possible that this condition could thus be relaxed?
Even if I am wrong, and the condition absolutely should be bms_overlap, I feel that this restriction is very very hard to discover and perhaps https://www.postgresql.org/docs/current/indexes-bitmap-scans.html should mention that compound indexes that have columns in common will never be combined?
Best regards, Dmytro
Hi!
I am (still) very unsure if the code change I mentioned will make sense, but documentation chage could perhaps look like something along these lines?
Best regards, Dmytro
On Tue, Feb 25, 2025 at 9:14 PM Dmytro Astapov <dastapov@gmail.com> wrote:
Hi!I've been investigating why postgres does not do BitmapAnd of two well-suited indexes, and reading indxpath.cIn my case, there is a table (d date, col1 int, col2 int) -- types not really important -- and there are two indices on (d,col1) and (d, col2).For queries that do WHERE d>=X AND col1=Y AND col2=Z postgres will never BitmapAnd those two indices because both indexes include (d) and we have a condition on (d). Here is a full example, which could also be seen here: https://www.db-fiddle.com/f/uPLx5bRtDEoZw3Dx4kkwKh/0:begin;
CREATE TABLE test_table (
d date,
col1 int,
col2 int
);
INSERT INTO test_table (d, col1, col2)
SELECT
d.date,
c1.val as col1,
c2.val as col2
FROM
generate_series(
'2023-01-01'::date,
'2023-12-31'::date,
'1 day'::interval
) as d(date),
generate_series(1, 1000) as c1(val),
generate_series(1, 1000) as c2(val)
WHERE
random() < 0.001;
create index on test_table(col1,d);
create index on test_table(col2,d);-- This uses BitmapAndexplain select * from test_table where col1=123 and col2=321;
-- This does not use BitmapAnd, even though it could!
explain select * from test_table where col1=123 and col2=321 and d >= '2023-05-05';I checked that BitmapAnd is rejected by this line in choose_bitmap_and:if (bms_overlap(pathinfo->clauseids, clauseidsofar))
continue; /* consider it redundant */There is a comment on choose_bitmap_and that explains the rationale of this check, but reading it I can't help but feel that what the comment describes is this condition:if (bms_is_subset(pathinfo->clauseids, clauseidsofar))
continue; /* consider it redundant */And indeed, in my (admittedly not super-extensive) testing changing bms_overlap to bms_is_subset leads to better faster execution plans.Is it possible that this condition could thus be relaxed?Even if I am wrong, and the condition absolutely should be bms_overlap, I feel that this restriction is very very hard to discover and perhaps https://www.postgresql.org/docs/current/indexes-bitmap-scans.html should mention that compound indexes that have columns in common will never be combined?Best regards, Dmytro
Attachment
Hi
And here is a proposed code change, alternative to the doc change.
I hope that it is ok to relax the restriction like so, as `cost_bitmap_node_and` is already resigned to inputs not being independent:
* We estimate AND selectivity on the assumption that the inputs are
* independent. This is probably often wrong, but we don't have the info
* to do better.
* independent. This is probably often wrong, but we don't have the info
* to do better.
I've ran "make check", and one test have changed (the last one in the "test extraction of restriction OR clauses from join OR clause" group - in an expected way, IMO).
I am attaching the regression.diff for convenience.
Is this a generally desirable change to pursue?
Best regards, Dmytro
On Wed, Feb 26, 2025 at 7:31 PM Dmytro Astapov <dastapov@gmail.com> wrote:
Hi!I am (still) very unsure if the code change I mentioned will make sense, but documentation chage could perhaps look like something along these lines?Best regards, DmytroOn Tue, Feb 25, 2025 at 9:14 PM Dmytro Astapov <dastapov@gmail.com> wrote:Hi!I've been investigating why postgres does not do BitmapAnd of two well-suited indexes, and reading indxpath.cIn my case, there is a table (d date, col1 int, col2 int) -- types not really important -- and there are two indices on (d,col1) and (d, col2).For queries that do WHERE d>=X AND col1=Y AND col2=Z postgres will never BitmapAnd those two indices because both indexes include (d) and we have a condition on (d). Here is a full example, which could also be seen here: https://www.db-fiddle.com/f/uPLx5bRtDEoZw3Dx4kkwKh/0:begin;
CREATE TABLE test_table (
d date,
col1 int,
col2 int
);
INSERT INTO test_table (d, col1, col2)
SELECT
d.date,
c1.val as col1,
c2.val as col2
FROM
generate_series(
'2023-01-01'::date,
'2023-12-31'::date,
'1 day'::interval
) as d(date),
generate_series(1, 1000) as c1(val),
generate_series(1, 1000) as c2(val)
WHERE
random() < 0.001;
create index on test_table(col1,d);
create index on test_table(col2,d);-- This uses BitmapAndexplain select * from test_table where col1=123 and col2=321;
-- This does not use BitmapAnd, even though it could!
explain select * from test_table where col1=123 and col2=321 and d >= '2023-05-05';I checked that BitmapAnd is rejected by this line in choose_bitmap_and:if (bms_overlap(pathinfo->clauseids, clauseidsofar))
continue; /* consider it redundant */There is a comment on choose_bitmap_and that explains the rationale of this check, but reading it I can't help but feel that what the comment describes is this condition:if (bms_is_subset(pathinfo->clauseids, clauseidsofar))
continue; /* consider it redundant */And indeed, in my (admittedly not super-extensive) testing changing bms_overlap to bms_is_subset leads to better faster execution plans.Is it possible that this condition could thus be relaxed?Even if I am wrong, and the condition absolutely should be bms_overlap, I feel that this restriction is very very hard to discover and perhaps https://www.postgresql.org/docs/current/indexes-bitmap-scans.html should mention that compound indexes that have columns in common will never be combined?Best regards, Dmytro