Re: Partial hash index is not used for implied qual. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Partial hash index is not used for implied qual.
Date
Msg-id 1615744.1764181557@sss.pgh.pa.us
Whole thread Raw
In response to Re: Partial hash index is not used for implied qual.  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Partial hash index is not used for implied qual.
List pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
> On Tue, 25 Nov 2025 at 15:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Actually, after thinking a bit longer, it'd be better to do something
>> like the attached so that we don't keep redundant quals unless they'd
>> *all* be excluded.

> I think your 1st patch was more along the correct lines. With the 2nd
> one, I think you're maybe assuming that the non-empty newrestrictinfo
> must contain an indexable qual, but the list we're working with at
> that point is the rel's baserestrictinfo, which could contain a bunch
> of stuff that'll never match to the index. If you continue to remove
> the implied qual when there's a non-indexable base qual in the list,
> then we'll still have the same issue that Sergei is trying to fix.

Right, as Sergei also noted.  We should just do it as attached.

I checked the costing calculations and it's basically that
genericcostestimate doesn't understand about hash buckets.
For the partial index, it correctly estimates that we'll visit
all 10 of the tuples in the index, so it supposes that that
means we'll visit all of the index's pages.  For the non-partial
index, it correctly estimates that we'll visit 10 of the 1000
tuples in the index, so it supposes that that means we'll visit
1/100th of the index's pages (rounded up to 1).  This error is
compounded by the toy example, which only has 6 pages in either index
(the minimum size of a hash index, I think).  There may or may not be
anything we can usefully do to improve that situation ... and for that
matter, it's not clear that preferring the partial index would really
be a win.  As constructed, this test case has only one hash value in
the partial index, which I think is not exactly a case where you want
a hash index.  I tried scaling up the table size, and got a badly
bloated partial index (about half as big as the non-partial one),
which seems to indicate that the code is vainly splitting and
re-splitting trying to divide that one huge bucket.

So I'm inclined to apply the attached and just call it good.

Should we back-patch?  I'm unsure.  Clearly it's a bug that we
cannot generate an indexscan plan in this case, but we've learned
that changing plans in released branches is often not wanted.
And given the lack of field complaints, nobody is using the case
anyway.

            regards, tom lane

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index c62e3f87724..2654c59c4c6 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -4051,6 +4051,16 @@ check_index_predicates(PlannerInfo *root, RelOptInfo *rel)
         if (is_target_rel)
             continue;

+        /*
+         * If index is !amoptionalkey, also leave indrestrictinfo as set
+         * above.  Otherwise we risk removing all quals for the first index
+         * key and then not being able to generate an indexscan at all.  It
+         * would be better to be more selective, but we've not yet identified
+         * which if any of the quals match the first index key.
+         */
+        if (!index->amoptionalkey)
+            continue;
+
         /* Else compute indrestrictinfo as the non-implied quals */
         index->indrestrictinfo = NIL;
         foreach(lcr, rel->baserestrictinfo)

pgsql-hackers by date:

Previous
From: Michael Banck
Date:
Subject: Re: [PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer
Next
From: Tomas Vondra
Date:
Subject: Re: should we have a fast-path planning for OLTP starjoins?