Re: Partial hash index is not used for implied qual. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Partial hash index is not used for implied qual.
Date
Msg-id 36827.1764216283@sss.pgh.pa.us
Whole thread Raw
In response to Re: Partial hash index is not used for implied qual.  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 27 Nov 2025 at 07:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I checked the costing calculations and it's basically that
>> genericcostestimate doesn't understand about hash buckets.

> I assume you must mean using your "x % 100" case rather than Sergei's case.

Right.

>> ... tuples in the index, so it supposes that that means we'll visit
>> 1/100th of the index's pages (rounded up to 1).

> I'm not so clear on why this is bad. The index has 1000 tuples on 6
> pages and we want to scan 1% of the rows in the index. As a result,
> genericcostestimate() calculates that's 1 page. How many pages would
> you expect to scan?

To be clear, neither of genericcostestimate's estimates are bad as
a generic estimate.  And I'm not even sure that we could do better
with a hash-aware estimate implemented in hashcostestimate.  I think
the key thing about this test case is that the partial index ends
up with all its entries in one hash bucket.  I'm not sure that we
could reasonably expect to know that in the cost estimator.  Even
if we could, should we really expend a lot of effort on the case?
It's a textbook example of when you should not use a hash index.

I'm interested in fixing this can't-generate-this-plan-shape bug
because it probably impacts more-realistic use-cases.  But I
think the cost estimation problem is probably specific to cases
where you shouldn't have used a hash index, so I'm okay with
ignoring that part.  Until more evidence arrives, anyway.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Shlok Kyal
Date:
Subject: Re: How can end users know the cause of LR slot sync delays?
Next
From: Michael Paquier
Date:
Subject: Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache