Home > mailing lists

Re: Partial hash index is not used for implied qual. - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Partial hash index is not used for implied qual.
Date	November 27 06:01:14
Msg-id	CAApHDvrsnR+=CU4wE1haWWO+ZjjN1m22joLqifrJgXv1mf+6yQ@mail.gmail.com Whole thread Raw
In response to	Re: Partial hash index is not used for implied qual. (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Partial hash index is not used for implied qual. Re: Partial hash index is not used for implied qual.
List	pgsql-hackers

Tree view

On Thu, 27 Nov 2025 at 07:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I checked the costing calculations and it's basically that
> genericcostestimate doesn't understand about hash buckets.
> For the partial index, it correctly estimates that we'll visit
> all 10 of the tuples in the index, so it supposes that that
> means we'll visit all of the index's pages.  For the non-partial
> index, it correctly estimates that we'll visit 10 of the 1000

I assume you must mean using your "x % 100" case rather than Sergei's case.

> tuples in the index, so it supposes that that means we'll visit
> 1/100th of the index's pages (rounded up to 1).  This error is
> compounded by the toy example, which only has 6 pages in either index
> (the minimum size of a hash index, I think).  There may or may not be
> anything we can usefully do to improve that situation

I'm not so clear on why this is bad. The index has 1000 tuples on 6
pages and we want to scan 1% of the rows in the index. As a result,
genericcostestimate() calculates that's 1 page. How many pages would
you expect to scan?

> ... and for that
> matter, it's not clear that preferring the partial index would really
> be a win.  As constructed, this test case has only one hash value in
> the partial index, which I think is not exactly a case where you want
> a hash index.  I tried scaling up the table size, and got a badly
> bloated partial index (about half as big as the non-partial one),
> which seems to indicate that the code is vainly splitting and
> re-splitting trying to divide that one huge bucket.

I assumed this was some attempt at finding a cheap way to find rows
matching the index's predicate without having to have an index on all
rows.

> So I'm inclined to apply the attached and just call it good.

I think the patch looks fine.

> Should we back-patch?  I'm unsure.  Clearly it's a bug that we
> cannot generate an indexscan plan in this case, but we've learned
> that changing plans in released branches is often not wanted.
> And given the lack of field complaints, nobody is using the case
> anyway.

I feel like anyone adding a partial hash index has done so quite
purposefully. I suspect they might be surprised if there's no means
whatsoever to use that index in scans, so perhaps it's ok to
backpatch.

Sergei, can you confirm if this was something he noticed when playing
around on master, or if this came from a field report?

David

pgsql-hackers by date:

From: Thomas Munro
Date: 27 November, 05:53:20
Subject: Re: Consistently use palloc_object() and palloc_array()

From: 邱宇航
Date: 27 November, 06:07:43
Subject: Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

Re: Partial hash index is not used for implied qual. - Mailing list pgsql-hackers

Previous

Next