Thread: Question on indexes

Question on indexes

From

sud

Date:

10 October 2024, 21:49:39

Hi,

I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL doesnt generate for "hash index" which means we can't get the hash index back after crash also they are not applied to replicas etc. And also these indexes can not be used for range queries , for sorting etc.

However, we are seeing that one of the databases has multiple hash indexes created. So I wanted to understand from experts here, if it's advisable in any specific scenarios over B-tre despite such downsides?

Note- Its version 15.4 database.

Regards

Sud

Re: Question on indexes

From

Christophe Pettus

Date:

10 October 2024, 21:51:58


> On Oct 10, 2024, at 11:49, sud <suds1434@gmail.com> wrote:
>
> Hi,
> I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL
doesntgenerate for "hash index" which means we can't get the hash index back after crash also they are not applied to
replicasetc. 

That's very old information.  Hash indexes are correctly WAL-logged since (IIRC) version 10.

Re: Question on indexes

From

Erik Wienhold

Date:

10 October 2024, 22:21:46

On 2024-10-10 20:49 +0200, sud wrote:
> However, we are seeing that one of the databases has multiple hash indexes
> created. So I wanted to understand from experts here, if it's advisable in
> any specific scenarios over B-tre despite such downsides?

Two things come to my mind:

1. Btree puts a limit on the size of indexed values, whereas hash
   indexes only store the 32-bit hash code.

2. Of the core index types, only btree supports unique indexes.

Example of btree's size limit:

    CREATE TABLE b (s text);
    CREATE INDEX ON b USING btree (s);
    INSERT INTO b (s) VALUES (repeat('x', 1000000));
    ERROR:  index row requires 11464 bytes, maximum size is 8191

The docs have more details:
https://www.postgresql.org/docs/current/btree.html
https://www.postgresql.org/docs/current/hash-index.html

-- 
Erik

Re: Question on indexes

From

sud

Date:

10 October 2024, 22:44:28

On Fri, Oct 11, 2024 at 12:51 AM Erik Wienhold <ewie@ewie.name> wrote:

On 2024-10-10 20:49 +0200, sud wrote:
> However, we are seeing that one of the databases has multiple hash indexes
> created. So I wanted to understand from experts here, if it's advisable in
> any specific scenarios over B-tre despite such downsides?

Two things come to my mind:

1. Btree puts a limit on the size of indexed values, whereas hash
indexes only store the 32-bit hash code.

2. Of the core index types, only btree supports unique indexes.

Example of btree's size limit:

CREATE TABLE b (s text);
CREATE INDEX ON b USING btree (s);
INSERT INTO b (s) VALUES (repeat('x', 1000000));
ERROR: index row requires 11464 bytes, maximum size is 8191

The docs have more details:
https://www.postgresql.org/docs/current/btree.html
https://www.postgresql.org/docs/current/hash-index.html

Thank you.

Not yet confirmed, but actually somehow we see the DB crashed repetitively a few times and teammates suspecting the cause while it tried extending this hash index. Did you experience any such thing with hash index? However, as you mentioned ,if we have any column with large string/text values and we want it to be indexed then there is no choice but to go for a hash index. Please correct me if I'm wrong.

Re: Question on indexes

From

Erik Wienhold

Date:

10 October 2024, 23:36:47

On 2024-10-10 21:44 +0200, sud wrote:
> Not yet confirmed, but actually somehow we see the DB crashed repetitively
> a few times and teammates suspecting the cause while it tried extending
> this hash index.

Your first mail says that you're using version 15.4.  You should
consider upgrading to 15.8 to get the latest bugfixes.

> Did you experience any such thing with hash index?

No.  But I can't remember ever seeing a hash index in the databases that
I've worked on.

> However, as you mentioned ,if we have any column with large string/text
> values and we want it to be indexed then there is no choice but to go for a
> hash index. Please correct me if I'm wrong.

Define "large".

What kind of text (natural, JSON, XML, base64, DNA sequences, etc.) is
stored in those columns?  Why do you want/need to index those columns?
Because hash indexes only support the equal operator, one can only use
that index to search for exact matches (i.e. values with identical hash
code) which I find strange for values that are so large that btree
cannot be used.  But maybe you have solid use case for that.

If it's natural text and you're using tsvector for full-text search,
then GiST or GIN indexes are a better choice:
https://www.postgresql.org/docs/current/textsearch-indexes.html

-- 
Erik

Re: Question on indexes

From

"Efrain J. Berdecia"

Date:

11 October 2024, 04:17:54

They are extremely efficient for joins!!!

Yahoo Mail: Search, Organize, Conquer

On Thu, Oct 10, 2024 at 2:52 PM, Christophe Pettus
<xof@thebuild.com> wrote:

> On Oct 10, 2024, at 11:49, sud <suds1434@gmail.com> wrote:
>
> Hi,
> I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL doesnt generate for "hash index" which means we can't get the hash index back after crash also they are not applied to replicas etc.

That's very old information. Hash indexes are correctly WAL-logged since (IIRC) version 10.

Re: Question on indexes

From

Laurenz Albe

Date:

11 October 2024, 09:43:37

On Fri, 2024-10-11 at 00:19 +0530, sud wrote:
> I have never used any 'hash index' but saw documents in the past suggesting issues
> around hash index , like WAL doesnt generate for "hash index" which means we can't
> get the hash index back after crash also they are not applied to replicas etc.
> And also these indexes can not be used for range queries , for sorting etc.
>
> However, we are seeing that one of the databases has multiple hash indexes created.
> So I wanted to understand from experts here, if it's advisable in any specific
> scenarios over B-tre despite such downsides?
> Note- Its version 15.4 database.

It is safe to use them, but in my tests I didn't find a realistic case where the were
better than a B-tree index:
https://www.cybertec-postgresql.com/en/postgresql-hash-index-performance/

Keep them if they do the trick for you, but I'd use B-tree indexes instead.

Yours,
Laurenz Albe

Re: Question on indexes

From

Greg Sabino Mullane

Date:

11 October 2024, 14:29:15

if we have any column with large string/text values and we want it to be indexed then there is no choice but to go for a hash index. Please correct me if I'm wrong.

There are other strategies / solutions, but we would need to learn more about your use case.

Cheers,

Greg

Re: Question on indexes

From

Durgamahesh Manne

Date:

11 October 2024, 15:31:58

On Fri, Oct 11, 2024 at 5:00 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:

if we have any column with large string/text values and we want it to be indexed then there is no choice but to go for a hash index. Please correct me if I'm wrong.

There are other strategies / solutions, but we would need to learn more about your use case.

Cheers,
Greg

Hi Respected Team

How do we enforce the secondary column of composite index to index scan on concurrent activity in postgres?

Second column of composite index not in use effectively with index scan when using second column at where clause

I have composite index on (placedon,id) of test

When querying select * from test where id = '4234';

Value of id changes and during concurrent activity and cpu utilization increased too much that i have observed which means query plan changed why

I could see index scan with explain for it on singal call or double calls

Is there any way to keep an index scan for it during concurrency rather than a separate index on the second column of the composite index ?

Regards,

Durga Mahesh

Re: Question on indexes

From

Greg Sabino Mullane

Date:

11 October 2024, 15:48:19

(please start a new thread in the future rather than replying to an existing one)

You cannot query on b and use an index on (a,b) as you observed. However, you can have two indexes:

index1(a)

index2(b)

Postgres will be able to combine those when needed in the case where your WHERE clause needs to filter by both columns. So then you no longer need the two-column index.

Cheers,

Greg

Re: Question on indexes

From

Durgamahesh Manne

Date:

11 October 2024, 16:37:30

On Fri, Oct 11, 2024 at 6:18 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:

(please start a new thread in the future rather than replying to an existing one)

You cannot query on b and use an index on (a,b) as you observed. However, you can have two indexes:

index1(a)
index2(b)

Postgres will be able to combine those when needed in the case where your WHERE clause needs to filter by both columns. So then you no longer need the two-column index.

Cheers,
Greg

Hi greg

Mail sent you with a new thread. composite key is on partitioned table

Regards,

Durga Mahesh