Thread: Question on indexes

Question on indexes

From
sud
Date:
Hi,
I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL doesnt generate for "hash index" which means we can't get the hash index back after crash also they are not applied to replicas etc. And also these indexes can not be used for range queries , for sorting etc.

However, we are seeing that one of the databases has multiple hash indexes created. So I wanted to understand from experts here, if it's advisable in any specific scenarios over B-tre despite such downsides?
Note- Its version 15.4 database.

Regards
Sud

Re: Question on indexes

From
Christophe Pettus
Date:

> On Oct 10, 2024, at 11:49, sud <suds1434@gmail.com> wrote:
>
> Hi,
> I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL
doesntgenerate for "hash index" which means we can't get the hash index back after crash also they are not applied to
replicasetc. 

That's very old information.  Hash indexes are correctly WAL-logged since (IIRC) version 10.


Re: Question on indexes

From
Erik Wienhold
Date:
On 2024-10-10 20:49 +0200, sud wrote:
> However, we are seeing that one of the databases has multiple hash indexes
> created. So I wanted to understand from experts here, if it's advisable in
> any specific scenarios over B-tre despite such downsides?

Two things come to my mind:

1. Btree puts a limit on the size of indexed values, whereas hash
   indexes only store the 32-bit hash code.

2. Of the core index types, only btree supports unique indexes.

Example of btree's size limit:

    CREATE TABLE b (s text);
    CREATE INDEX ON b USING btree (s);
    INSERT INTO b (s) VALUES (repeat('x', 1000000));
    ERROR:  index row requires 11464 bytes, maximum size is 8191

The docs have more details:
https://www.postgresql.org/docs/current/btree.html
https://www.postgresql.org/docs/current/hash-index.html

-- 
Erik



Re: Question on indexes

From
sud
Date:

On Fri, Oct 11, 2024 at 12:51 AM Erik Wienhold <ewie@ewie.name> wrote:
On 2024-10-10 20:49 +0200, sud wrote:
> However, we are seeing that one of the databases has multiple hash indexes
> created. So I wanted to understand from experts here, if it's advisable in
> any specific scenarios over B-tre despite such downsides?

Two things come to my mind:

1. Btree puts a limit on the size of indexed values, whereas hash
   indexes only store the 32-bit hash code.

2. Of the core index types, only btree supports unique indexes.

Example of btree's size limit:

    CREATE TABLE b (s text);
    CREATE INDEX ON b USING btree (s);
    INSERT INTO b (s) VALUES (repeat('x', 1000000));
    ERROR:  index row requires 11464 bytes, maximum size is 8191

The docs have more details:
https://www.postgresql.org/docs/current/btree.html
https://www.postgresql.org/docs/current/hash-index.html


Thank you.

Not yet confirmed, but actually somehow we see the DB crashed repetitively a few times and teammates suspecting the cause while it tried extending this hash index. Did you experience any such thing with hash index? However, as you mentioned ,if we have any column with large string/text values and we want it to be indexed then there is no choice but to go for a hash index. Please correct me if I'm wrong.

Re: Question on indexes

From
Erik Wienhold
Date:
On 2024-10-10 21:44 +0200, sud wrote:
> Not yet confirmed, but actually somehow we see the DB crashed repetitively
> a few times and teammates suspecting the cause while it tried extending
> this hash index.

Your first mail says that you're using version 15.4.  You should
consider upgrading to 15.8 to get the latest bugfixes.

> Did you experience any such thing with hash index?

No.  But I can't remember ever seeing a hash index in the databases that
I've worked on.

> However, as you mentioned ,if we have any column with large string/text
> values and we want it to be indexed then there is no choice but to go for a
> hash index. Please correct me if I'm wrong.

Define "large".

What kind of text (natural, JSON, XML, base64, DNA sequences, etc.) is
stored in those columns?  Why do you want/need to index those columns?
Because hash indexes only support the equal operator, one can only use
that index to search for exact matches (i.e. values with identical hash
code) which I find strange for values that are so large that btree
cannot be used.  But maybe you have solid use case for that.

If it's natural text and you're using tsvector for full-text search,
then GiST or GIN indexes are a better choice:
https://www.postgresql.org/docs/current/textsearch-indexes.html

-- 
Erik



Re: Question on indexes

From
"Efrain J. Berdecia"
Date:
They are extremely efficient for joins!!!


On Thu, Oct 10, 2024 at 2:52 PM, Christophe Pettus
<xof@thebuild.com> wrote:


> On Oct 10, 2024, at 11:49, sud <suds1434@gmail.com> wrote:
>
> Hi,
> I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL doesnt generate for "hash index" which means we can't get the hash index back after crash also they are not applied to replicas etc.


That's very old information.  Hash indexes are correctly WAL-logged since (IIRC) version 10.


Re: Question on indexes

From
Laurenz Albe
Date:
On Fri, 2024-10-11 at 00:19 +0530, sud wrote:
> I have never used any 'hash index' but saw documents in the past suggesting issues
> around hash index , like WAL doesnt generate for "hash index" which means we can't
> get the hash index back after crash also they are not applied to replicas etc.
> And also these indexes can not be used for range queries , for sorting etc.
>
> However, we are seeing that one of the databases has multiple hash indexes created.
> So I wanted to understand from experts here, if it's advisable in any specific
> scenarios over B-tre despite such downsides?
> Note- Its version 15.4 database.

It is safe to use them, but in my tests I didn't find a realistic case where the were
better than a B-tree index:
https://www.cybertec-postgresql.com/en/postgresql-hash-index-performance/

Keep them if they do the trick for you, but I'd use B-tree indexes instead.

Yours,
Laurenz Albe



Re: Question on indexes

From
Greg Sabino Mullane
Date:
if we have any column with large string/text values and we want it to be indexed then there is no choice but to go for a hash index. Please correct me if I'm wrong.

There are other strategies / solutions, but we would need to learn more about your use case.

Cheers,
Greg
 

Re: Question on indexes

From
Durgamahesh Manne
Date:


On Fri, Oct 11, 2024 at 5:00 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
if we have any column with large string/text values and we want it to be indexed then there is no choice but to go for a hash index. Please correct me if I'm wrong.

There are other strategies / solutions, but we would need to learn more about your use case.

Cheers,
Greg
 

Hi Respected Team

How do we enforce the secondary column of composite index to index scan on concurrent activity in postgres?  
Second column of composite index not in use effectively with index scan  when using second column at where clause

I have composite index on (placedon,id) of test 
When querying  select * from test where id = '4234';
Value of id changes and during concurrent activity and cpu utilization increased too much  that i have observed which means query plan changed why

I could see index scan with explain for it  on singal call or double calls

Is there any way to keep an index scan for it during concurrency rather than a separate index on the second column of the composite index ?


Regards,
Durga Mahesh  

Re: Question on indexes

From
Greg Sabino Mullane
Date:
(please start a new thread in the future rather than replying to an existing one)

You cannot query on b and use an index on (a,b) as you observed. However, you can have two indexes:

index1(a)
index2(b)

Postgres will be able to combine those when needed in the case where your WHERE clause needs to filter by both columns. So then you no longer need the two-column index.

Cheers,
Greg


Re: Question on indexes

From
Durgamahesh Manne
Date:


On Fri, Oct 11, 2024 at 6:18 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
(please start a new thread in the future rather than replying to an existing one)

You cannot query on b and use an index on (a,b) as you observed. However, you can have two indexes:

index1(a)
index2(b)

Postgres will be able to combine those when needed in the case where your WHERE clause needs to filter by both columns. So then you no longer need the two-column index.

Cheers,
Greg

 
Hi greg 

Mail sent you with a new thread. composite key is on partitioned table 

Regards,
Durga Mahesh