Update some comments for fasthash - Mailing list pgsql-hackers

From John Naylor
Subject Update some comments for fasthash
Msg-id CANWCAZa-2mEUY27xBw2TpsybpvVu3Ez4ABrHCBqZpAs_UDTj2Q@mail.gmail.com
List pgsql-hackers
(Starting a new thread to not distract from the original)

In the plan advice patch, Robert noted (for the archives, see [1]):

On Tue, Jan 13, 2026 at 10:09 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > + /*
> > + * hashfn_unstable.h recommends using string length as tweak. It's not
> > + * clear to me what to do if there are multiple strings, so for now I'm
> > + * just using the total of all of the lengths.
> > + */
> > + return fasthash_final32(&hs, sp_len);

> Some kind of comment change here seems useful to me. I wonder whether
> it should be generalized even more than this statement. I also wonder
> if this is really the optimal strategy. But I definitely agree that
> clarifying this in whatever way makes sense is a good idea.

As for the optimal strategy, one candidate is to combine the lengths in
a way that keeps distinct combinations distinct, like this:

lengths = len1 + (len2 << 10) + (len3 << 20);
hashcode = fasthash_final32(&hs, lengths);

In the attached patch, I added this as "probably safest", since it is
only an educated guess.
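
To illustrate the difference between summing the lengths (as in the quoted
comment) and the shifted form, here is a standalone sketch. It is not taken
from the patch: the helper names tweak_sum and tweak_packed are hypothetical,
and it assumes three inputs whose lengths each fit in 10 bits (below 1024).
Longer inputs would make the fields overlap, so uniqueness would no longer
hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: tweak computed as the plain total of the lengths. */
uint64_t
tweak_sum(size_t len1, size_t len2, size_t len3)
{
    return (uint64_t) len1 + (uint64_t) len2 + (uint64_t) len3;
}

/*
 * Hypothetical helper: give each length its own 10-bit field, so that
 * different (len1, len2, len3) triples always yield different tweaks.
 * Assumption: every length is below 1024.
 */
uint64_t
tweak_packed(size_t len1, size_t len2, size_t len3)
{
    assert(len1 < 1024 && len2 < 1024 && len3 < 1024);

    return (uint64_t) len1 +
        ((uint64_t) len2 << 10) +
        ((uint64_t) len3 << 20);
}
```

With lengths (3, 5, 0) and (5, 3, 0), tweak_sum returns the same value for
both, while tweak_packed returns different values. Either result would then
be passed as the second argument to fasthash_final32().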

I also generalized the comments to cover variable-length inputs where possible.

Lastly, some other comments were outdated or could use better organization.

[1] https://www.postgresql.org/message-id/CA%2BTgmoa6iQAz-RmAd3tWc%3DRgr6beZDetZuA7o298tn%3D6prLhsA%40mail.gmail.com

--
John Naylor
Amazon Web Services

