(Starting a new thread to not distract from the original)
In the plan advice patch, Robert noted (for the archives, see [1]):
On Tue, Jan 13, 2026 at 10:09 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > + /*
> > + * hashfn_unstable.h recommends using string length as tweak. It's not
> > + * clear to me what to do if there are multiple strings, so for now I'm
> > + * just using the total of all of the lengths.
> > + */
> > + return fasthash_final32(&hs, sp_len);
> Some kind of comment change here seems useful to me. I wonder whether
> it should be generalized even more than this statement. I also wonder
> if this is really the optimal strategy. But I definitely agree that
> clarifying this in whatever way makes sense is a good idea.
As for the optimal strategy, one candidate is to combine the lengths
so that different sets of lengths give different tweak values, like this:
lengths = len1 + (len2 << 10) + (len3 << 20);
hashcode = fasthash_final32(&hs, lengths);
In the attached patch, I added this as "probably safest", since it is
only an educated guess.
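To illustrate the difference between the two ways of combining lengths,
here is a minimal standalone sketch. It is not taken from the patch:
sum_lengths(), pack_lengths(), and LEN_BITS are hypothetical names used
only for this example. The packed form stays unique only while each
length fits in 10 bits (i.e. is below 1024), which the sketch asserts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not part of hashfn_unstable.h. */
#define LEN_BITS 10
#define LEN_MAX  ((UINT64_C(1) << LEN_BITS) - 1)

/* Tweak as the plain total of all lengths: (3, 5) and (5, 3) collide. */
static uint64_t
sum_lengths(size_t len1, size_t len2, size_t len3)
{
	return (uint64_t) len1 + (uint64_t) len2 + (uint64_t) len3;
}

/*
 * Tweak with each length in its own 10-bit field.  Distinct triples map
 * to distinct values, but only as long as every length is <= LEN_MAX;
 * larger lengths would spill into the neighboring field.
 */
static uint64_t
pack_lengths(size_t len1, size_t len2, size_t len3)
{
	assert(len1 <= LEN_MAX && len2 <= LEN_MAX && len3 <= LEN_MAX);

	return (uint64_t) len1 +
		((uint64_t) len2 << LEN_BITS) +
		((uint64_t) len3 << (2 * LEN_BITS));
}
```

For example, the length pairs (3, 5) and (5, 3) both sum to 8, but they
pack to 5123 and 3077 respectively. Either result would then be passed
as the tweak argument to fasthash_final32().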
I also generalized the comments to cover variable-length inputs where possible.
Lastly, some other comments were outdated or could use better organization.
[1] https://www.postgresql.org/message-id/CA%2BTgmoa6iQAz-RmAd3tWc%3DRgr6beZDetZuA7o298tn%3D6prLhsA%40mail.gmail.com
--
John Naylor
Amazon Web Services