On Fri, Sep 05, 2025 at 10:40:46AM +0100, Dean Rasheed wrote:
> Alternatively, why not just impose the upper bound at the call sites
> if needed, so there may be no need for these functions at all. For
> example, looking at nodeHash.c, it would seem much more logical to
> have ExecChooseHashTableSize() put an upper bound on nbuckets, rather
> than capping log2_nbuckets after nbuckets has been chosen, risking
> them getting out-of-sync.
Yep, that may be the best course of action. As far as I can see,
nbuckets is capped by palloc() and HashJoinTuple, so we should be OK
with putting a hard limit at (INT_MAX / 2) and calling it a day, I guess?
There are two other call sites of my_log2(). One is in worker.c, for
the number of subxacts, which relies on int32. The other is in
nodeAgg.c, capped at HASHAGG_MAX_PARTITIONS (1024).
With the attached, dynahash.h can be removed, which is the minimal
goal I had in mind. I am not sure we need to tweak dynahash.c any
further, as we've relied on long in this file for many years. We
could bite the bullet and do it, of course, but I am not convinced it
is worth it. So I would be happy with only the attached changes.
What do you think?
--
Michael