Re: Remove traces of long in dynahash.c - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Remove traces of long in dynahash.c
Date
Msg-id CAEZATCUdNY8LBRJ3_7DL4Hw0TJvPd_-nurtrH=f6i7cvj2xjOg@mail.gmail.com
Whole thread Raw
In response to Re: Remove traces of long in dynahash.c  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Remove traces of long in dynahash.c
List pgsql-hackers
On Tue, 9 Sept 2025 at 04:58, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Sep 05, 2025 at 10:40:46AM +0100, Dean Rasheed wrote:
> > Alternatively, why not just impose the upper bound at the call sites
> > if needed, so there may be no need for these functions at all. For
> > example, looking at nodeHash.c, it would seem much more logical to
> > have ExecChooseHashTableSize() put an upper bound on nbuckets, rather
> > than capping log2_nbuckets after nbuckets has been chosen, risking
> > them getting out-of-sync.
>
> Yep, that may be the best course of action.  As far as I can see, this
> is capped by palloc() and HashJoinTuple, so we should be OK with
> putting a hard limit at (INT_MAX / 2) and call it a day, I guess?

+1

In ExecChooseHashTableSize():

+    /*
+     * Cap nbuckets, for power of 2 calculations.  This maximum is safe
+     * for pg_ceil_log2_32().
+     */
+    if (nbuckets > PG_INT32_MAX / 2)
+        nbuckets = PG_INT32_MAX / 2;

That comment is not really right, because pg_ceil_log2_32() can accept
larger inputs than that. But in case, that cap is wrong because
nbuckets should always be a power of 2, and PG_INT32_MAX / 2 is 2^30 -
1.

Looking more closely, I don't think a cap is needed at all, given the
prior computations. Further up, there's this code:

    /* Also ensure we avoid integer overflow in nbatch and nbuckets */
    /* (this step is redundant given the current value of MaxAllocSize) */
    max_pointers = Min(max_pointers, INT_MAX / 2 + 1);

and in practice, it's constrained to be much less than that, based on
this earlier code:

    max_pointers = Min(max_pointers, MaxAllocSize / sizeof(HashJoinTuple));

So I think in theory that ensures that nbuckets can never get anywhere
near overflowing a 32-bit integer. Given that nbuckets is a 32-bit
signed integer, and it is a power of 2, it is automatically less than
or equal to 2^30, or else if somehow there was an error in the
preceding logic and an attempt had been made to make it larger than
that, integer wrap-around would have occurred (e.g., it might have
become -2^31), in which case the "Assert(nbuckets > 0)" would trap it.

So I think there's no point in adding that cap, or any additional
checks in ExecChooseHashTableSize().

Regards,
Dean



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Issue with logical replication slot during switchover
Next
From: Dean Rasheed
Date:
Subject: Re: [PATCH] Generate random dates/times in a specified range