Re: Fix overflow of nbatch - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Fix overflow of nbatch
Date
Msg-id 5e073665-a73b-42c3-a483-253ab7a6fd87@vondra.me
In response to Re: Fix overflow of nbatch  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Fix overflow of nbatch
List pgsql-hackers

On 10/8/25 21:16, Melanie Plageman wrote:
> On Wed, Oct 8, 2025 at 1:37 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>>
>> I have updated my patch to fix the mistakes above. I also noticed then
>> that I wasn't doubling space_allowed in the loop but instead setting
>> it to hash_table_bytes at the end. This doesn't produce a power of 2
>> because we subtract skew_mcvs from the hash_table_bytes. So, we have
>> to keep using space_allowed if we want a power of 2 in the end.
>>
>> I've changed my patch to do this, but this made me wonder if we want
>> to be doing this or instead take hash_table_bytes at the end and round
>> it up to a power of 2 and set space_allowed to that. If the skew
>> hashtable is large, we may be allocating way more space_allowed than
>> we need for new hash_table_bytes + skew hashtable buckets.
> 

I don't think there's any promise of hash_table_bytes being a power of 2.
You can make hash_table_bytes an almost arbitrary value by setting
work_mem and hash_mem_multiplier. Or am I missing something?

But you're right that hash_table_bytes and space_allowed may not be equal
if useskew=true. So setting space_allowed to hash_table_bytes at the end
does not seem right. I think we don't actually need hash_table_bytes at
this point; we can just ignore it and use/double *space_allowed.
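
To make the idea concrete, here is a minimal sketch of a balancing loop
that works only through the space_allowed pointer. It is not the actual
ExecChooseHashTableSize() code: rebalance_batches() is a hypothetical
helper, BLCKSZ = 8192 is the assumed default block size, and charging two
block-sized file buffers per batch is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>

#define BLCKSZ 8192				/* assumption: default block size */

/*
 * Hypothetical helper: halve nbatch and double *space_allowed for as long
 * as that lowers the total memory (hash table plus per-batch file buffers).
 */
static void
rebalance_batches(size_t *space_allowed, size_t *nbatch)
{
	assert(space_allowed != NULL && nbatch != NULL);

	while (*nbatch > 1)
	{
		/* memory used with the current number of batches */
		size_t		current_space = *space_allowed + 2 * (*nbatch) * BLCKSZ;

		/* memory used with half the batches and twice the hash table */
		size_t		new_space = *space_allowed * 2 + (*nbatch) * BLCKSZ;

		/* stop once halving nbatch no longer reduces memory */
		if (current_space < new_space)
			break;

		*nbatch /= 2;
		*space_allowed *= 2;
	}
}
```

Because the loop only ever doubles *space_allowed, a power-of-2 starting
value stays a power of 2, whether or not the skew hashtable took its share
out of hash_table_bytes earlier.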

I kept using hash_table_bytes mostly because it didn't require the
pointer dereferencing, but I failed to consider the useskew=true thing.

However, this means there's probably a bug - the loop should probably
double num_skew_mcvs too. We simply reserve SKEW_HASH_MEM_PERCENT of
space_allowed for the skew hashtable, so should we adjust it the same way?
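
A small sketch of why the skew reservation would go stale if only
space_allowed is doubled. skew_mcvs_for() and the bytes_per_mcv parameter
are hypothetical names for illustration, and SKEW_HASH_MEM_PERCENT = 2 is
an assumed value, not something this message states.

```c
#include <stddef.h>

#define SKEW_HASH_MEM_PERCENT 2	/* assumption: percent reserved for skew */

/*
 * Hypothetical helper: number of MCV entries that fit into the share of
 * space_allowed reserved for the skew hashtable.
 */
static size_t
skew_mcvs_for(size_t space_allowed, size_t bytes_per_mcv)
{
	size_t		mcvs = space_allowed / bytes_per_mcv;

	return (mcvs * SKEW_HASH_MEM_PERCENT) / 100;
}
```

Under these assumptions, a num_skew_mcvs computed before the loop still
reflects the original space_allowed, so every doubling inside the loop
leaves the reservation further below SKEW_HASH_MEM_PERCENT of the final
value unless it is doubled or recomputed as well.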

> Oh wait, that doesn't make sense because each batch could have a skew hashtable.
> 

Not sure I understand. Is this the same issue I just described?

regards

-- 
Tomas Vondra



