Re: tweaking NTUP_PER_BUCKET - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: tweaking NTUP_PER_BUCKET |
| Date | |
| Msg-id | 53BC3859.5080006@fuzzy.cz |
| In response to | Re: tweaking NTUP_PER_BUCKET (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: tweaking NTUP_PER_BUCKET |
| List | pgsql-hackers |
On 8.7.2014 19:00, Robert Haas wrote:
> On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> On 8 July 2014, 16:16, Robert Haas wrote:
>>
>>> Right, I think that's clear. I'm just pointing out that you get
>>> to decide: you can either start with a larger NTUP_PER_BUCKET and
>>> then reduce it if you have enough memory left, or you can start
>>> with a smaller NTUP_PER_BUCKET and then increase it if you run
>>> short of memory.
>>
>> I don't think those two approaches are equal.
>>
>> With the approach I took, I can use a compromise value (NTUP=4)
>> initially, and only resize the hash table once at the end (while
>> keeping the amount of memory under work_mem all the time).
>>
>> With the "NTUP=1 and increase in case of memory pressure" approach
>> you have to shrink the table immediately (to fit into work_mem),
>> and if the hash table gets split into multiple batches you're
>> suddenly in a situation with lower memory pressure and you may need
>> to increase it again.
>
> True. On the other hand, this really only comes into play when the
> estimates are wrong. If you know at the start how many tuples you're
> going to need to store and how big they are, you determine whether
> NTUP_PER_BUCKET=1 is going to work before you even start building
> the hash table. If it isn't, then you use fewer buckets right from
> the start. If we start by estimating a small value for
> NTUP_PER_BUCKET and then let it float upward if we turn out to have
> more tuples than expected, we're optimizing for the case where our
> statistics are right. If we start by estimating a larger value for
> NTUP_PER_BUCKET than what we think we need to fit within work_mem,
> we're basically guessing that our statistics are more likely to be
> wrong than to be right. I think.

Good point.

The first patch was targeted exactly at the wrongly estimated queries.
This patch attempts to apply the rehash to all plans, and maybe there's
a better way.

If the estimates are correct / not too far off, we can use this
information to do the sizing 'right' at the beginning (without facing
rehashes later).

Over-estimates are not a problem, because they won't make the hash
table slower (it'll just be sized for more tuples), and we can't change
the number of batches anyway. With under-estimates we have to decide
whether to resize the hash table or increase the number of batches.

In both cases that matter (correct estimates and under-estimates) we
have to decide whether to increase the number of buckets or the number
of batches. I'm not sure how to do that.

>> I wonder if this is really worth the effort - my guess is it's
>> efficient only if a large portion of the buckets is not visited (and
>> thus does not need to be split) at all. Not sure how common that is
>> (our workloads certainly are not like that).
>
> Yeah. It may be a bad idea. I threw it out there as a possible way of
> trying to mitigate the worst case, which is when you go to the
> trouble of building the hash table and then make very few probes. But
> that may not be worth the complexity that this would introduce.

Let's keep it simple for now. I think the sizing question (explained
above) is more important and needs to be solved first.

regards
Tomas
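To make the sizing trade-off in the message above concrete, here is a minimal C sketch. It is not taken from the patch or from PostgreSQL's nodeHash.c; every name (`choose_nbuckets`, `SKETCH_NTUP_PER_BUCKET`) and the one-pointer-per-bucket cost model are illustrative assumptions. It only shows the "size from the estimate, then fit into work_mem, then maybe rehash once at the end" idea; it does not attempt the open question of choosing between more buckets and more batches.

```c
/*
 * Hypothetical sketch of the sizing decision discussed above.
 * NOT the code from the patch or from nodeHash.c; all names and the
 * one-pointer-per-bucket cost model are illustrative assumptions.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_NTUP_PER_BUCKET 1   /* assumed target tuples per bucket */

/* Round up to the next power of two, as bucket counts usually are. */
static size_t
next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Pick nbuckets for one in-memory batch: start from the tuple estimate
 * and the target load factor, then shrink (accepting a higher load
 * factor) until the bucket array fits in whatever is left of work_mem
 * after the tuples themselves.
 */
static size_t
choose_nbuckets(double est_tuples, size_t tuple_width, size_t work_mem_bytes)
{
    size_t nbuckets = next_pow2((size_t) (est_tuples / SKETCH_NTUP_PER_BUCKET) + 1);
    size_t tuple_bytes = (size_t) (est_tuples * tuple_width);
    size_t bucket_budget = (work_mem_bytes > tuple_bytes) ?
        work_mem_bytes - tuple_bytes : 0;

    /* largest power-of-two bucket array that fits the remaining budget */
    size_t max_buckets = next_pow2(bucket_budget / sizeof(void *) + 1) / 2;

    if (max_buckets == 0)
        max_buckets = 1;
    while (nbuckets > max_buckets)
        nbuckets >>= 1;

    return nbuckets;
}

/*
 * After the build phase, a one-off rehash (the "resize once at the end"
 * above) only pays off if the actual tuple count drifted well past the
 * load factor the table was sized for.
 */
static bool
worth_rehashing(size_t actual_tuples, size_t nbuckets)
{
    return actual_tuples > (size_t) SKETCH_NTUP_PER_BUCKET * nbuckets * 2;
}

int
main(void)
{
    /* e.g. 10,000 estimated tuples of 64 bytes, 4 MB of work_mem */
    size_t nbuckets = choose_nbuckets(10000, 64, 4 * 1024 * 1024);

    printf("nbuckets = %zu, rehash with 50000 actual tuples: %s\n",
           nbuckets, worth_rehashing(50000, nbuckets) ? "yes" : "no");
    return 0;
}
```

Under these assumptions an over-estimate simply leaves the table with more buckets than it needs, while an under-estimate trips the rehash check; whether that memory would instead be better spent on additional batches is exactly the decision the message leaves open.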