Re: tweaking NTUP_PER_BUCKET - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: tweaking NTUP_PER_BUCKET |
| Date | |
| Msg-id | 53BC3859.5080006@fuzzy.cz |
| In response to | Re: tweaking NTUP_PER_BUCKET (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: tweaking NTUP_PER_BUCKET |
| List | pgsql-hackers |
On 8.7.2014 19:00, Robert Haas wrote:
> On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> On 8 July 2014, 16:16, Robert Haas wrote:
>>
>>> Right, I think that's clear. I'm just pointing out that you get
>>> to decide: you can either start with a larger NTUP_PER_BUCKET and
>>> then reduce it if you have enough memory left, or you can start
>>> with a smaller NTUP_PER_BUCKET and then increase it if you run
>>> short of memory.
>>
>> I don't think those two approaches are equal.
>>
>> With the approach I took, I can use a compromise value (NTUP=4)
>> initially, and only resize the hash table once at the end (while
>> keeping the amount of memory under work_mem all the time).
>>
>> With the "NTUP=1 and increase in case of memory pressure" approach
>> you have to shrink the table immediately (to fit into work_mem),
>> and if the hash table gets split into multiple batches you're
>> suddenly in a situation with lower memory pressure and you may need
>> to increase it again.
>
> True. On the other hand, this really only comes into play when the
> estimates are wrong. If you know at the start how many tuples you're
> going to need to store and how big they are, you determine whether
> NTUP_PER_BUCKET=1 is going to work before you even start building
> the hash table. If it isn't, then you use fewer buckets right from
> the start. If we start by estimating a small value for
> NTUP_PER_BUCKET and then let it float upward if we turn out to have
> more tuples than expected, we're optimizing for the case where our
> statistics are right. If we start by estimating a larger value for
> NTUP_PER_BUCKET than what we think we need to fit within work_mem,
> we're basically guessing that our statistics are more likely to be
> wrong than to be right. I think.

Good point.

The first patch was targeted exactly at the wrongly estimated queries.
This patch attempts to apply the rehash to all plans, and maybe there's
a better way.

If the estimates are correct / not too far off, we can use this
information to do the sizing 'right' at the beginning (without facing
rehashes later).

Over-estimates are not a problem, because they won't make the hash
table slower (it'll just be sized for more tuples), and we can't change
the number of batches anyway. With under-estimates we have to decide
whether to resize the hash table or increase the number of batches.

In both cases that matter (correct estimates and under-estimates) we
have to decide whether to increase the number of buckets or the number
of batches. I'm not sure how to do that.

>> I wonder if this is really worth the effort - my guess is it's
>> efficient only if a large portion of the buckets is not visited (and
>> thus does not need to be split) at all. Not sure how common that is
>> (our workloads certainly are not like that).
>
> Yeah. It may be a bad idea. I threw it out there as a possible way of
> trying to mitigate the worst case, which is when you go to the
> trouble of building the hash table and then make very few probes. But
> that may not be worth the complexity that this would introduce.

Let's keep it simple for now. I think the sizing question (explained
above) is more important and needs to be solved first.

regards
Tomas
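To make the sizing trade-off in the message above concrete, here is a minimal C sketch. It is not taken from the patch or from PostgreSQL's nodeHash.c; every name (`choose_nbuckets`, `SKETCH_NTUP_PER_BUCKET`) and the one-pointer-per-bucket cost model are illustrative assumptions. It only shows the "size from the estimate, then fit into work_mem, then maybe rehash once at the end" idea; it does not attempt the open question of choosing between more buckets and more batches.

```c
/*
 * Hypothetical sketch of the sizing decision discussed above.
 * NOT the code from the patch or from nodeHash.c; all names and the
 * one-pointer-per-bucket cost model are illustrative assumptions.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_NTUP_PER_BUCKET 1   /* assumed target tuples per bucket */

/* Round up to the next power of two, as bucket counts usually are. */
static size_t
next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Pick nbuckets for one in-memory batch: start from the tuple estimate
 * and the target load factor, then shrink (accepting a higher load
 * factor) until the bucket array fits in whatever is left of work_mem
 * after the tuples themselves.
 */
static size_t
choose_nbuckets(double est_tuples, size_t tuple_width, size_t work_mem_bytes)
{
    size_t nbuckets = next_pow2((size_t) (est_tuples / SKETCH_NTUP_PER_BUCKET) + 1);
    size_t tuple_bytes = (size_t) (est_tuples * tuple_width);
    size_t bucket_budget = (work_mem_bytes > tuple_bytes) ?
        work_mem_bytes - tuple_bytes : 0;

    /* largest power-of-two bucket array that fits the remaining budget */
    size_t max_buckets = next_pow2(bucket_budget / sizeof(void *) + 1) / 2;

    if (max_buckets == 0)
        max_buckets = 1;
    while (nbuckets > max_buckets)
        nbuckets >>= 1;

    return nbuckets;
}

/*
 * After the build phase, a one-off rehash (the "resize once at the end"
 * above) only pays off if the actual tuple count drifted well past the
 * load factor the table was sized for.
 */
static bool
worth_rehashing(size_t actual_tuples, size_t nbuckets)
{
    return actual_tuples > (size_t) SKETCH_NTUP_PER_BUCKET * nbuckets * 2;
}

int
main(void)
{
    /* e.g. 10,000 estimated tuples of 64 bytes, 4 MB of work_mem */
    size_t nbuckets = choose_nbuckets(10000, 64, 4 * 1024 * 1024);

    printf("nbuckets = %zu, rehash with 50000 actual tuples: %s\n",
           nbuckets, worth_rehashing(50000, nbuckets) ? "yes" : "no");
    return 0;
}
```

Under these assumptions an over-estimate simply leaves the table with more buckets than it needs, while an under-estimate trips the rehash check; whether that memory would instead be better spent on additional batches is exactly the decision the message leaves open.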