Re: Macro customizable hashtable / bitmapscan & aggregation perf - Mailing list pgsql-hackers
From | Tomas Vondra
Subject | Re: Macro customizable hashtable / bitmapscan & aggregation perf
Date |
Msg-id | ec2f8c1d-329b-e7b6-1406-f74af789c1f0@2ndquadrant.com
In response to | Re: Macro customizable hashtable / bitmapscan & aggregation perf (Andres Freund <andres@anarazel.de>)
Responses | Re: Macro customizable hashtable / bitmapscan & aggregation perf
          | Re: Macro customizable hashtable / bitmapscan & aggregation perf
List | pgsql-hackers
On 10/02/2016 02:17 AM, Andres Freund wrote:
> Hi,
>
> ...
>
>>>> I think a crucial part of the benchmarking will be identifying and
>>>> measuring corner cases, e.g. a lot of rows with the same key, etc.
>>>> Although that probably is not a major issue for the two places
>>>> switched to the new implementation (e.g. execGrouping merges the
>>>> duplicates to a single group, by definition).
>>>
>>> Hm. I don't really see a case where that's going to cause issues - all
>>> the execGrouping.c users only store one key, and either ignore later
>>> ones, or add the data to the initial tuple in the group. I don't really
>>> see any case where repeated keys should cause an issue for a hash table?
>>
>> Yep, that's pretty much what I suggested. I was wondering more about the
>> other places where this hash table might be used - I've been thinking about
>> hashjoins, but now that I think of it - that's a fairly specialized and
>> tuned code. In any case, let's not complicate the discussion for now.
>
> My point is that I don't really see any potential usecase where
> that's an issue.

OK, I think we agree the two places modified by the submitted patch series
are fine. Let's leave discussion about places modified by future patches for
the future ;-)

>> A few comments, after quickly skimming through the first two patches:
>>
>> 1) SH_CREATE does this:
>>
>>     /* round up size to the next power of 2, eases lookups */
>>     if (size < 2)
>>         size = 2;
>>     else
>>         size = sh_pow2_int(size);
>>
>> I very much doubt a hash table with 2 buckets is very useful. What about
>> using some reasonable lower boundary - IIRC hashjoins use 1024 buckets,
>> which might be a bit too much, but perhaps 256 would work?
>
> For some cases that'll waste a good bit of space. Consider
> e.g. catcache.c. Callers can (and do) specify more appropriate sizes for
> the individual uses.

Hmmm, OK. I have my doubts about those hardcoded constants, but you're
right, 256 is probably overkill.

>> 5) SH_RESIZE
>>
>> Do we expect to shrink hash tables?
>
> Not at the moment. Should be doable, but I'm not sure it's worth
> developing and testing - I don't see any usecases in pg atm.

OK, sure. Then what about adding this to SH_RESIZE?

    Assert(oldsize <= newsize);

I assumed we might shrink the catcache after invalidation, for example (but
maybe I don't really know how that works).

>> It's not entirely clear why it is guaranteed that there's always
>> an element with optimal position (when load factor < 1)? Adding an
>> explanation or a link to a paper would be helpful, I guess.
>
> As the load factor always has to be below 1, there always has to be
> an empty bucket. Every bucket after an empty bucket has to be at
> its optimal position.

Hmmm, thanks to moving the entries after delete? Adding an assert() enforcing
this seems like a good idea (a rough sketch of such a check is below).

>> 7) Should hash agg size estimation for the planner consider the fillfactor?
>>
>> I think we should account for fillfactor - we should account for the
>> allocated memory, even if there's free space in the hash table. After all,
>> that's what hashjoins do.
>
> We didn't really for dynahash (as it basically assumed a fillfactor of 1
> all the time), not sure if changing it won't also cause issues.

That's a case of glass half-full vs. half-empty, I guess. If we assumed load
factor 1.0, then I see it as accounting for the load factor (while you see it
as not accounting for it).
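Coming back to the assert I suggested above for the "every bucket after an
empty bucket is at its optimal position" invariant, here is a toy sketch of
such a check. The ToyEntry/ToyTable types and helper names are illustrative
stand-ins only, not simplehash's actual macro-generated code; the point is
just that with linear probing and backward-shift deletion, an occupied bucket
directly following an empty one must have displacement zero.

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Toy stand-ins for the macro-generated hash table types (illustration only). */
    typedef struct ToyEntry
    {
        uint32_t    hash;
        bool        in_use;
    } ToyEntry;

    typedef struct ToyTable
    {
        uint32_t    size;       /* number of buckets, always a power of two */
        ToyEntry   *data;
    } ToyTable;

    /* Bucket an entry with this hash would ideally occupy. */
    static uint32_t
    optimal_bucket(const ToyTable *tb, uint32_t hash)
    {
        return hash & (tb->size - 1);
    }

    /*
     * Verify that every occupied bucket immediately following an empty one
     * is at its optimal position (displacement 0).  With load factor < 1
     * there is always at least one empty bucket, so the check covers the
     * whole table.
     */
    static void
    check_optimal_after_empty(const ToyTable *tb)
    {
        for (uint32_t i = 0; i < tb->size; i++)
        {
            uint32_t    next = (i + 1) & (tb->size - 1);

            if (!tb->data[i].in_use && tb->data[next].in_use)
                assert(optimal_bucket(tb, tb->data[next].hash) == next);
        }
    }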
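And for the fillfactor question, the kind of accounting I mean is roughly the
following (only a sketch with placeholder names, not code from the patch):
size the table at nentries / fillfactor buckets and charge work_mem for the
allocated bucket array, not just the live entries.

    #include <stddef.h>

    /*
     * Sketch of a planner-side estimate that includes the fillfactor
     * (placeholder names, illustration only).  With fillfactor 0.8 this
     * comes out roughly 25% above an "entries only" estimate.
     */
    static size_t
    estimated_hashtable_size(double nentries, size_t entrysize, double fillfactor)
    {
        double      nbuckets = nentries / fillfactor;

        return (size_t) (nbuckets * entrysize);
    }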
I don't see why this should cause issues - of course, the hash table may be
a bit larger, exceed work_mem and make the tidbitmap lossy, or switch
HashAggregate to GroupAggregate. But I don't think that's a major issue, as
those decisions depend on estimates anyway, so we can't really guarantee
them.

Also, with load factor 0.8 the table is only ~25% larger, so even if we
don't include that in work_mem it's a reasonably small error (easily smaller
than the errors in the pre-9.5 HashJoin accounting, which did not include
chunk headers etc.).

So I won't fight for this, but I don't see why not to account for it.

regards

--
Tomas Vondra                      http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services