Re: Use compiler intrinsics for bit ops in hash - Mailing list pgsql-hackers
From | David Fetter |
---|---|
Subject | Re: Use compiler intrinsics for bit ops in hash |
Date | |
Msg-id | 20200114220917.GG32763@fetter.org Whole thread Raw |
In response to | Re: Use compiler intrinsics for bit ops in hash (Jesse Zhang <sbjesse@gmail.com>) |
Responses |
Re: Use compiler intrinsics for bit ops in hash
Re: Use compiler intrinsics for bit ops in hash |
List | pgsql-hackers |
On Tue, Jan 14, 2020 at 12:21:41PM -0800, Jesse Zhang wrote: > Hi David, > > On Tue, Jan 14, 2020 at 9:36 AM David Fetter <david@fetter.org> wrote: > > > > Folks, > > > > The recent patch for distinct windowing aggregates contained a partial > > fix of the FIXME that didn't seem entirely right, so I extracted that > > part, changed it to use compiler intrinsics, and submit it here. > > The changes in hash AM and SIMPLEHASH do look like a net positive > improvement. My biggest cringe might be in pg_bitutils: Thanks for looking at this! > > diff --git a/src/include/port/pg_bitutils.h b/src/include/port/pg_bitutils.h > > index 498e532308..cc9338da25 100644 > > --- a/src/include/port/pg_bitutils.h > > +++ b/src/include/port/pg_bitutils.h > > @@ -145,4 +145,32 @@ pg_rotate_right32(uint32 word, int n) > > return (word >> n) | (word << (sizeof(word) * BITS_PER_BYTE - n)); > > } > > > > +/* ceil(lg2(num)) */ > > +static inline uint32 > > +ceil_log2_32(uint32 num) > > +{ > > + return pg_leftmost_one_pos32(num-1) + 1; > > +} > > + > > +static inline uint64 > > +ceil_log2_64(uint64 num) > > +{ > > + return pg_leftmost_one_pos64(num-1) + 1; > > +} > > + > > +/* calculate first power of 2 >= num > > + * per https://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2 > > + * using BSR where available */ > > +static inline uint32 > > +next_power_of_2_32(uint32 num) > > +{ > > + return ((uint32) 1) << (pg_leftmost_one_pos32(num-1) + 1); > > +} > > + > > +static inline uint64 > > +next_power_of_2_64(uint64 num) > > +{ > > + return ((uint64) 1) << (pg_leftmost_one_pos64(num-1) + 1); > > +} > > + > > #endif /* PG_BITUTILS_H */ > > > > 1. Is ceil_log2_64 dead code? Let's call it nascent code. I suspect there are places it could go, if I look for them. Also, it seemed silly to have one without the other. > 2. The new utilities added here (ceil_log2_32 and company, > next_power_of_2_32 and company) all require num > 1, but don't clearly > Assert (or at the very least document) so. Assert()ed. > 3. A couple of the callers can actively pass in an argument of 1, e.g. > from _hash_spareindex in hashutil.c, while some other callers are iffy > at best (simplehash.h maybe?) What would you recommend be done about this? > 4. It seems like you *really* would like an operation like LZCNT in x86 > (first appearing in Haswell) that is well defined on zero input. ISTM > the alternatives are: > > a) Special case 1. That seems straightforward, but the branching cost > on a seemingly unlikely condition seems to be a lot of performance > loss > > b) Use architecture specific intrinsic (and possibly with CPUID > shenanigans) like __builtin_ia32_lzcnt_u64 on x86 and use the CLZ > intrinsic elsewhere. The CLZ GCC intrinsic seems to map to > instructions that are well defined on zero in most ISA's other than > x86, so maybe we can get away with special-casing x86? b) seems much more attractive. Is there some way to tilt the tools so that this happens? What should I be reading up on? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
pgsql-hackers by date: