Re: [HACKERS] path toward faster partition pruning - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: [HACKERS] path toward faster partition pruning |
Date | |
Msg-id | CAKJS1f-1Lc_b=Y1iicPQzvUgSn1keHSgmRqLuOGq_VR6M==zbw@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] path toward faster partition pruning (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Responses |
Re: [Sender Address Forgery]Re: [HACKERS] path toward fasterpartition pruning
Re: [HACKERS] path toward faster partition pruning |
List | pgsql-hackers |
On 13 November 2017 at 22:46, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/11/10 12:30, Kyotaro HORIGUCHI wrote: >> In 0002, bms_add_range has a bit naive-looking loop >> >> + while (wordnum <= uwordnum) >> + { >> + bitmapword mask = (bitmapword) ~0; >> + >> + /* If working on the lower word, zero out bits below 'lower'. */ >> + if (wordnum == lwordnum) >> + { >> + int lbitnum = BITNUM(lower); >> + mask >>= lbitnum; >> + mask <<= lbitnum; >> + } >> + >> + /* Likewise, if working on the upper word, zero bits above 'upper' */ >> + if (wordnum == uwordnum) >> + { >> + int ushiftbits = BITS_PER_BITMAPWORD - (BITNUM(upper) + 1); >> + mask <<= ushiftbits; >> + mask >>= ushiftbits; >> + } >> + >> + a->words[wordnum++] |= mask; >> + } >> >> Without some aggressive optimization, the loop takes most of the >> time to check-and-jump for nothing especially with many >> partitions and somewhat unintuitive. >> >> The following uses a bit tricky bitmap operation but >> is straightforward as a whole. >> >> ===== >> /* fill the bits upper from BITNUM(lower) (0-based) of the first word */ >> a->workds[wordnum++] += ~(bitmapword)((1 << BITNUM(lower)) - 1); >> >> /* fill up intermediate words */ >> while (wordnum < uwordnum) >> a->words[wordnum++] = ~(bitmapword) 0; >> >> /* fill up to BITNUM(upper) bit (0-based) of the last word */ >> a->workds[wordnum++] |= >> (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - (BITNUM(upper) - 1)); >> ===== > > Considering also the David's comment downthread, I will try to incorporate > this into bms_add_range(). I've attached an implementation of the patch using this method. I've also attached bitset.c which runs each through their paces. I'd have expected Kyotaro's method to be faster, but gcc 7.2 with -O2 generates very slightly slower code. I didn't really check why. clang seems to do a better job with it. $ gcc -O2 bitset.c -o bitset && ./bitset bms_add_range in 0.694254 (6.94254 ns per loop) bms_add_range2 in 0.726643 (7.26643 ns per loop) 11111111111111111111111111111110 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 ------------- 11111111111111111111111111111110 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 $ gcc --version gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0 Copyright (C) 2017 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ clang -O2 bitset.c -o bitset && ./bitset bms_add_range in 0.866554 (8.66554 ns per loop) bms_add_range2 in 0.467138 (4.67138 ns per loop) 11111111111111111111111111111110 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 ------------- 11111111111111111111111111111110 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111 $ clang --version clang version 4.0.1-6 (tags/RELEASE_401/final) Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/bin Probably just go with Kyotaro's idea (v2). I don't think this is worth debating, I just wanted to show it's not that clear-cut. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
pgsql-hackers by date: