Thread: [PATCH] btree_gist: fix union implementation for variable length columns
Hi hackers, while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed that gbt_var_union() mistreats the first vector element. Patch is attached. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1544349 Pavel
Attachment
Pavel Raiskup <praiskup@redhat.com> writes: > while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed > that gbt_var_union() mistreats the first vector element. Patch is attached. Hi Pavel! For patches that purport to resolve bugs, we usually like to add a regression test case that demonstrates the bug in unpatched code. Can you provide a small test case that does so? (The BZ you pointed to doesn't seem to address this...) regards, tom lane
Re: [PATCH] btree_gist: fix union implementation for variable length columns
From
Pavel Raiskup
Date:
Hi Tom, On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote: > Pavel Raiskup <praiskup@redhat.com> writes: > > while I tried to debug 'gcc -fstack-protector -O3' problems in [1], I noticed > > that gbt_var_union() mistreats the first vector element. Patch is attached. > > Hi Pavel! For patches that purport to resolve bugs, we usually like to > add a regression test case that demonstrates the bug in unpatched code. > Can you provide a small test case that does so? (The BZ you pointed to > doesn't seem to address this...) I've been looking at the reproducer for a while now... and I probably need a hint if that's necessary. Turns out the problem is only related to bit/bit varying type (so the patch comments need to be reworded properly, at least) since those are the only types which have implemented the f_l2n() callback. Considering testcase 'bit.sql' (bit(33) type), it's pretty clear that on little endian boxes the problematic strings would be '00100001*' (int32 value for '33', because there's varbit header). Big endian boxes would have problems with char[] like {0, 0, 0, 33, ...}. But I fail to construct right order of correctly picked elements into 'bit.data'. The problem probably is (a) that picksplit reorders the elements very often, and probably that (b) random() brings non-determinism into the final tree shape (if the reproducer was to be based on many equal keys in the index). It's amazing to see how the index fixes itself :-) for this bug. Can you help me with the reproducer idea, resp. can we live without the reproducer in this particular case? Pavel
Pavel Raiskup <praiskup@redhat.com> writes: > On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote: >> Hi Pavel! For patches that purport to resolve bugs, we usually like to >> add a regression test case that demonstrates the bug in unpatched code. >> Can you provide a small test case that does so? (The BZ you pointed to >> doesn't seem to address this...) > Turns out the problem is only related to bit/bit varying type (so the > patch comments need to be reworded properly, at least) since those are the > only types which have implemented the f_l2n() callback. What I'm failing to wrap my head around is why this code exists at all. AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to an INTALIGN boundary, which it wouldn't necessarily be otherwise. But why bother? That should have no effect on the behavior of bit_cmp(). So I'm speculating that the reason nobody has noticed a problem is that there is no problem because this code is useless and could be ripped out. It would be easier to figure this out if the btree_gist code weren't so desperately undocumented. Teodor, do you remember why it's like this? regards, tom lane
Re: [PATCH] btree_gist: fix union implementation for variable length columns
From
Pavel Raiskup
Date:
On Wednesday, July 11, 2018 7:26:40 PM CEST Tom Lane wrote: > Pavel Raiskup <praiskup@redhat.com> writes: > > On Monday, July 9, 2018 7:41:59 PM CEST Tom Lane wrote: > >> Hi Pavel! For patches that purport to resolve bugs, we usually like to > >> add a regression test case that demonstrates the bug in unpatched code. > >> Can you provide a small test case that does so? (The BZ you pointed to > >> doesn't seem to address this...) > > > Turns out the problem is only related to bit/bit varying type (so the > > patch comments need to be reworded properly, at least) since those are the > > only types which have implemented the f_l2n() callback. > > What I'm failing to wrap my head around is why this code exists at all. > AFAICS, gbt_bit_xfrm just forces the bitstring to be zero-padded out to > an INTALIGN boundary, which it wouldn't necessarily be otherwise. > But why bother? That should have no effect on the behavior of bit_cmp(). The gbt_bit_xfrm() method actually also skips the varbit header (number of bits, stored in VarBit.bit_len), the magic is done by VARBITS macro. If the header is dropped, it's OK to just use existing byteacmp() for bit array comparison. Pavel
Re: [PATCH] btree_gist: fix union implementation for variable lengthcolumns
From
Teodor Sigaev
Date:
> It would be easier to figure this out if the btree_gist code weren't > so desperately undocumented. Teodor, do you remember why it's like > this? Will look. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Re: [PATCH] btree_gist: fix union implementation for variable length columns
From
Pavel Raiskup
Date:
On Thursday, July 12, 2018 5:53:18 PM CET Teodor Sigaev wrote: > > It would be easier to figure this out if the btree_gist code weren't > > so desperately undocumented. Teodor, do you remember why it's like > > this? > > Will look. Ping on this issue. I guess the patch I proposed isn't wrong in this case, I'm just not sure about the commit message. Pavel