Re: Re: BIT/BIT VARYING status - Mailing list pgsql-hackers
From | Adriaan Joubert |
---|---|
Subject | Re: Re: BIT/BIT VARYING status |
Date | |
Msg-id | 3A05AC69.5066EE5D@albourne.com Whole thread Raw |
In response to | Re: Re: BIT/BIT VARYING status (Peter Eisentraut <peter_e@gmx.net>) |
Responses |
Re: Re: BIT/BIT VARYING status
Re: Re: BIT/BIT VARYING status |
List | pgsql-hackers |
Peter, I've looked at the current implementation of the bit types and still have some doubts concerning the following issues: 1. Constants. The current behaviour just seems somewhat strange, and I have no idea where to fix it. test=# select B'1001'; ?column? ---------- X9 (1 row) test=# select B'1001'::bit; ERROR: Cannot cast this expression to type 'bit' test=# select B'1001'::varbit; ERROR: Cannot cast this expression to type 'varbit' test=# select 'B1001'::varbit; ?column? ---------- B1001 (1 row) test=# select 'B1001'::bit; ?column? ---------- X9 (1 row) test=# select X'1001'::varbit; ERROR: varbit_in: The bit string 4097 must start with B or X test=# select 'X1001'::varbit; ?column? ------------------- B0001000000000001 (1 row) test=# select 'X1001'::bit; ?column? ---------- X1001 (1 row) test=# select X'1001'::bit; ERROR: zpbit_in: The bit string 4097 must start with B or X Also, I have two output routines, that have been renames to zpbit_out and varbit_out. In fact, both will work just fine for bot bit and varbit, but the first prints as hex and the second as a bit string. Printing as hex is more compact, so good for long strings, but printing as a bit string is much more intuitive. One solution would be to make them both print to a bit string by default and define a function to generate a hex string. Another would be to have this under control of a variable. Most people who contacted me about bit strings seemed to want to use them for flags, so I guess the default should be to print them as a bit string. More for my information, if a user does not know about varbit, how does he cast to bit varying? 2. This is not a problem, more a question. There is no default way to compare bit to varbit, as in test=# select 'b10'::bit='b10'::varbit; ERROR: Unable to identify an operator '=' for types 'bit' and 'varbit' You will have to retype this query using an explicit cast This may be a good thing, as the comparison does depend on the lenght of the bit strings. 3. The ^ operator seems to attempt to coerce the arguments to float8? select 'B110011'::bit ^ 'B011101'::bit; ERROR: Function 'float8(bit)' does not exist Unable to identify a function that satisfies the given argument types You may need to add explicit typecasts 4. This is a policy question. When I use the bit shift operator, this always shifts within the current string only. So if I do select ('B010'::bit(6) >> 2)::varbit; ?column? ----------- B000100 I get what I would expect. But if I have a bit varying(6) field (in a table, this is just an example), I only get select ('B010'::varbit >> 2)::varbit; ?column? ----------- B000 which I find counter-intuitive. I have thus added 'zpshiftright' and 'varbitshiftright' functions. The second extends the bitstring to the right, while the first is the old bitshiftright function. I find this more intuitive at least. Question is what a shift left function should do? Should I shorten the string in the case of a shift left, to keep it symmetrical to shift right? This seems a pure policy decision, as there are arguments for both behaviours, although I am a great fan of symmetry. Let me know and I can implement a separate function. I have made a start on a file for regression tests, which I append with the diffs for the varbit files. Please let me know what else is needed and where I can help. Thanks! Adriaan-- -- BIT types -- -- -- Build tables for testing -- CREATE TABLE ZPBIT_TABLE(b BIT(11)); INSERT INTO ZPBIT_TABLE VALUES ('B'); INSERT INTO ZPBIT_TABLE VALUES ('B0'); INSERT INTO ZPBIT_TABLE VALUES ('B010101'); INSERT INTO ZPBIT_TABLE VALUES ('B01010101010'); INSERT INTO ZPBIT_TABLE VALUES ('B010101010101'); INSERT INTO ZPBIT_TABLE VALUES ('X554'); INSERT INTO ZPBIT_TABLE VALUES ('X555'); SELECT * FROM ZPBIT_TABLE; CREATE TABLE VARBIT_TABLE(v BIT VARYING(11)); INSERT INTO VARBIT_TABLE VALUES ('B'); INSERT INTO VARBIT_TABLE VALUES ('B0'); INSERT INTO VARBIT_TABLE VALUES ('B010101'); INSERT INTO VARBIT_TABLE VALUES ('B01010101010'); INSERT INTO VARBIT_TABLE VALUES ('B010101010101'); INSERT INTO VARBIT_TABLE VALUES ('X554'); INSERT INTO VARBIT_TABLE VALUES ('X555'); SELECT * FROM VARBIT_TABLE; -- Delete from tables DROP TABLE ZPBIT_TABLE; CREATE TABLE ZPBIT_TABLE(b BIT(16)); INSERT INTO ZPBIT_TABLE VALUES ('B11011'); INSERT INTO ZPBIT_TABLE SELECT b>>1 FROM ZPBIT_TABLE; INSERT INTO ZPBIT_TABLE SELECT b>>2 FROM ZPBIT_TABLE; INSERT INTO ZPBIT_TABLE SELECT b>>4 FROM ZPBIT_TABLE; INSERT INTO ZPBIT_TABLE SELECT b>>8 FROM ZPBIT_TABLE; SELECT POSITION('B1101'::bit IN b) as pos, POSITION('B11011'::bit IN b) as pos, b FROM ZPBIT_TABLE ; DROP TABLE VARBIT_TABLE; CREATE TABLE VARBIT_TABLE(v BIT VARYING(19)); INSERT INTO VARBIT_TABLE VALUES ('B11011'); INSERT INTO VARBIT_TABLE SELECT v>>1 FROM VARBIT_TABLE; INSERT INTO VARBIT_TABLE SELECT v>>2 FROM VARBIT_TABLE; INSERT INTO VARBIT_TABLE SELECT v>>4 FROM VARBIT_TABLE; INSERT INTO VARBIT_TABLE SELECT v>>8 FROM VARBIT_TABLE; SELECT POSITION('B1101'::bit IN v) as pos, POSITION('B11011'::bit IN v) as pos, v FROM VARBIT_TABLE ; -- Concatenation SELECT v, b::varbit AS B, (v || b)::varbit AS C FROM ZPBIT_TABLE, VARBIT_TABLE WHERE v::bit(16)=b ORDER BY C; -- Length SELECT b, length(b) AS lb FROM ZPBIT_TABLE; SELECT v, length(v) AS lv FROM VARBIT_TABLE; -- Substring SELECT b::varbit, SUBSTRING(b FROM 2 FOR 4)::varbit AS sub1, SUBSTRING(b FROM 7 FOR 13)::varbit AS sub2, SUBSTRING(b FROM 6)::varbit AS sub3 FROM ZPBIT_TABLE; SELECT v, SUBSTRING(v FROM 2 FOR 4)::varbit AS sub1, SUBSTRING(v FROM 7 FOR 13)::varbit AS sub2, SUBSTRING(v FROM 6)::varbit AS sub3 FROM VARBIT_TABLE; -- Drop the tables DROP TABLE ZPBIT_TABLE; DROP TABLE VARBIT_TABLE; --- Bit operations CREATE TABLE varbit_table (a BIT VARYING(16), b BIT VARYING(16)); COPY varbit_table FROM stdin; X0F X10 X1F X11 X2F X12 X3F X13 X8F X04 X000F X0010 X0123 XFFFF X2468 X2468 XFA50 X05AF X1234 XFFF5 \. SELECT a,b,~a AS "~ a",a & b AS "a & b", a|b AS "a | b", a^b AS "a ^ b" FROM varbit_table; SELECT a,b,a<b AS "a<b",a<=b AS "a<=b",a=b AS "a=b", a>=b AS "a>=b",a>b AS "a>b",a<=>b AS "a<=>b" FROM varbit_table; SELECT a,a<<4 AS "a<<4",b,b>>2 AS "b>>2" FROM varbit_table; DROP TABLE varbit_table; --- Bit operations CREATE TABLE zpbit (a BIT(16), b BIT(16)); COPY zpbit FROM stdin; X0F X10 X1F X11 X2F X12 X3F X13 X8F X04 X000F X0010 X0123 XFFFF X2468 X2468 XFA50 X05AF X1234 XFFF5 \. SELECT a,b,~a AS "~ a",a & b AS "a & b", a|b AS "a | b", a^b AS "a ^ b" FROM zpbit; SELECT a,b,a<b AS "a<b",a<=b AS "a<=b",a=b AS "a=b", a>=b AS "a>=b",a>b AS "a>b",a<>b AS "a<>b" FROM zpbit; SELECT a,a<<4 AS "a<<4",b,b>>2 AS "b>>2" FROM zpbit; DROP TABLE zpbit; -- The following should fail select 'X123'::bit & 'X12'::bit; select 'B0111'::bit | 'B011'::bit; select 'X023'::bit ^ 'B011101'::bit; -- More position tests, checking all the boundary cases SELECT POSITION('B1010'::bit IN 'B0000101'::bit); -- 0 SELECT POSITION('B1010'::bit IN 'B00001010'::bit); -- 5 SELECT POSITION('B1010'::bit IN 'B00000101'::bit); -- 0 SELECT POSITION('B1010'::bit IN 'B000001010'::bit); -- 6 SELECT POSITION('B'::bit IN 'B00001010'::bit); -- 1 SELECT POSITION('B0'::bit IN 'B'::bit); -- 0 SELECT POSITION('B'::bit IN 'B'::bit); -- 0 SELECT POSITION('B101101'::bit IN 'B001011011011011000'::bit); -- 3 SELECT POSITION('B10110110'::bit IN 'B001011011011010'::bit); -- 3 SELECT POSITION('B1011011011011'::bit IN 'B001011011011011'::bit); -- 3 SELECT POSITION('B1011011011011'::bit IN 'B00001011011011011'::bit); -- 5 SELECT POSITION('B11101011'::bit IN 'B11101011'::bit); -- 1 SELECT POSITION('B11101011'::bit IN 'B011101011'::bit); -- 2 SELECT POSITION('B11101011'::bit IN 'B00011101011'::bit); -- 4 SELECT POSITION('B11101011'::bit IN 'B0000011101011'::bit); -- 6 SELECT POSITION('B111010110'::bit IN 'B111010110'::bit); -- 1 SELECT POSITION('B111010110'::bit IN 'B0111010110'::bit); -- 2 SELECT POSITION('B111010110'::bit IN 'B000111010110'::bit); -- 4 SELECT POSITION('B111010110'::bit IN 'B00000111010110'::bit); -- 6 SELECT POSITION('B111010110'::bit IN 'B11101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B011101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B00011101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B0000011101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B111010110'::bit); -- 1 SELECT POSITION('B111010110'::bit IN 'B0111010110'::bit); -- 2 SELECT POSITION('B111010110'::bit IN 'B000111010110'::bit); -- 4 SELECT POSITION('B111010110'::bit IN 'B00000111010110'::bit); -- 6 SELECT POSITION('B111010110'::bit IN 'B000001110101111101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B0000001110101111101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B000000001110101111101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B00000000001110101111101011'::bit); -- 0 SELECT POSITION('B111010110'::bit IN 'B0000011101011111010110'::bit); -- 14 SELECT POSITION('B111010110'::bit IN 'B00000011101011111010110'::bit); -- 15 SELECT POSITION('B111010110'::bit IN 'B0000000011101011111010110'::bit); -- 17 SELECT POSITION('B111010110'::bit IN 'B000000000011101011111010110'::bit); -- 19 SELECT POSITION('B000000000011101011111010110'::bit IN 'B000000000011101011111010110'::bit); -- 1 SELECT POSITION('B00000000011101011111010110'::bit IN 'B000000000011101011111010110'::bit); -- 2 SELECT POSITION('B0000000000011101011111010110'::bit IN 'B000000000011101011111010110'::bit); -- 0 *** src/backend/utils/adt/varbit.c.old Sun Nov 5 12:03:04 2000 --- src/backend/utils/adt/varbit.c Sun Nov 5 19:12:10 2000 *************** *** 73,79 **** bit_not_hex = false; else { ! elog(ERROR, "zpbit_in: %s is not a valid bitstring", s); bit_not_hex = false; /* keep compiler quiet */ } --- 73,79 ---- bit_not_hex = false; else { ! elog(ERROR, "zpbit_in: The bit string %s must start with B or X", s); bit_not_hex = false; /* keep compiler quiet */ } *************** *** 299,305 **** bit_not_hex = false; else { ! elog(ERROR, "varbit_in: %s is not a valid bitstring", s); bit_not_hex = false; /* keep compiler quiet */ } --- 299,305 ---- bit_not_hex = false; else { ! elog(ERROR, "varbit_in: The bit string %s must start with B or X", s); bit_not_hex = false; /* keep compiler quiet */ } *************** *** 744,749 **** --- 744,752 ---- *ps; bitlen = VARBITLEN(arg); + /* Do we have an upper bound? */ + if (l==-1) + l = bitlen; e = s + l; s1 = Max(s, 1); e1 = Min(e, bitlen + 1); *************** *** 1039,1045 **** * do a right shift (i.e. towards the end of the string) */ Datum ! bitshiftright(PG_FUNCTION_ARGS) { VarBit *arg = PG_GETARG_VARBIT_P(0); int32 shft = PG_GETARG_INT32(1); --- 1042,1048 ---- * do a right shift (i.e. towards the end of the string) */ Datum ! zpbitshiftright(PG_FUNCTION_ARGS) { VarBit *arg = PG_GETARG_VARBIT_P(0); int32 shft = PG_GETARG_INT32(1); *************** *** 1097,1102 **** --- 1100,1166 ---- PG_RETURN_VARBIT_P(result); } + /* bitshiftright + * do a right shift (i.e. towards the end of the string) + */ + Datum + /*varbitshiftright(PG_FUNCTION_ARGS)*/ + bitshiftright(PG_FUNCTION_ARGS) + { + VarBit *arg = PG_GETARG_VARBIT_P(0); + int32 shft = PG_GETARG_INT32(1); + VarBit *result; + int byte_shift, + byte_len, + ishift, + len; + bits8 *p, + *r; + + /* Negative shift is a shift to the left */ + if (shft < 0) + PG_RETURN_DATUM(DirectFunctionCall2(bitshiftleft, + VarBitPGetDatum(arg), + Int32GetDatum(-shft))); + + /* When we have a varying bit string, the string may get longer */ + len = VARBITLEN(arg) + shft; + byte_len = VARBITTOTALLEN(len); + + result = (VarBit *) palloc(byte_len); + VARATT_SIZEP(result) = byte_len; + VARBITLEN(result) = len; + r = VARBITS(result); + + byte_shift = shft / BITS_PER_BYTE; + ishift = shft % BITS_PER_BYTE; + p = VARBITS(arg); + + /* Set the first part of the result to 0 */ + memset(r, 0, byte_shift); + r += byte_shift; + + if (ishift == 0) + { + /* Special case: we can do a memcpy */ + len = VARBITBYTES(arg); + memcpy(r, p, len); + } + else + { + if (r < VARBITEND(result)) + *r = 0; /* initialize first byte */ + for (; r < VARBITEND(result); p++) + { + *r |= *p >> ishift; + if ((++r) < VARBITEND(result)) + *r = (*p << (BITS_PER_BYTE - ishift)) & BITMASK; + } + } + + PG_RETURN_VARBIT_P(result); + } + /* This is not defined in any standard. We retain the natural ordering of * bits here, as it just seems more intuitive. */ *************** *** 1216,1224 **** p++; if (p == VARBITEND(arg)) { mask2 = end_mask << (BITS_PER_BYTE - is); ! is_match = mask2 == 0; ! elog(NOTICE,"S. %d %d em=%2x sm=%2x r=%d", ! i,is,end_mask,mask2,is_match); break; } cmp = *s << (BITS_PER_BYTE - is); --- 1280,1286 ---- p++; if (p == VARBITEND(arg)) { mask2 = end_mask << (BITS_PER_BYTE - is); ! is_match = (mask2 == 0); break; } cmp = *s << (BITS_PER_BYTE - is); *** src/include/utils/varbit.h.old Sun Nov 5 20:20:40 2000 --- src/include/utils/varbit.h Sun Nov 5 19:04:16 2000 *************** *** 80,86 **** --- 80,88 ---- extern Datum bitxor(PG_FUNCTION_ARGS); extern Datum bitnot(PG_FUNCTION_ARGS); extern Datum bitshiftleft(PG_FUNCTION_ARGS); + extern Datum zpbitshiftright(PG_FUNCTION_ARGS); extern Datum bitshiftright(PG_FUNCTION_ARGS); + /*extern Datum varbitshiftright(PG_FUNCTION_ARGS);*/ extern Datum bitcat(PG_FUNCTION_ARGS); extern Datum bitsubstr(PG_FUNCTION_ARGS); extern Datum bitlength(PG_FUNCTION_ARGS);
pgsql-hackers by date: