Re: A bug when use get_bit() function for a long bytea string - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: A bug when use get_bit() function for a long bytea string |
Date | |
Msg-id | CAG-ACPUbQwpCHhn_KkLAGEc08A91yZhARazj2BGjXvK+eFCVog@mail.gmail.com Whole thread Raw |
In response to | Re: A bug when use get_bit() function for a long bytea string ("movead.li@highgo.ca" <movead.li@highgo.ca>) |
Responses |
Re: A bug when use get_bit() function for a long bytea string
Re: A bug when use get_bit() function for a long bytea string |
List | pgsql-hackers |
Hello thanks for the detailed review,>I think the second argument indicates the bit position, which would be max bytea length * 8. If max bytea length covers whole int32, the second argument >needs to be wider i.e. int64.Yes, it makes sence and followed.
I think we need a similar change in byteaGetBit() and byteaSetBit() as well.
> Some more comments on the patch> struct pg_encoding
> {
>- unsigned (*encode_len) (const char *data, unsigned dlen);
>+ int64 (*encode_len) (const char *data, unsigned dlen);
> unsigned (*decode_len) (const char *data, unsigned dlen);> unsigned (*encode) (const char *data, unsigned dlen, char *res);> unsigned (*decode) (const char *data, unsigned dlen, char *res);
> Why not use return type of int64 for rest of the functions here as well?
> res = enc->encode(VARDATA_ANY(data), datalen, VARDATA(result));
> /* Make this FATAL 'cause we've trodden on memory ... */
>- if (res > resultlen)
>+ if ((int64)res > resultlen)
>
>if we change return type of all those functions to int64, we won't need this cast.I change the 'encode' function, it needs an int64 return type, but keep other
functions in 'pg_encoding', because I think it of no necessary reason.
Ok, let's leave it for a committer to decide.
>Right now we are using int64 because bytea can be 1GB, but what if we increase
>that limit tomorrow, will int64 be sufficient? That may be unlikely in the near
>future, but nevertheless a possibility. Should we then define a new datatype
>which resolves to int64 right now but really depends upon the bytea length. I
>am not suggesting that we have to do it right now, but may be something to
>think about.I decide to use int64 because if we really want to increase the limit, it should bethe same change with 'text', 'varchar' which have the same limit. So it may needa more general struct. Hence I give up the idea.
Hmm, Let's see what a committer says.
Some more review comments.
+ int64 res,resultlen;
We need those on separate lines, possibly.
+ byteNo = (int32)(n / BITS_PER_BYTE);
Does it hurt to have byteNo as int64 so as to avoid a cast. Otherwise, please
add a comment explaining the reason for the cast. The comment applies at other
places where this change appears.
- int len;
+ int64 len;
Why do we need this change?
int i;
We need those on separate lines, possibly.
+ byteNo = (int32)(n / BITS_PER_BYTE);
Does it hurt to have byteNo as int64 so as to avoid a cast. Otherwise, please
add a comment explaining the reason for the cast. The comment applies at other
places where this change appears.
- int len;
+ int64 len;
Why do we need this change?
int i;
>It might help to add a test where we could pass the second argument something
>greater than 1G. But it may be difficult to write such a test case.Add two test cases.
+
+select get_bit(
+ set_bit((repeat('Postgres', 512 * 1024 * 1024 / 8))::bytea, 1024 * 1024 * 1024 + 1, 0)
+ ,1024 * 1024 * 1024 + 1);
This bit position is still within int4.
postgres=# select pg_column_size(1024 * 1024 * 1024 + 1);
pg_column_size
----------------
4
(1 row)
You want something like
postgres=# select pg_column_size(512::bigint * 1024 * 1024 * 8);
pg_column_size
----------------
8
(1 row)
Best Wishes,
Ashutosh
pgsql-hackers by date: