Thread: JDBC with PG 8.4 bytea character escaping wire protocol
Does the JDBC 4 driver for PG 8.4 escape each character of the bytea type, or does it use a binary protocol? From another post, Merlin Moncure wrote, "libpq supports a binary protocol mode which allows you to execute queries sending bytea without escaping." Is that true of the JDBC driver? I'm concerned about the data size expansion that would result for the wire protocol when I use setBytes(). Thanks, David
> Does the JDBC 4 driver for PG 8.4 escape each character of the bytea > type, or does it use a binary protocol? > > From another post, Merlin Moncure wrote, "libpq supports a binary > protocol mode which allows you to execute queries sending bytea > without escaping." Is that true of the JDBC driver? I'm concerned > about the data size expansion that would result for the wire protocol > when I use setBytes(). I tried downloading the JDBC source and must say it was complex enough to a newbie that I couldn't really determine the answer myself. Does anybody know if the JDBC driver with PG 8.4 communicates using a binary protocol for BYTEA transfers to/from or whether it sends escaped chars only? Or perhaps which classes handle the transfer of BYTEA to PG and receives BYTEA responses from PG? Thanks, David
On Thu, Oct 22, 2009 at 8:59 AM, David Wall <d.wall@computer.org> wrote: > Does anybody > know if the JDBC driver with PG 8.4 communicates using a binary protocol for > BYTEA transfers to/from or whether it sends escaped chars only? Or perhaps > which classes handle the transfer of BYTEA to PG and receives BYTEA > responses from PG? I don't know if you've already looked at this document (I'm not sure if it answers your question): http://jdbc.postgresql.org/documentation/84/binary-data.html It sounds like it uses an InputStream but I'm not sure. -- Regards, Richard Broersma Jr. Visit the Los Angeles PostgreSQL Users Group (LAPUG) http://pugs.postgresql.org/lapug
> I don't know if you've already looked at this document (I'm not sure > if it answers your question): > > http://jdbc.postgresql.org/documentation/84/binary-data.html > > It sounds like it uses an InputStream but I'm not sure. > Thanks, Richard. Yeah, I'm familiar with how to use them, but my question is in regard to the wire transfer protocol between JDBC and the PG backend. David
On Thu, Oct 22, 2009 at 9:46 AM, David Wall <d.wall@computer.org> wrote: > Thanks, Richard. Yeah, I'm familiar with how to use them, but my question is > in regard to the wire transfer protocol between JDBC and the PG backend. I'm guessing binary. http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/largeobject/BlobInputStream.java?annotate=1.12 http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/largeobject/BlobInputStream.java?annotate=1.12 -- Regards, Richard Broersma Jr. Visit the Los Angeles PostgreSQL Users Group (LAPUG) http://pugs.postgresql.org/lapug
I guess I can not really answer your question, since I have not dug around in the code, but I can not understand why the JDBC would escape a byte[] sent or received using the setBytes() or getBytes() API or the stream methods. These are binary protocols, which can house any integer value within the limitations. The bytea type is is defined as binary which means it does not necessarily represent a character. Now the answer I guess could be answered in two ways. 1. Someone here or yourself would have to know or dig through the code to see if some conversion takes place. 2. Make a test case that sends or receives the data using setBytes()/getBytes() and one of the APIs that does escape characters or you manually escape the characters. Use a packet sniffer to check the package size in both cases. If they are the same then a conversion is taking place. I would suspect the Bytes() method would result in less bandwidth, not more, no data size expansion. danap >> Does the JDBC 4 driver for PG 8.4 escape each character of the bytea >> type, or does it use a binary protocol? >> >> From another post, Merlin Moncure wrote, "libpq supports a binary >> protocol mode which allows you to execute queries sending bytea >> without escaping." Is that true of the JDBC driver? I'm concerned >> about the data size expansion that would result for the wire protocol >> when I use setBytes(). > > > I tried downloading the JDBC source and must say it was complex enough > to a newbie that I couldn't really determine the answer myself. > Does anybody know if the JDBC driver with PG 8.4 communicates using a > binary protocol for BYTEA transfers to/from or whether it sends > escaped chars only? Or perhaps which classes handle the transfer of > BYTEA to PG and receives BYTEA responses from PG? > > Thanks, > David
David Wall wrote: > Does the JDBC 4 driver for PG 8.4 escape each character of the bytea > type, or does it use a binary protocol? When sending parameter data to the server, it uses the binary format. When receiving results from the server, it uses the text format. -O
Does the JDBC 4 driver for PG 8.4 escape each character of the bytea type, or does it use a binary protocol?
When sending parameter data to the server, it uses the binary format.
When receiving results from the server, it uses the text format.
Thanks, Oliver. Well, I hadn't even thought it could be both. <smile>
There was confusion in the other replies because they were looking at how the data moves from my code into the driver, but not how the driver communicates with the backend. It was clear I could store binary data as I was doing that just fine, but the PG docs made it sound like much of my binary data would end up expanding because more than half of the range an octet may have requires escaping per http://www.postgresql.org/docs/8.4/static/datatype-binary.html:
Table 8-7. bytea Literal Escaped Octets
Decimal Octet Value | Description | Escaped Input Representation | Example | Output Representation |
---|---|---|---|---|
0 | zero octet | E'\\000' | SELECT E'\\000'::bytea; | \000 |
39 | single quote | '''' or E'\\047' | SELECT E'\''::bytea; | ' |
92 | backslash | E'\\\\' or E'\\134' | SELECT E'\\\\'::bytea; | \\ |
0 to 31 and 127 to 255 | "non-printable" octets | E'\\xxx' (octal value) | SELECT E'\\001'::bytea; | \001 |
I am thinking this should not be a real issue for me since I will be using LOs for by big objects and BYTEAs for the smaller objects.
Thanks again,
David