Last round (I think) of FE/BE protocol changes - Mailing list pgsql-interfaces
From | Tom Lane |
---|---|
Subject | Last round (I think) of FE/BE protocol changes |
Date | |
Msg-id | 27444.1052321652@sss.pgh.pa.us |
Responses | Re: Last round (I think) of FE/BE protocol changes |
List | pgsql-interfaces |
Okay, based on the recent discussions, here are concrete proposals for the last few adjustments to the 3.0 FE/BE protocol.

Let's use int16 (2-byte integers) as format selector codes; this seems a reasonable compromise between bandwidth and flexibility. As of 7.4 the only supported values will be 0 = text and 1 = binary, but future versions can add more codes.

In client-sent messages, format codes appear primarily in the Bind message. Bind needs to be able to specify two sets of formats: one for the parameters it is supplying, and one for the query result columns if any. I propose representing each set as a count N followed by N format codes. If the count is zero, then all the columns have the default format (which will always be 0 = text in 7.4, though we might later allow it to be set to something else). If the count is one, then the single format code is applied to all columns. Otherwise the count must match the number of parameters or output columns. (Note that this moves the output format request from Execute to Bind, so that formats can't be changed from one row to the next in a portal's result. This allows more server-side optimization of formatting routine setup.)

FunctionCall likewise needs to specify the format codes for the data it is supplying and the result to be returned.

In server-sent messages, format codes will be added to RowDescription messages, one per column. (A RowDescription sent in response to statement Describe will show the default zero format code for all columns. A RowDescription sent in response to portal Describe or simple Query will show the actual format codes in use for the result.)

The CopyInResponse and CopyOutResponse messages will be changed to include a column count and per-column format codes. (Currently, the per-column codes will all be the same: all zero for plain COPY and all one for binary COPY. But someday we might extend COPY to do something different.)
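
To make the count rule for Bind's format-code sets concrete, here is a rough Python sketch of how a receiver might expand a set into one code per column, and how a sender might serialize it. The function names are hypothetical, and two details are assumptions not settled above: that the count N is itself sent as an int16, and that integers travel in network (big-endian) byte order.

```python
import struct

TEXT_FORMAT = 0    # 0 = text
BINARY_FORMAT = 1  # 1 = binary


def resolve_format_codes(codes, ncolumns, default=TEXT_FORMAT):
    """Expand a format-code set into exactly one code per column.

    Hypothetical helper: count 0 means "all default", count 1 means
    "apply this one code to every column", otherwise the count must
    match the number of parameters or output columns.
    """
    if len(codes) == 0:
        return [default] * ncolumns
    if len(codes) == 1:
        return [codes[0]] * ncolumns
    if len(codes) != ncolumns:
        raise ValueError(
            "got %d format codes for %d columns" % (len(codes), ncolumns)
        )
    return list(codes)


def encode_format_codes(codes):
    """Serialize a set as a count followed by int16 format codes.

    Assumption: the count is an int16 and everything is big-endian.
    """
    out = struct.pack("!h", len(codes))
    for code in codes:
        out += struct.pack("!h", code)
    return out
```

With this, an empty set costs only the two-byte count, and "all binary" for any number of columns costs four bytes, which is the bandwidth saving the zero and one cases are meant to provide.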

We will move to a single uniform representation of data items at the protocol level: an int4 byte count (not including self) followed by that many data bytes. NULL is represented by byte count -1 (and no data bytes, of course). The interpretation of the data bytes depends on the format code. This will be used in DataRow output, Bind parameters, FunctionCall, and FunctionResultResponse messages (the separate representation of FunctionVoidResponse goes away).

This will also become the data representation in COPY BINARY files. I will change the header signature for COPY BINARY so that the files can't be mistaken for old-style server-internal-representation binary files.

The BinaryRow message type goes away; DataRow will serve for all format codes. The content of DataRow will be a field count N followed by N fields in the above representation. Note that the null bitmap goes away. This representation is a little bulkier than the old one for rows containing many NULLs, but the same or smaller when there are no NULLs. It has a major advantage over the old representation in that the field contents can be extracted without any external knowledge --- in the old layout, if you didn't know the number of fields in advance, you were completely lost. libpq, for example, cannot support receiving Execute results without a preceding Describe result unless it can parse DataRow without knowing the number of columns in advance.

Any objections?

regards, tom lane
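
For illustration, the DataRow layout described above (a field count N followed by N length-prefixed fields, with -1 marking NULL) can be sketched in Python roughly as follows. The function names are hypothetical, the field count is assumed to be an int16, and big-endian byte order is assumed; only the message body is handled, not the message type byte or overall length word.

```python
import struct


def encode_data_row(fields):
    """Build a DataRow body from a list of bytes objects (None = NULL).

    Assumption: field count is an int16; each field is an int4 byte
    count (not including itself) followed by that many data bytes.
    """
    parts = [struct.pack("!h", len(fields))]
    for field in fields:
        if field is None:
            parts.append(struct.pack("!i", -1))  # NULL: no data bytes
        else:
            parts.append(struct.pack("!i", len(field)))
            parts.append(field)
    return b"".join(parts)


def parse_data_row(body):
    """Extract the fields from a DataRow body with no outside knowledge."""
    (nfields,) = struct.unpack_from("!h", body, 0)
    offset = 2
    fields = []
    for _ in range(nfields):
        (length,) = struct.unpack_from("!i", body, offset)
        offset += 4
        if length == -1:
            fields.append(None)
            continue
        if length < 0 or offset + length > len(body):
            raise ValueError("bad field length %d" % length)
        fields.append(body[offset:offset + length])
        offset += length
    return fields
```

Note that the parser never needs a RowDescription: the column count and every field boundary come from the row itself, and an empty value (length 0) stays distinct from NULL (length -1). How the bytes of each field are interpreted still depends on the format code.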