Re: Transparent column encryption - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Transparent column encryption |
Date | |
Msg-id | CA+Tgmoaukq-Ho=8AMp6tSS5py=05gm1KNkRm6Le9akjO-4Q+zQ@mail.gmail.com |
In response to | Re: Transparent column encryption (Peter Eisentraut <peter@eisentraut.org>) |
Responses | Re: Transparent column encryption |
List | pgsql-hackers |
On Wed, Apr 10, 2024 at 6:13 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> Obviously, it's early days, so there will be plenty of time to have
> discussions on various other aspects of this patch. I'm keeping a keen
> eye on the discussion of protocol extensions, for example.

I think the way that you handled that is clever, and along the lines of what I had in mind when I invented the _pq_ stuff. More specifically, the way that the ColumnEncryptionKey and ColumnMasterKey messages are handled is exactly the way that I was imagining things would work. The client uses _pq_.column_encryption to signal that it can understand those messages, and the server responds by including them. I assume that if the client doesn't signal understanding, then the server simply omits sending those messages. (I have not checked the code.)

I'm less certain about the changes to the ParameterDescription and RowDescription messages. I see a couple of potential problems. One is that, if you say you can understand column encryption messages, the extra fields are included even for unencrypted columns. The client must choose at connection startup whether it ever wishes to read any encrypted data; if so, it pays a portion of that overhead all the time.

Another potential problem is with the scalability of this design. Suppose that we could not only encrypt columns, but also compress, fold, mutilate, and spindle them. Then there might end up being a dizzying array of variation in the format of what is supposed to be the same message. Perhaps it's not so bad: as long as the documentation is clear about in which order the additional fields will appear in the relevant messages when more than one relevant feature is used, it's probably not too difficult for clients to cope. And it is probably also true that the precise size of, say, a RowDescription message will rarely be performance-critical.

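[Editor's sketch, not part of the original message.] To make the per-column overhead concrete, here is a rough Python illustration that encodes a RowDescription message twice: once with the standard v3 per-field layout, and once with extra trailing fields appended to every column. The standard layout (name, table OID, attribute number, type OID, type length, type modifier, format code) follows the documented protocol; the two extra Int32 values (a key ID and an algorithm ID) are purely an assumption for illustration and may not match what the patch actually sends.

```python
import struct

def encode_field(name, table_oid=0, attnum=0, type_oid=25,
                 typlen=-1, typmod=-1, fmt=0, extra=b""):
    """Encode one RowDescription field in the standard v3 layout.

    `extra` holds any additional per-column bytes; it is empty for
    the unmodified protocol.
    """
    fixed = struct.pack("!IhIhih", table_oid, attnum, type_oid,
                        typlen, typmod, fmt)  # 18 bytes, no padding
    return name.encode() + b"\x00" + fixed + extra

def row_description(fields):
    """Wrap encoded fields in a 'T' message with its length prefix."""
    body = struct.pack("!h", len(fields)) + b"".join(fields)
    # The length word counts itself but not the message type byte.
    return b"T" + struct.pack("!i", len(body) + 4) + body

# HYPOTHETICAL extra fields: Int32 key ID + Int32 algorithm ID,
# both zero for a column that is not encrypted.
UNENCRYPTED_EXTRA = struct.pack("!ii", 0, 0)

plain = row_description([encode_field("a"), encode_field("b")])
extended = row_description([
    encode_field("a", extra=UNENCRYPTED_EXTRA),
    encode_field("b", extra=UNENCRYPTED_EXTRA),
])

# Bytes added for two columns, neither of which is encrypted.
overhead = len(extended) - len(plain)
```

Under these assumptions the cost grows linearly with the number of columns in the result, whether or not any of them is encrypted.
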
But another thought is that we might try to redesign this so that we simply add more message types rather than mutating message types, i.e., after sending the RowDescription message, if any columns are encrypted, we additionally send a RowEncryptionDescription message. Then this treatment becomes symmetric with the handling of ColumnEncryptionKey and ColumnMasterKey messages, and there's no overhead when the feature is unused.

With regard to the Bind message, I suggest that we regard the protocol change as reserving a currently-unused bit in the message to indicate whether the value is pre-encrypted, without reference to the protocol extension. It could be legal for a client that can't understand encryption messages from the server to supply an encrypted value to be inserted into a column. And I don't think we would ever want the bit that's being reserved here to be used by some other extension for some other purpose, even when this extension isn't used. So I don't see a need for this to be tied into the protocol extension.

--
Robert Haas
EDB: http://www.enterprisedb.com