Re: Internal key management system - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: Internal key management system |
Date | |
Msg-id | 20201028172224.GF16415@tamriel.snowman.net |
In response to | Re: Internal key management system (Craig Ringer <craig.ringer@enterprisedb.com>) |
Responses | Re: Internal key management system |
List | pgsql-hackers |
Greetings,

* Craig Ringer (craig.ringer@enterprisedb.com) wrote:
> On Mon, Oct 26, 2020 at 11:02 PM Stephen Frost <sfrost@snowman.net> wrote:
>
> TL;DR:
>
> * Important to check that key rotation is possible on a replica, i.e.
> primary and standby can have different cluster passphrase and KEK
> encrypting the same WAL and heap keys;

I agree that key rotation would certainly be good to have.

> * with a HSM we can't read the key out, so a pluggable KEK operations
> context or a configurable URI for the KEK is necessary

There are a lot of options around HSMs, the Linux crypto API, potentially different encryption libraries, et al. One thing that I'm not sure we're being clear enough on here is when we're talking about a KEK (key encryption key) vs. when we're talking about actually off-loading all of the encryption to an HSM or to an OpenSSL engine (which might in turn use the Linux crypto API...), etc.

Agreed that, with some HSMs, we aren't able to actually pull out the key. Depending on the HSM, it may or may not be able to perform encryption and decryption with any kind of speed and therefore we should have options which don't require that. This would be the typical case where we'd have a KEK which encrypts a key we have stored and then that key is what's actually used for the encryption/decryption of the data.

> * I want the SQL key and SQL wrap/unwrap part in a separate patch, I
> don't think it's fully baked and oppose its inclusion in its current
> form

I'm generally a fan of having something at the SQL level, but I agree that it doesn't need to be part of this initial capability and could be done later as a separate patch.

> Most importantly - I don't think the SQL key adds anything really
> crucial that we cannot do at the SQL level with an extension. An
> extension "pg_wrap" could provide pg_wrap() and pg_unwrap() already,
> using a single master key much like the SQL key proposed in this
> patch.
> To store the master key it could:

Lots of things can be done in extensions but, at least for my part, I'd much rather see us build in an SQL key capability (with things like grammar support and being able to tie it to a role cleanly) than to try and figure out how to make this work as an extension.

> That way we haven't baked some sort of limited wrap/unwrap into Pg's
> long term user visible API. I'd be totally happy for such a SQL key
> wrap/unwrap to become part of pgcrypto, or a separate extension that
> uses pgcrypto, if you're worried about having it available to users. I
> just don't really want it in src/backend in its current form.

There's no shortage of interfaces that exist in other database systems for this that we can look at to help guide us in coming up with a good API here. All that said, we can debate that on another thread and independently of this discussion around TDE.

> OTHER TRANSPARENT ENCRYPTION USE CASES
> ----
>
> Does this patch get in the way of supporting other kinds of
> transparent encryption that are frequently requested and are in use on
> other systems already?
>
> I don't think so. Whole-cluster encryption is quite separate and the
> proposed patch doesn't seem to do anything that'd make table-, row- or
> column-level encryption, per-user key management, etc any harder.
>
> Specific use cases I looked at:
>
> * Finer grained keying than whole-cluster for transparent
> encryption-at-rest. As soon as we have relations that require user
> session supplied information to allow the backend to read the relation
> we get into a real mess with autovacuum, logical decoding, etc. So if
> anyone wants to implement that sort of thing they're probably going
> to want to do so separately to block-level whole-cluster encryption,
> in a way that preserves the normal page and page item structure and
> encrypts the row data only.

I tend to agree with this.

> * Client-driver-assisted transparently encrypted
> at-rest-and-in-transit data, where the database engine doesn't have
> the encrypt/decrypt keys at all. Again in this case they're going to
> have to do that at the row level or column level, not the block
> (relfilenode extents and WAL) level, otherwise we can't provide
> autovacuum etc.

+100 to having client-driver-assisted encryption. It addresses real attack vectors which traditional TDE, compared to filesystem or block device level encryption, simply doesn't (even though lots of people seem to think it does, which is bizarre to me).

> > That said- I don't think we necessarily want to throw out the command-based
> > option, as users may wish to use a vaulting solution or similar instead
> > of an HSM.
>
> I agree. I wasn't proposing to throw out the command based approach,
> just provide a way to inform postgres that it should do operations
> with the KEK using an external engine instead of deriving its own KEK
> from a passphrase and other inputs.

I would think we'd want admins to be able to control whether what is being provided is a KEK (where the key is then decrypted by PG and PG then uses whatever libraries it's built with to perform the encryption and decryption in PG process space), or an engine/offloading configuration (where PG doesn't ever see the actual key and all encryption and decryption is done outside of PG's control by an HSM or the Linux kernel through the crypto API or whatever).

The use-cases I'm thinking about:

- User has a YubiKey, but would like PG to be able to write more than one block at a time. In this case, the YubiKey would have a KEK which PG doesn't ever see. PG would have an encrypted blob that it then asks the YubiKey to decrypt which contains the actual key that's then kept in PG's memory to perform the encryption/decryption. Naturally, if that key is stolen then an attacker could decrypt the entire database, even if they don't have the YubiKey.
  An attacker could acquire that key by having sufficient access on the PG server to be able to read PG's memory.

- User has a Thales Luna PCIe HSM, or similar. In this case, the user wants *all* of the encryption/decryption happening on the HSM and none of it happening in PG space, making it impossible for an attacker to acquire the actual key.

- User has a YubiKey, similar to the first case, but would like to have the Linux kernel used to safe-guard the actual key used. This is a bit of an in-between area between the first case above and the second- specifically, a YubiKey could have the KEK but then the actual data encryption key isn't given to PG, it's put into the Linux kernel's keyring and PG uses (perhaps through OpenSSL) the Linux crypto API to off-load the actual encryption and decryption to have that happening outside of PG's process space. This would make it much more difficult for an attacker to acquire the key if they only have control over PG or the postgres unix account, since the Linux kernel would prevent access to it, but it wouldn't require a HSM crypto accelerator. Of course, should an attacker gain root or direct physical access to the system somehow, they might be able to acquire the actual data encryption key that way.

- User has a vaulting solution, and perhaps wants to store the actual encryption/decryption key there, or perhaps the user wants to store a passphrase in the vault and have PG derive the actual key from that. Either seems like it could be reasonable.

- User hasn't got anything special and just wants to keep it simple by using a passphrase that's entered when PG is started up.

> > What I am curious about though- what are the thoughts around
> > using a vaulting solution's command-line tool vs. writing code to work
> > with an API?
>
> I think the code that fetches the cluster passphrase from a command
> should be interceptable by a hook, so organisations with Wacky
> Security Policies Written By People Who Have Heard About Computers But
> Never Used One can jump through the necessary hoops. I am of course
> absolutely not speaking from experience here, no, not at all... see
> ssl_passphrase_function in src/backend/libpq/be-secure-openssl.c, and
> see src/test/modules/ssl_passphrase_callback/ssl_passphrase_func.c .
>
> So I suggest something like that - a hook that by default calls an
> external command but can be overridden by an extension. It wouldn't be
> openssl specific like the server key passphrase example though. That
> was done with an openssl specific hook because we don't know if we're
> going to need a passphrase at all until openssl has opened the key. In
> the cluster encryption case we'll know if we're doing our own KEK+HMAC
> generation or not without having to ask the SSL library.

What I'm wondering about here is whether we should make it an explicit option in the server configuration for a user to say whether they're giving PG a direct key to use, a KEK that's actually meant to decrypt the data key, a way to fetch the direct key or the KEK, or an engine which has the KEK and can be asked to decrypt the data key, etc. If we can come up with a way to configure PG that will support the different use cases outlined above without being overly complicated, that'd be great. I'm not sure that I see that in what you've proposed here, but maybe by going through each of the use-cases and showing how a user would configure PG for each with this proposal, I will.

> > Between these various options, what are the risks of
> > having a script vs. using an API and would one or the other weaken the
> > overall solution? Or is what's really needed here a way to tell us
> > if it's a passphrase we're getting or a proper key, regardless of the
> > method being used to fetch it?
>
> For various vault systems I don't think it matters at all whether the
> secret they manage is the key, input used to generate the key, or
> input used to decrypt a key stored elsewhere. Either way they have the
> pivotal secret. So I don't see much point allowing the command to
> return a fully formed key.

I hadn't really considered that to be a distinction either, so I'm glad that it sounds like we agree on that point.

> The point of a HSM is that you don't get to read the key. Pg can never
> read the key, it can only perform encrypt and decrypt operations on
> the key using the HSM via the SSL library:

This really depends on exactly what "key" is being referred to here, and where the encryption/decryption is happening. Hopefully the above use cases help clarify.

> Pg -> openssl:
>   "this is the ciphertext of the wal_key. Please decrypt it for me."
> openssl -> engine layer
>   "engine, please decrypt this"
> pkcs#11 engine -> pkcs#11 provider:
>   "please decrypt this"
> pkcs#11 provider -> HSM-specific libraries, network proxies, whatever:
>   "please decrypt this"
>   "... here's the plaintext"
> <- flows back up

Right- in this case, ultimately, the actual key used for the encryption and decryption ends up in PG's memory space as plaintext and could therefore be acquired by an attacker with access to PG memory space.

> So the KEK used to encrypt the main cluster keys for heap and wal
> encryption is never readable by Pg. It usually never enters host
> memory - in the case of a HSM, the ciphertext is sent over USB or PCIe
> to the HSM and the cleartext comes back.

Agreed, the KEK isn't, but that isn't actually all that interesting since, once the data key has been decrypted into PG's memory, the KEK isn't needed to decrypt the data.

> In openssl, the default engine is file-based with host software crypto
> implementations. You can specify alternate engines using various
> OpenSSL APIs, or you can specify them by supplying a URI where you'd
> usually supply a file path to a key.

Right.

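To make the KEK vs. data key distinction a bit more concrete, here is a rough Python sketch of the "please decrypt this" flow quoted above. Everything in it is an assumption made for illustration: ToyHSM and keystream_xor are made-up names, and the "cipher" is a toy SHA-256 counter-mode XOR standing in for real key wrapping, not anything PG, OpenSSL or a PKCS#11 provider actually offers.

```python
import hashlib
import os


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (SHA-256 in counter mode); illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


class ToyHSM:
    """Hypothetical HSM stand-in: holds the KEK, only offers wrap/unwrap."""

    def __init__(self) -> None:
        self._kek = os.urandom(32)  # by convention never leaves this object

    def wrap(self, key: bytes) -> bytes:
        nonce = os.urandom(16)
        return nonce + keystream_xor(self._kek, nonce, key)

    def unwrap(self, blob: bytes) -> bytes:
        return keystream_xor(self._kek, blob[:16], blob[16:])


hsm = ToyHSM()
wrapped_wal_key = hsm.wrap(os.urandom(32))  # the blob that would sit on disk

# "Please decrypt this" ... "here's the plaintext":
wal_key = hsm.unwrap(wrapped_wal_key)  # plaintext data key, now in our memory

# From here on the KEK is not involved; bulk encryption uses wal_key alone.
page_nonce = os.urandom(16)
page = b"heap page contents"
encrypted_page = keystream_xor(wal_key, page_nonce, page)
assert keystream_xor(wal_key, page_nonce, encrypted_page) == page
```

The caller never sees the KEK, but after the single unwrap call it holds the plaintext wal_key, and that is the only secret needed to read the pages.
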
> I'm proposing we make it easy to supply a key URI and let openssl
> handle the engine etc. It's far from perfect, and it's really meant as
> a fallback to allow apps that don't natively understand SSL engines
> etc to still use them in a limited capacity.

I agree that it doesn't seem like a bad approach to expose that URI, but I'm not sure that's really the end of it since there are going to be cases where people would like to have a KEK on a YubiKey and there'll be other cases where people would like to offload all of the encryption and decryption to a HSM crypto accelerator and, ideally, we'd allow them to be able to configure PG for either of those cases.

> What I'd *prefer* to do is make the function that sets up the KEK
> hookable. So by default we'd call a function that'd read the external
> passphrase from a command and use that to generate KEK+HMAC. But an
> extension hook installed at shared_preload_libraries time could
> override the behaviour completely and return its own implementation.

I don't see a problem with adding hooks, where they make sense, but we should also make things work in a sensible way and a way that works with at least the use-cases that I've outlined, ideally, without having to go get an extension or write C code.

> > This really locks us into OpenSSL for this, which I don't particularly
> > like.
>
> We're pretty locked into openssl already. I don't like it either, it
> was just the option that has the least impact/delay on the main work
> on this patch.

There's an active patch adding NSS support that's been worked on for quite some time and is getting some renewed interest, something I certainly support, so we really shouldn't be taking steps that end up making it more difficult to support alternatives.
Perhaps a generic 'key URI' type of option wouldn't be too bad, and each library we support could parse that string out based on what information it needs (eg: for NSS, a database + key nickname could be provided in some specific format), but overall we certainly shouldn't be baking things in which are very OpenSSL-specific and exposed to users.

> I'd rather abstract KEK operations behind a context object-like struct
> with function pointer members, like we do in many other places in Pg.
> Make the default one do the dance of reading the external passphrase
> and generating the KEK on the fly. Allow plugins to override it with
> their own, and let them set it up to delegate to a HSM or whatever
> else they want.
>
> Then ship a simple openssl based default implementation of HSM support
> that can be shoved in shared_preload_libraries. Or if we don't like
> using s_p_l, add a separate GUC for cluster_encryption_key_manager or
> whatever, and a different entrypoint, instead of having s_p_l call
> _PG_init() to register a hook.

I definitely think we want to support things directly in PG and not require an extension or something to be in s_p_l for this.

> > > For example if I want to lock my database with a YubiHSM I would configure
> > > something like:
> > >
> > > cluster_encryption_key = 'pkcs11:token=YubiHSM;id=0:0001;type=private'
> > >
> > > The DB would be encrypted and decrypted using application keys unlocked by
> > > the HSM. Backups of the database, stolen disk images, etc, would be
> > > unreadable unless you have access to another HSM with the same key loaded.
> >
> > Well, you would surely just need the key, since you could change the PG
> > config to fetch the key from wherever you have it, you wouldn't need an
> > actual HSM.
>
> Right - if your HSM was programmed by generating a key and storing
> that into the HSM and you have that key backed up in file form
> somewhere, you could likely put it in a pem file and use that directly
> by pointing Pg at the file instead of an engine URI.

Sure.

> But you might not even have the key. In some HSM implementations the
> key is completely sealed - you can program new HSMs to have the same
> key by using the same configuration, but you cannot actually obtain
> the key short of attacks on the HSM hardware itself. That's very much
> by design - the HSM configuration is usually on an air-gapped system,
> and it isn't sufficient to decrypt anything unless you also have
> access to a copy of the HSM hardware itself. Obviously you accept the
> risks if you take that approach, and you must have an escape route
> where you can re-encrypt the material protected by the HSM against
> some other key. But it's not at all uncommon.

Right, but in such cases you'd need an HSM that's able to perform encryption and decryption at some reasonable rate.

> Key rotation is obviously vital to make this vaguely sane. In Pg's
> case you'd have to change the key configuration, then trigger a key
> rotation step, which would decrypt with a context obtained from the
> old config then encrypt with a context obtained from the new config.

Yes, key rotation is an important part.

> > > If cluster_encryption_key is unset, Pg would perform its own KEK derivation
> > > based on cluster_passphrase_command as currently implemented.
> >
> > To what I was suggesting above- what if we just had a GUC that's
> > "kek_method" with options 'passphrase' and 'direct', where passphrase
> > goes through KEK and 'direct' doesn't, which just changes how we treat
> > the results of calling cluster_passphrase_command?
>
> That won't work for a HSM. It is not possible to extract the key.
> "direct" cannot be implemented.

Perhaps the above helps explain what I was getting at there.

Thanks,

Stephen
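
P.S. For the "kek_method" idea quoted above, a rough Python sketch of what "changes how we treat the results of cluster_passphrase_command" could mean. This is only an illustration under assumptions: interpret_command_output, KeyMaterial, the use of PBKDF2, the iteration count, the 32-byte key length and the hex encoding for 'direct' are all made up here and are not taken from the patch.

```python
import hashlib
from dataclasses import dataclass

KEY_LEN = 32  # assumed key size in bytes; illustrative only


@dataclass(frozen=True)
class KeyMaterial:
    role: str  # "kek" or "data_key"
    key: bytes


def interpret_command_output(kek_method: str, output: bytes, salt: bytes) -> KeyMaterial:
    """Decide how to treat what the passphrase command printed (hypothetical)."""
    text = output.rstrip(b"\r\n")
    if kek_method == "passphrase":
        # Stretch the passphrase into a KEK, which would then unwrap the
        # stored data keys. PBKDF2 and the iteration count are placeholders.
        kek = hashlib.pbkdf2_hmac("sha256", text, salt, 100_000, dklen=KEY_LEN)
        return KeyMaterial("kek", kek)
    if kek_method == "direct":
        # Treat the output as the hex-encoded key itself, with no KEK step.
        key = bytes.fromhex(text.decode("ascii"))
        if len(key) != KEY_LEN:
            raise ValueError(f"expected {KEY_LEN} key bytes, got {len(key)}")
        return KeyMaterial("data_key", key)
    raise ValueError(f"unknown kek_method: {kek_method!r}")


salt = bytes(16)
print(interpret_command_output("passphrase", b"correct horse\n", salt).role)
print(interpret_command_output("direct", b"ab" * 32 + b"\n", salt).role)
```

Neither branch covers the sealed-HSM case raised in the quoted reply, where the key can't be read out at all; that would need a separate engine/offloading style of configuration.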