Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Sehrope Sarkuni |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | CAH7T-aoxAU5+cCZ=P4nzc-ZhGj_aqRWJWA-xrg1FT6e8ZwC65g@mail.gmail.com Whole thread Raw |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (cary huang <hcary328@gmail.com>) |
Responses |
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
|
List | pgsql-hackers |
Hi, I took a look at this patch. With some additions I think the feature itself is useful but the patch needs more work. It also doesn't have any of its own automated tests yet so the testing below was done manually. The attached file, kms_v2.patch, is a rebased version of the kms_v1.patch that fixes some bit rot. It sorts some of the Makefile additions but otherwise is the original patch. This version applies cleanly on master and passes make check. I don't have a Windows machine to test it, but I think the Windows build files for these changes are missing. The updated src/common/Makefile has a comment to coordinate updates to Mkvcbuild.pm but I don't see kmgr_utils.c or cipher_openssl.c referenced anywhere in there. The patch adds "pg_kmgr" to the list of files to skip in pg_checksums.c but there's no additional "pg_kmgr" file written to the data directory. Perhaps that's from a prior version that saved data to its own file? The constant AES128_KEY_LEN is defined in cipher.c but it's not used anywhere. RE: AES-128, not sure the value of even supporting it for this feature (v.s. just supporting AES-256). Unlike something like table data encryption, I'd expect a KMS to be used much less frequently so any performance boost of AES-128 vs AES-256 would be meaningless. The functions pg_cipher_encrypt(...), pg_cipher_decrypt(...), and pg_compute_HMAC(...) return true if OpenSSL is not configured. Should that be false? The ctx init functions all return false when not configured. I don't think that code path would ever be reached as you would not have a valid context but seems more consistent to have them all return false. There's a comment referring to "Encryption keys (TDEK and WDEK) length" but this feature is only for a KMS so that should be renamed. The passphrase is hashed to split it into two 32-byte keys but the min length is only 8-bytes: #define KMGR_MIN_PASSPHRASE_LEN 8 ... that should be at least 64-bytes to reflect how it's being used downstream. Depending on the format of the passphrase commands output it should be even longer (ex: binary data in hex should really be double that). The overall min should be 64-byte but maybe add a note to the docs to explain how the output will be used and the expected amount of entropy. In pg_kmgr_wrap(...) it checks that the input is a multiple of 8 bytes: if (datalen % 8 != 0) ereport(ERROR, (errmsg("input data must be multiple of 8 bytes"))); ...but after testing it, the OpenSSL key wrap functions it invokes require a multiple of 16-bytes (block size of AES). Otherwise you get a generic error: # SELECT pg_kmgr_wrap('abcd1234'::bytea); ERROR: could not wrap the given secret In ossl_compute_HMAC(...) it refers to AES256_KEY_LEN. Should be SHA256_HMAC_KEY_LEN (they're both 32-bytes but naming is wrong) return HMAC(EVP_sha256(), key, AES256_KEY_LEN, data, (uint32) data_size, result, (uint32 *) result_size); In pg_rotate_encryption_key(...) the error message for short passphrases should be "at least %d bytes": if (passlen < KMGR_MIN_PASSPHRASE_LEN) ereport(ERROR, (errmsg("passphrase must be more than %d bytes", KMGR_MIN_PASSPHRASE_LEN))); Rotating the passphrase via "SELECT pg_rotate_encryption_key()" and restarting the server worked (good). Having the server attempt to start with invalid output from the command gives an error "FATAL: cluster passphrase does not match expected passphrase" (good). Round tripping via wrap/unwrap works (good!): # SELECT convert_from(pg_kmgr_unwrap(pg_kmgr_wrap('abcd1234abcd1234'::bytea)), 'utf8'); convert_from ------------------ abcd1234abcd1234 (1 row) Trying to unwrap gibberish fails (also good!): # SELECT pg_kmgr_unwrap('\x123456789012345678901234567890123456789012345678'); ERROR: could not unwrap the given secret The pg_kmgr_wrap/unwrap functions use EVP_aes_256_wrap()[1] which implements RFC 5649[2] with the default IVs so they always return the same value for the same input: # SELECT x, pg_kmgr_wrap('abcd1234abcd1234abcd1234') FROM generate_series(1,5) x; x | pg_kmgr_wrap ---+-------------------------------------------------------------------- 1 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688 2 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688 3 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688 4 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688 5 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688 (5 rows) The IVs should be randomized so that repeated wrap operations give distinct results. To do that, the output format needs to include the randomized IV. It need not be secret but it needs to be included in the wrapped output. Related, IIUC, the wrapping mechanism of RFC5649 does provide some integrity checking but it's only 64-bits (v.s. say 256-bits for a full HMAC-SHA-256). Rather than use EVP_aes_256_wrap() with its defaults, we can generate a random IV and have the output be "IV || ENCRYPT(KEY, IV, DATA) || HMAC(IV || ENCRYPT(KEY, IV, DATA))". For a fixed length internal input (ex: the KEK encrypted key stored in pg_control) there's no need for padding as we're dealing with multiples of 16-bytes (ex: KEK encrypted enc-key / mac-key would be 64-bytes). It'd also be useful if the user level wrap/unwrap API allowed for arbitrary sized inputs (not just multiples of 16-byte). Having the output be in a standard format (i.e. matching OpenSSL's EVP_aes_256_wrap API) is nice, but as it's meant to be an opaque interface I think it's fine if the output is not usable outside the database. I don't see anyone using the wrapped data directly as it's random bytes without the key. The primary contract for the interface: "data == unwrap(wrap(data))". This would require enabling padding which would round up the size of the output to the next 16-bytes. Adding a prefix byte for a "version" would be nice too as it could be used to infer the specific cipher/mac combo (Ex: v1 would be AES256/HMAC-SHA256). I don't think the added size is an issue as again, the output is opaque. Similar things can also be accomplished by combining the 16-byte only version with pgcrypto but like this it'd be usable out of the box without additional extensions. [1]: https://www.openssl.org/docs/man1.1.1/man3/EVP_aes_256_wrap.html [2]: https://tools.ietf.org/html/rfc5649 Regards, -- Sehrope Sarkuni Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
Attachment
pgsql-hackers by date: