[PoC] Federated Authn/z with OAUTHBEARER - Mailing list pgsql-hackers
From: Jacob Champion
Subject: [PoC] Federated Authn/z with OAUTHBEARER
Msg-id: d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
List: pgsql-hackers
Hi all,

We've been working on ways to expand the list of third-party auth methods that Postgres provides. Some example use cases might be "I want to let anyone with a Google account read this table" or "let anyone who belongs to this GitHub organization connect as a superuser".

Attached is a proof of concept that implements pieces of OAuth 2.0 federated authorization, via the OAUTHBEARER SASL mechanism from RFC 7628 [1]. Currently, only Linux is supported due to some ugly hacks in the backend.

The architecture can support the following use cases, as long as your OAuth issuer of choice implements the necessary specs, and you know how to write a validator for your issuer's bearer tokens:

- Authentication only, where an external validator uses the bearer token to determine the end user's identity, and Postgres decides whether that user ID is authorized to connect via the standard pg_ident user mapping.

- Authorization only, where the validator uses the bearer token to determine the allowed roles for the end user, and then checks to make sure that the connection's role is one of those. This bypasses pg_ident and allows pseudonymous connections, where Postgres doesn't care who you are as long as the token proves you're allowed to assume the role you want.

- A combination, where the validator provides both an authn_id (for later audits of database access) and an authorization decision based on the bearer token and role provided.

It looks kinda like this during use:

    $ psql 'host=example.org oauth_client_id=f02c6361-0635-...'
    Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG

= Quickstart =

For anyone who likes building and seeing green tests ASAP.

Prerequisite software:
- iddawc v0.9.9 [2], library and dev headers, for client support
- Python 3, for the test suite only

(Some newer distributions have dev packages for iddawc, but mine did not.)
Configure using --with-oauth (and, if you've installed iddawc into a non-standard location, be sure to use --with-includes and --with-libraries. Make sure either rpath or LD_LIBRARY_PATH will get you what you need). Install as usual.

To run the test suite, make sure the contrib/authn_id extension is installed, then init and start your dev cluster. No other configuration is required; the test will do it for you. Switch to the src/test/python directory, point your PG* envvars to a superuser connection on the cluster (so that a "bare" psql will connect automatically), and run `make installcheck`.

= Production Setup =

(but don't use this in production, please)

Actually setting up a "real" system requires knowing the specifics of your third-party issuer of choice. Your issuer MUST implement OpenID Discovery and the OAuth Device Authorization flow! Seriously, check this before spending a lot of time writing a validator against an issuer that can't actually talk to libpq. The broad strokes are as follows:

1. Register a new public client with your issuer to get an OAuth client ID for libpq. You'll use this as the oauth_client_id in the connection string. (If your issuer doesn't support public clients and gives you a client secret, you can use the oauth_client_secret connection parameter to provide that too.) The client you register must be able to use a device authorization flow; some issuers require additional setup for that.

2. Set up your HBA with the 'oauth' auth method, and set the 'issuer' and 'scope' options. 'issuer' is the base URL identifying your third-party issuer (for example, https://accounts.google.com), and 'scope' is the set of OAuth scopes that the client and server will need to authenticate and/or authorize the user (e.g. "openid email"). So a sample HBA line might look like

       host all all samehost oauth issuer="https://accounts.google.com" scope="openid email"

3. In postgresql.conf, set up an oauth_validator_command that's capable of verifying bearer tokens and implements the validator protocol. This is the hardest part. See below.

= Design =

On the client side, I've implemented the Device Authorization flow (RFC 8628 [3]). What this means in practice is that libpq reaches out to a third-party issuer (e.g. Google, Azure, etc.), identifies itself with a client ID, and requests permission to act on behalf of the end user. The issuer responds with a login URL and a one-time code, which libpq presents to the user using the notice hook. The end user then navigates to that URL, presents their code, authenticates to the issuer, and grants permission for libpq to retrieve a bearer token. libpq grabs a token and sends it to the server for verification.

(The bearer token, in this setup, is essentially a plaintext password, and you must secure it like you would a plaintext password. The token has an expiration date and can be explicitly revoked, which makes it slightly better than a password, but this is still a step backwards from something like SCRAM with channel binding. There are ways to bind a bearer token to a client certificate [4], which would mitigate the risk of token theft -- but your issuer has to support that, and I haven't found much support in the wild.)

The server side is where things get more difficult for the DBA. The OAUTHBEARER spec has this to say about the server side implementation:

    The server validates the response according to the specification
    for the OAuth Access Token Types used.

And here's what the Bearer Token specification [5] says:

    This document does not specify the encoding or the contents of the
    token; hence, detailed recommendations about the means of
    guaranteeing token integrity protection are outside the scope of
    this document.

It's the Wild West. Every issuer does their own thing in their own special way.
Some don't really give you a way to introspect information about a bearer token at all, because they assume that the issuer of the token and the consumer of the token are essentially the same service. Some major players provide their own custom libraries, implemented in your-language-of-choice, to deal with their particular brand of magic.

So I punted and added the oauth_validator_command GUC. A token validator command reads the bearer token from a file descriptor that's passed to it, then does whatever magic is necessary to validate that token and find out who owns it. Optionally, it can look at the role that's being connected and make sure that the token authorizes the user to actually use that role. Then it says yea or nay to Postgres, and optionally tells the server who the user is so that their ID can be logged and mapped through pg_ident. (See the commit message in 0005 for a full description of the protocol. The test suite also has two toy implementations that illustrate the protocol, but they provide zero security.)

This is easily the worst part of the patch, not only because my implementation is a bad hack on OpenPipeStream(), but because it balances the security of the entire system on the shoulders of a DBA who does not have time to read umpteen OAuth specifications cover to cover. More thought and coding effort is needed here, but I didn't want to gold-plate a bad design. I'm not sure what alternatives there are within the rules laid out by OAUTHBEARER. And the system is _extremely_ flexible, in the way that only code that's maintained by somebody else can be.

= Patchset Roadmap =

The seven patches can be grouped into three:

1. Prep

   0001 decouples the SASL code from the SCRAM implementation.
   0002 makes it possible to use common/jsonapi from the frontend.
   0003 lets the json_errdetail() result be freed, to avoid leaks.

2. OAUTHBEARER Implementation

   0004 implements the client with libiddawc.
   0005 implements server HBA support and oauth_validator_command.
3. Testing

   0006 adds a simple test extension to retrieve the authn_id.
   0007 adds the Python test suite I've been developing against.

The first three patches are, hopefully, generally useful outside of this implementation, and I'll plan to register them in the next commitfest. The middle two patches are the "interesting" pieces, and I've split them into client and server for ease of understanding, though neither is particularly useful without the other.

The last two patches grew out of a test suite that I originally built to be able to exercise NSS corner cases at the protocol/byte level. It was incredibly helpful during implementation of this new SASL mechanism, since I could write the client and server independently of each other and get high coverage of broken/malicious implementations. It's based on pytest and Construct, and the Python 3 requirement might turn some away, but I wanted to include it in case anyone else wanted to hack on the code. src/test/python/README explains more.

= Thoughts/Reflections =

...in no particular order.

I picked OAuth 2.0 as my first experiment in federated auth mostly because I was already familiar with pieces of it. I think SAML (via the SAML20 mechanism, RFC 6595) would be a good companion to this proof of concept, if there is general interest in federated deployments.

I don't really like the OAUTHBEARER spec, but I'm not sure there's a better alternative. Everything is left as an exercise for the reader. It's not particularly extensible. Standard OAuth is built for authorization, not authentication, and from reading the RFC's history, it feels like it was a hack to just get something working. New standards like OpenID Connect have begun to fill in the gaps, but the SASL mechanisms have not kept up. (The OPENID20 mechanism is, to my understanding, unrelated/obsolete.) And support for helpful OIDC features seems to be spotty in the real world.
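As an aside for anyone who hasn't worked with the Device Authorization grant before, here is a rough, offline sketch of the two exchanges described under = Design = above. The JSON field names come from RFC 8628; the endpoint URL, codes, and token values below are fabricated for illustration and are not taken from the patch:

```python
import json

# Step 1: the client POSTs its client_id to the issuer's device
# authorization endpoint. A successful response (RFC 8628, section 3.2)
# looks something like this (values fabricated; in a real client this
# would come over HTTPS, not a literal):
device_response = json.loads("""
{
  "device_code": "GmRhmhcxhwAzkoEqiMEg-example",
  "user_code": "FPQ2-M4BG",
  "verification_uri": "https://oauth.example.org/login",
  "expires_in": 1800,
  "interval": 5
}
""")

# Step 2: show the user where to go, as libpq does via the notice hook.
prompt = (f"Visit {device_response['verification_uri']} "
          f"and enter the code: {device_response['user_code']}")

# Step 3: the client polls the token endpoint every "interval" seconds
# with the device_code. While the user is still logging in, the issuer
# answers {"error": "authorization_pending"}; once the user approves,
# it returns the bearer token (again, values fabricated):
token_response = json.loads("""
{
  "access_token": "opaque-to-the-client-example-token",
  "token_type": "Bearer",
  "expires_in": 3600
}
""")

# Step 4: this access_token is what the client sends to the server
# inside the OAUTHBEARER SASL exchange.
assert token_response["token_type"] == "Bearer"
bearer = token_response["access_token"]
```

The actual HTTP legs are omitted here; the point is only the shape of the handshake that iddawc (or a hand-rolled libcurl client) has to drive.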
The iddawc dependency for client-side OAuth was extremely helpful to develop this proof of concept quickly, but I don't think it would be an appropriate component to build a real feature on. It's extremely heavyweight -- it incorporates a huge stack of dependencies, including a logging framework and a web server, to implement features we would probably never use -- and it's fairly difficult to debug in practice. If a device authorization flow were the only thing that libpq needed to support natively, I think we should just depend on a widely used HTTP client, like libcurl or neon, and implement the minimum spec directly against the existing test suite.

There are a huge number of other authorization flows besides Device Authorization; most would involve libpq automatically opening a web browser for you. I felt like that wasn't an appropriate thing for a library to do by default, especially when one of the most important clients is a command-line application. Perhaps there could be a hook for applications to be able to override the builtin flow and substitute their own.

Since bearer tokens are essentially plaintext passwords, the relevant specs require the use of transport-level protection, and I think it'd be wise for the client to require TLS to be in place before performing the initial handshake or sending a token.

Not every OAuth issuer is also an OpenID Discovery provider, so it's frustrating that OAUTHBEARER (which is purportedly an OAuth 2.0 feature) requires OIDD for real-world implementations. Perhaps we could hack around this with a data: URI or something.

The client currently performs the OAuth login dance every single time a connection is made, but a proper OAuth client would cache its tokens to reuse later, and keep an eye on their expiration times. This would make daily use a little more like that of Kerberos, but we would have to design a way to create and secure a token cache on disk.
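To make the caching idea concrete, here's a minimal in-memory sketch of the bookkeeping such a cache would do. This is not from the patch; the class, the (issuer, client_id) key, and the expiry-skew constant are all hypothetical, and a real cache would additionally need secure on-disk storage and refresh-token handling:

```python
import time

class TokenCache:
    """Toy in-memory bearer token cache keyed by (issuer, client_id).

    Hypothetical sketch: tracks only expiration. A real implementation
    would persist tokens securely and use refresh tokens when possible.
    """

    # Refuse to reuse a token within this many seconds of expiry, so we
    # don't hand the server a token that dies mid-handshake.
    SKEW = 60

    def __init__(self):
        self._tokens = {}

    def store(self, issuer, client_id, access_token, expires_in):
        # expires_in is the lifetime in seconds, as reported by the
        # issuer alongside the access token.
        deadline = time.monotonic() + expires_in
        self._tokens[(issuer, client_id)] = (access_token, deadline)

    def lookup(self, issuer, client_id):
        """Return a cached token, or None if the flow must be re-run."""
        entry = self._tokens.get((issuer, client_id))
        if entry is None:
            return None
        token, deadline = entry
        if time.monotonic() + self.SKEW >= deadline:
            # Expired (or close enough): evict and force a fresh login.
            del self._tokens[(issuer, client_id)]
            return None
        return token

cache = TokenCache()
cache.store("https://accounts.example.com", "f02c6361", "tok-abc", 3600)
```

Even this toy version shows the design questions: keying (per issuer? per user?), expiry skew, and eviction, before you ever get to the hard part of protecting the cache file on disk.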
If you've read this far, thank you for your interest, and I hope you enjoy playing with it!

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc7628
[2] https://github.com/babelouest/iddawc
[3] https://datatracker.ietf.org/doc/html/rfc8628
[4] https://datatracker.ietf.org/doc/html/rfc8705
[5] https://datatracker.ietf.org/doc/html/rfc6750#section-5.2
Attachments:
- 0001-auth-generalize-SASL-mechanisms.patch
- 0002-src-common-remove-logging-from-jsonapi-for-shlib.patch
- 0003-common-jsonapi-always-palloc-the-error-strings.patch
- 0004-libpq-add-OAUTHBEARER-SASL-mechanism.patch
- 0005-backend-add-OAUTHBEARER-SASL-mechanism.patch
- 0006-Add-a-very-simple-authn_id-extension.patch
- 0007-Add-pytest-suite-for-OAuth.patch