Thread: Cache invalidation after authentication (on-the-fly role creation)
Hello hackers, I'd like to do this to postinit.c: PerformAuthentication(MyProcPort); + AcceptInvalidationMessages(); InitializeSessionUserId(username, useroid); Any objections? Motivation: Many people use custom scripts, ldap2pg or other similar tools to synchronise or manage their PostgreSQL roles from an LDAP server. Doing that in a cron job works pretty well, but you have to wait for the next cron job run after you make a change. Based on several requests from the field (basically, ex-SQL Server shops, where they love Active Directory group-based authorisation), I started wondering if it would be crazy to do incremental per-user synchronisation at at the moment of authentication. New hooks don't seem to be necessary, since PAM is designed for things like that -- for example creating your Unix home directory on demand. But I ran into one small problem: PostgreSQL doesn't see catalog changes that happened during authentication because of caching. For example, you might put this in pg_hba.conf: host all all all pam pamservice=postgresql ... and then put this in /etc/pam.d/postgresql: auth required pam_ldap.so config=/path/to/ldap.conf auth required pam_exec.so /path/to/ldap2pg --sync-pam-user account required pam_permit.so That's a fictional ldap2pg option that would synchronise only the user passed to it in $PAM_USER. A high performance version could do authentication and role synchronisation at the same time (perhaps using the LDAP server's change notification/counter to skip having to connect back to PostgreSQL to inspect catalogs in the common case that nothing changed). If you try things like that today, it works but any roles created during authentication are (sometimes?) not visible to PostgreSQL until your *next* attempt to log in. -- Thomas Munro http://www.enterprisedb.com
On 2018-Jul-04, Thomas Munro wrote: > Hello hackers, > > I'd like to do this to postinit.c: > > PerformAuthentication(MyProcPort); > + AcceptInvalidationMessages(); > InitializeSessionUserId(username, useroid); > > Any objections? Is there a measurable performance overhead to this change? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2018-07-03 19:44:21 -0400, Alvaro Herrera wrote: > On 2018-Jul-04, Thomas Munro wrote: > > > Hello hackers, > > > > I'd like to do this to postinit.c: > > > > PerformAuthentication(MyProcPort); > > + AcceptInvalidationMessages(); > > InitializeSessionUserId(username, useroid); > > > > Any objections? > > Is there a measurable performance overhead to this change? I can't see it being relevant here. We accept inval message in plenty other places, and in comparison to session startup it should be just about inmeasurable. Greetings, Andres Freund
On Wed, Jul 4, 2018 at 12:10 PM, Andres Freund <andres@anarazel.de> wrote: > On 2018-07-03 19:44:21 -0400, Alvaro Herrera wrote: >> On 2018-Jul-04, Thomas Munro wrote: >> > PerformAuthentication(MyProcPort); >> > + AcceptInvalidationMessages(); >> > InitializeSessionUserId(username, useroid); >> > >> > Any objections? >> >> Is there a measurable performance overhead to this change? > > I can't see it being relevant here. We accept inval message in plenty > other places, and in comparison to session startup it should be just > about inmeasurable. Yeah, using "pgbench -c 8 -j 8 -T 60 --connect -S -M prepared postgres" I wasn't able to measure a significant difference on my laptop. The performance was equally terrible, at around 940 TPS +/- 10 including connection time. Adding to open commitfest. -- Thomas Munro http://www.enterprisedb.com
Attachment
On Wed, Jul 04, 2018 at 04:25:18PM +1200, Thomas Munro wrote: > Yeah, using "pgbench -c 8 -j 8 -T 60 --connect -S -M prepared > postgres" I wasn't able to measure a significant difference on my > laptop. The performance was equally terrible, at around 940 TPS +/- > 10 including connection time. Adding to open commitfest. I wanted to comment on that this morning but forgot as my mind was driven away by another problem. What if you used the Julien-Rouhaud's method of a custom script with only ";" used as query and -c? This won't run any queries, and will stress authentication. -- Michael
Attachment
On 2018-07-04 16:25:18 +1200, Thomas Munro wrote: > @@ -745,6 +746,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username, > /* normal multiuser case */ > Assert(MyProcPort != NULL); > PerformAuthentication(MyProcPort); > + AcceptInvalidationMessages(); > InitializeSessionUserId(username, useroid); > am_superuser = superuser(); FWIW, a comment explaining why it's done there seems appropriate. - Andres
On Wed, Jul 4, 2018 at 4:35 PM, Michael Paquier <michael@paquier.xyz> wrote: > I wanted to comment on that this morning but forgot as my mind was > driven away by another problem. What if you used the Julien-Rouhaud's > method of a custom script with only ";" used as query and -c? This > won't run any queries, and will stress authentication. Ok, I tried "pgbench -c 8 -j 8 -T 60 --connect -f empty.sql postgres" where empty.sql contains just ";", and the numbers were noisy but around 1160 TPS with or without a patch. On Wed, Jul 4, 2018 at 4:35 PM, Andres Freund <andres@anarazel.de> wrote: > On 2018-07-04 16:25:18 +1200, Thomas Munro wrote: >> PerformAuthentication(MyProcPort); >> + AcceptInvalidationMessages(); >> InitializeSessionUserId(username, useroid); > > FWIW, a comment explaining why it's done there seems appropriate. Done. -- Thomas Munro http://www.enterprisedb.com
Attachment
Thomas Munro <thomas.munro@enterprisedb.com> writes: > I'd like to do this to postinit.c: > PerformAuthentication(MyProcPort); > + AcceptInvalidationMessages(); > InitializeSessionUserId(username, useroid); > Any objections? That seems like a *really* ad-hoc place to put it. Why should it be there, and not (say) somewhere inside InitializeSessionUserId, or maybe (also?) inside PerformAuthentication? Why do the existing call sites for AcceptInvalidationMessages --- in particular, the ones associated with lock acquisition --- not solve the problem already? I'm not opposed to trying to make things better if we have a stale-cache problem during init, just dubious that this is how to do it. regards, tom lane
On Thu, Jul 5, 2018 at 9:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@enterprisedb.com> writes: >> PerformAuthentication(MyProcPort); >> + AcceptInvalidationMessages(); >> InitializeSessionUserId(username, useroid); > > That seems like a *really* ad-hoc place to put it. Why should it be > there, and not (say) somewhere inside InitializeSessionUserId, or maybe > (also?) inside PerformAuthentication? Why do the existing call sites for > AcceptInvalidationMessages --- in particular, the ones associated with > lock acquisition --- not solve the problem already? There is no lock acquisition involved here. The sequence is: 1. ClientAuthentication()->hba_getauthmethod()->check_hba()->get_role_oid() tries to look up user "fred" and doesn't find it (that lookup is used for group membership checks for hba line matching purposes, and not finding it here is not fatal, assuming you match with "all"); the cache now has a negative entry. 2. The configured authentication method runs, and via a side connection it creates the role "fred". 3. InitializeSessionUserId() looks up "fred", and finds the stale negative entry. I suppose the call to AcceptInvalidationMessages() could go at the end of ClientAuthentication(). That'd be closer to the code that creates the negative entry and immediately after the code that might modify the catalogs. Or in PeformAuthentication(), its caller. Or in IntializeSessionUserId() immediately before we try to look up the name, but that's also called by background workers that don't need it. -- Thomas Munro http://www.enterprisedb.com
Greetings Thomas, * Thomas Munro (thomas.munro@enterprisedb.com) wrote: > Many people use custom scripts, ldap2pg or other similar tools to > synchronise or manage their PostgreSQL roles from an LDAP server. While I don't have any particular opinion on the proposed patch, I wanted to share a thought about a possibly better approach to this kind of integration with AD: What if we (optionally, of course) had an always-running background worker which connects to AD and streams down the changes happening by using AD's Change Notification system: https://docs.microsoft.com/en-us/windows/desktop/AD/change-notifications-in-active-directory-domain-services Then integrate those changes into PG as they come in, avoiding the need for a cronjob. Again, I'm not expressing an opinion for or against the change you propose, just mentioning another approach to the general problem. I can see some advantages to waiting until an actual connection attempt to go create the role (you don't end up with roles in the system which never actually log into it) and advantages to using a background worker (the role will already be created, avoiding possible delay during the authentication and setup work of the role; more clear what roles have access on the system; ability to GRANT access to roles which haven't logged in yet or to set other attributes on the role prior to login). Thanks! Stephen
Attachment
On Fri, Jul 6, 2018 at 1:50 AM, Stephen Frost <sfrost@snowman.net> wrote: > What if we (optionally, of course) had an always-running background > worker which connects to AD and streams down the changes happening by > using AD's Change Notification system: > > https://docs.microsoft.com/en-us/windows/desktop/AD/change-notifications-in-active-directory-domain-services > > Then integrate those changes into PG as they come in, avoiding the need > for a cronjob. Hi Stephen, Yeah, that's a good idea. There are several change tracking techniques available[1], some of which might also be useful for the just-in-time sync technique I'm talking about (a cheap way to know if anything interesting changed since last time at the same time as you authenticate, so you can avoid doing any work at all most of the time). Maybe other LDAP vendors can do this type of stuff too (it looks like there was an attempt to standardise it, but it appears to be resting[2]...) Another idea would be for someone to take a tool like ldap2pg and give it a --daemon mode where it does that, and then package it up with rc.d/systemd/whatever glue so it's easy to deploy. Another idea, somewhere between that and your idea, is to guess that the main reason you really want to put this stuff in a background worker is because you want a long running process and you can't be bothered with the daemon management for an external thing... so... we could make a generic background worker extension that'll run arbitrary external long running programs while the database it up, and then something like ldap2pg could have a --stream-changes mode that'd do it until asked to shut down. I'm assuming it'd be better to keep the actual messy synchronisation problem outside the core PostgreSQL project, where a thousand tools can bloom. We just need the right hooks and cache invalidation logic so they can, and we're pretty close. > Again, I'm not expressing an opinion for or against the change you > propose, just mentioning another approach to the general problem. I can > see some advantages to waiting until an actual connection attempt to go > create the role (you don't end up with roles in the system which never > actually log into it) and advantages to using a background worker (the > role will already be created, avoiding possible delay during the > authentication and setup work of the role; more clear what roles have > access on the system; ability to GRANT access to roles which haven't > logged in yet or to set other attributes on the role prior to login). Another advantage to asynchronous schemes is that you can use group role names in pg_hba.conf (whereas that is checked before the auth module is invoked, so won't work with my synchronous just-in-time scheme; you need to use "all" in that case, though there is an easy workaround using LDAP search_filter). A disadvantage of your specific scheme is that it doesn't exist yet and probably needs to be written in C, and my scheme (despite its disadvantages) probably only requires slinging a few lines of Python/fooscript/barscript around. [1] https://docs.microsoft.com/en-us/windows/desktop/AD/overview-of-change-tracking-techniques [2] https://datatracker.ietf.org/doc/draft-dawkins-ldapext-subnot/ -- Thomas Munro http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes: > On Thu, Jul 5, 2018 at 9:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That seems like a *really* ad-hoc place to put it. Why should it be >> there, and not (say) somewhere inside InitializeSessionUserId, or maybe >> (also?) inside PerformAuthentication? Why do the existing call sites for >> AcceptInvalidationMessages --- in particular, the ones associated with >> lock acquisition --- not solve the problem already? > There is no lock acquisition involved here. The sequence is: > > 1. ClientAuthentication()->hba_getauthmethod()->check_hba()->get_role_oid() > tries to look up user "fred" and doesn't find it (that lookup is used > for group membership checks for hba line matching purposes, and not > finding it here is not fatal, assuming you match with "all"); the > cache now has a negative entry. > > 2. The configured authentication method runs, and via a side > connection it creates the role "fred". > > 3. InitializeSessionUserId() looks up "fred", and finds the stale > negative entry. [ thinks about that for awhile... ] So really this is an instance of a generic problem, which is that the result of a syscache lookup could be stale: it's not guaranteed to reflect changes that committed since the last AcceptInvalidationMessages call, which in the worst case is the start of the current transaction. We have workarounds in place that (mostly?) guarantee that relation-related lookups will get sufficiently up-to-date data, but there's nothing much protecting other catalogs such as pg_proc or pg_authid. I can't help thinking we're going to need to fix that eventually. But I can't think of any easy way to do so. Adding AcceptInvalidationMessages to SearchSysCache itself is right out for performance reasons, I'm afraid, unless we can somehow find a way to make the no-op path through it much cheaper than it is now. > I suppose the call to AcceptInvalidationMessages() could go at the end > of ClientAuthentication(). That'd be closer to the code that creates > the negative entry and immediately after the code that might modify > the catalogs. Or in PeformAuthentication(), its caller. Or in > IntializeSessionUserId() immediately before we try to look up the > name, but that's also called by background workers that don't need it. I think my preference would be to put it in InitializeSessionUserId so that it's clearly associated with the syscache lookups we're trying to protect. I don't entirely buy your argument that background workers don't need up-to-date info for this. regards, tom lane
On Fri, Jul 13, 2018 at 6:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@enterprisedb.com> writes: >> I suppose the call to AcceptInvalidationMessages() could go at the end >> of ClientAuthentication(). That'd be closer to the code that creates >> the negative entry and immediately after the code that might modify >> the catalogs. Or in PeformAuthentication(), its caller. Or in >> IntializeSessionUserId() immediately before we try to look up the >> name, but that's also called by background workers that don't need it. > > I think my preference would be to put it in InitializeSessionUserId > so that it's clearly associated with the syscache lookups we're trying > to protect. Thanks. Pushed to master that way. With this change, I can successfully use a little PAM module that authenticates with an LDAP server and then creates, grants and revokes roles as necessary based on LDAP groups. It's not as good as Stephen's grand plan, but it's dead simple. Without the change, it still works, but the first login attempt for a given user fails. I can live with that in the back branches. > I don't entirely buy your argument that background workers > don't need up-to-date info for this. Yeah, a hypothetical background worker that starts up and then waits to be told the name of the role to use could suffer from this problem. -- Thomas Munro http://www.enterprisedb.com