Thread: Cache invalidation after authentication (on-the-fly role creation)

Cache invalidation after authentication (on-the-fly role creation)

From

Thomas Munro

Date:

04 July 2018, 00:58:17

Hello hackers,

I'd like to do this to postinit.c:

                PerformAuthentication(MyProcPort);
+               AcceptInvalidationMessages();
                InitializeSessionUserId(username, useroid);

Any objections?  Motivation:

Many people use custom scripts, ldap2pg or other similar tools to
synchronise or manage their PostgreSQL roles from an LDAP server.
Doing that in a cron job works pretty well, but you have to wait for
the next cron job run after you make a change.  Based on several
requests from the field (basically, ex-SQL Server shops, where they
love Active Directory group-based authorisation), I started wondering
if it would be crazy to do incremental per-user synchronisation at at
the moment of authentication.  New hooks don't seem to be necessary,
since PAM is designed for things like that -- for example creating
your Unix home directory on demand.  But I ran into one small problem:
PostgreSQL doesn't see catalog changes that happened during
authentication because of caching.

For example, you might put this in pg_hba.conf:

host all all all pam pamservice=postgresql

... and then put this in /etc/pam.d/postgresql:

auth required pam_ldap.so config=/path/to/ldap.conf
auth required pam_exec.so /path/to/ldap2pg --sync-pam-user
account required pam_permit.so

That's a fictional ldap2pg option that would synchronise only the user
passed to it in $PAM_USER.  A high performance version could do
authentication and role synchronisation at the same time (perhaps
using the LDAP server's change notification/counter to skip having to
connect back to PostgreSQL to inspect catalogs in the common case that
nothing changed).

If you try things like that today, it works but any roles created
during authentication are (sometimes?) not visible to PostgreSQL until
your *next* attempt to log in.

-- 
Thomas Munro
http://www.enterprisedb.com

Re: Cache invalidation after authentication (on-the-fly rolecreation)

From

Alvaro Herrera

Date:

04 July 2018, 02:44:21

On 2018-Jul-04, Thomas Munro wrote:

> Hello hackers,
> 
> I'd like to do this to postinit.c:
> 
>                 PerformAuthentication(MyProcPort);
> +               AcceptInvalidationMessages();
>                 InitializeSessionUserId(username, useroid);
> 
> Any objections?

Is there a measurable performance overhead to this change?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Cache invalidation after authentication (on-the-fly rolecreation)

From

Andres Freund

Date:

04 July 2018, 03:10:32

Hi,

On 2018-07-03 19:44:21 -0400, Alvaro Herrera wrote:
> On 2018-Jul-04, Thomas Munro wrote:
> 
> > Hello hackers,
> > 
> > I'd like to do this to postinit.c:
> > 
> >                 PerformAuthentication(MyProcPort);
> > +               AcceptInvalidationMessages();
> >                 InitializeSessionUserId(username, useroid);
> > 
> > Any objections?
> 
> Is there a measurable performance overhead to this change?

I can't see it being relevant here. We accept inval message in plenty
other places, and in comparison to session startup it should be just
about inmeasurable.

Greetings,

Andres Freund

Re: Cache invalidation after authentication (on-the-fly role creation)

From

Thomas Munro

Date:

04 July 2018, 07:25:18

On Wed, Jul 4, 2018 at 12:10 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-07-03 19:44:21 -0400, Alvaro Herrera wrote:
>> On 2018-Jul-04, Thomas Munro wrote:
>> >                 PerformAuthentication(MyProcPort);
>> > +               AcceptInvalidationMessages();
>> >                 InitializeSessionUserId(username, useroid);
>> >
>> > Any objections?
>>
>> Is there a measurable performance overhead to this change?
>
> I can't see it being relevant here. We accept inval message in plenty
> other places, and in comparison to session startup it should be just
> about inmeasurable.

Yeah, using "pgbench -c 8 -j 8 -T 60 --connect -S -M prepared
postgres" I wasn't able to measure a significant difference on my
laptop.  The performance was equally terrible, at around 940 TPS +/-
10 including connection time.  Adding to open commitfest.

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment

0001-Call-AcceptInvalidationMessages-after-authenticating.patch

Re: Cache invalidation after authentication (on-the-fly rolecreation)

From

Michael Paquier

Date:

04 July 2018, 07:35:23

On Wed, Jul 04, 2018 at 04:25:18PM +1200, Thomas Munro wrote:
> Yeah, using "pgbench -c 8 -j 8 -T 60 --connect -S -M prepared
> postgres" I wasn't able to measure a significant difference on my
> laptop.  The performance was equally terrible, at around 940 TPS +/-
> 10 including connection time.  Adding to open commitfest.

I wanted to comment on that this morning but forgot as my mind was
driven away by another problem.  What if you used the Julien-Rouhaud's
method of a custom script with only ";" used as query and -c?  This
won't run any queries, and will stress authentication.
--
Michael

Attachment

signature.asc

Re: Cache invalidation after authentication (on-the-fly rolecreation)

From

Andres Freund

Date:

04 July 2018, 07:35:57

On 2018-07-04 16:25:18 +1200, Thomas Munro wrote:
> @@ -745,6 +746,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
>          /* normal multiuser case */
>          Assert(MyProcPort != NULL);
>          PerformAuthentication(MyProcPort);
> +        AcceptInvalidationMessages();
>          InitializeSessionUserId(username, useroid);
>          am_superuser = superuser();

FWIW, a comment explaining why it's done there seems appropriate.

- Andres

Re: Cache invalidation after authentication (on-the-fly role creation)

From

Thomas Munro

Date:

04 July 2018, 12:37:41

On Wed, Jul 4, 2018 at 4:35 PM, Michael Paquier <michael@paquier.xyz> wrote:
> I wanted to comment on that this morning but forgot as my mind was
> driven away by another problem.  What if you used the Julien-Rouhaud's
> method of a custom script with only ";" used as query and -c?  This
> won't run any queries, and will stress authentication.

Ok, I tried "pgbench -c 8 -j 8 -T 60 --connect -f empty.sql postgres"
where empty.sql contains just ";", and the numbers were noisy but
around 1160 TPS with or without a patch.

On Wed, Jul 4, 2018 at 4:35 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-07-04 16:25:18 +1200, Thomas Munro wrote:
>>               PerformAuthentication(MyProcPort);
>> +             AcceptInvalidationMessages();
>>               InitializeSessionUserId(username, useroid);
>
> FWIW, a comment explaining why it's done there seems appropriate.

Done.

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment

0001-Call-AcceptInvalidationMessages-after-authenticat-v2.patch

Re: Cache invalidation after authentication (on-the-fly role creation)

From

Tom Lane

Date:

05 July 2018, 00:35:05

Thomas Munro <thomas.munro@enterprisedb.com> writes:
> I'd like to do this to postinit.c:

>                 PerformAuthentication(MyProcPort);
> +               AcceptInvalidationMessages();
>                 InitializeSessionUserId(username, useroid);

> Any objections?

That seems like a *really* ad-hoc place to put it.  Why should it be
there, and not (say) somewhere inside InitializeSessionUserId, or maybe
(also?) inside PerformAuthentication?  Why do the existing call sites for
AcceptInvalidationMessages --- in particular, the ones associated with
lock acquisition --- not solve the problem already?

I'm not opposed to trying to make things better if we have a stale-cache
problem during init, just dubious that this is how to do it.

            regards, tom lane

Re: Cache invalidation after authentication (on-the-fly role creation)

From

Thomas Munro

Date:

05 July 2018, 16:07:04

On Thu, Jul 5, 2018 at 9:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>>                 PerformAuthentication(MyProcPort);
>> +               AcceptInvalidationMessages();
>>                 InitializeSessionUserId(username, useroid);
>
> That seems like a *really* ad-hoc place to put it.  Why should it be
> there, and not (say) somewhere inside InitializeSessionUserId, or maybe
> (also?) inside PerformAuthentication?  Why do the existing call sites for
> AcceptInvalidationMessages --- in particular, the ones associated with
> lock acquisition --- not solve the problem already?

There is no lock acquisition involved here.  The sequence is:

1.  ClientAuthentication()->hba_getauthmethod()->check_hba()->get_role_oid()
tries to look up user "fred" and doesn't find it (that lookup is used
for group membership checks for hba line matching purposes, and not
finding it here is not fatal, assuming you match with "all"); the
cache now has a negative entry.

2.  The configured authentication method runs, and via a side
connection it creates the role "fred".

3.  InitializeSessionUserId() looks up "fred", and finds the stale
negative entry.

I suppose the call to AcceptInvalidationMessages() could go at the end
of ClientAuthentication().  That'd be closer to the code that creates
the negative entry and immediately after the code that might modify
the catalogs.  Or in PeformAuthentication(), its caller.  Or in
IntializeSessionUserId() immediately before we try to look up the
name, but that's also called by background workers that don't need it.

-- 
Thomas Munro
http://www.enterprisedb.com

Re: Cache invalidation after authentication (on-the-fly rolecreation)

From

Stephen Frost

Date:

05 July 2018, 16:50:49

Greetings Thomas,

* Thomas Munro (thomas.munro@enterprisedb.com) wrote:
> Many people use custom scripts, ldap2pg or other similar tools to
> synchronise or manage their PostgreSQL roles from an LDAP server.

While I don't have any particular opinion on the proposed patch, I
wanted to share a thought about a possibly better approach to this kind
of integration with AD:

What if we (optionally, of course) had an always-running background
worker which connects to AD and streams down the changes happening by
using AD's Change Notification system:

https://docs.microsoft.com/en-us/windows/desktop/AD/change-notifications-in-active-directory-domain-services

Then integrate those changes into PG as they come in, avoiding the need
for a cronjob.

Again, I'm not expressing an opinion for or against the change you
propose, just mentioning another approach to the general problem.  I can
see some advantages to waiting until an actual connection attempt to go
create the role (you don't end up with roles in the system which never
actually log into it) and advantages to using a background worker (the
role will already be created, avoiding possible delay during the
authentication and setup work of the role; more clear what roles have
access on the system; ability to GRANT access to roles which haven't
logged in yet or to set other attributes on the role prior to login).

Thanks!

Stephen

Attachment

signature.asc

Re: Cache invalidation after authentication (on-the-fly role creation)

From

Thomas Munro

Date:

06 July 2018, 02:40:52

On Fri, Jul 6, 2018 at 1:50 AM, Stephen Frost <sfrost@snowman.net> wrote:
> What if we (optionally, of course) had an always-running background
> worker which connects to AD and streams down the changes happening by
> using AD's Change Notification system:
>
> https://docs.microsoft.com/en-us/windows/desktop/AD/change-notifications-in-active-directory-domain-services
>
> Then integrate those changes into PG as they come in, avoiding the need
> for a cronjob.

Hi Stephen,

Yeah, that's a good idea.  There are several change tracking
techniques available[1], some of which might also be useful for the
just-in-time sync technique I'm talking about (a cheap way to know if
anything interesting changed since last time at the same time as you
authenticate, so you can avoid doing any work at all most of the
time).  Maybe other LDAP vendors can do this type of stuff too (it
looks like there was an attempt to standardise it, but it appears to
be resting[2]...)

Another idea would be for someone to take a tool like ldap2pg and give
it a --daemon mode where it does that, and then package it up with
rc.d/systemd/whatever glue so it's easy to deploy.

Another idea, somewhere between that and your idea, is to guess that
the main reason you really want to put this stuff in a background
worker is because you want a long running process and you can't be
bothered with the daemon management for an external thing... so... we
could make a generic background worker extension that'll run arbitrary
external long running programs while the database it up, and then
something like ldap2pg could have a --stream-changes mode that'd do it
until asked to shut down.

I'm assuming it'd be better to keep the actual messy synchronisation
problem outside the core PostgreSQL project, where a thousand tools
can bloom.  We just need the right hooks and cache invalidation logic
so they can, and we're pretty close.

> Again, I'm not expressing an opinion for or against the change you
> propose, just mentioning another approach to the general problem.  I can
> see some advantages to waiting until an actual connection attempt to go
> create the role (you don't end up with roles in the system which never
> actually log into it) and advantages to using a background worker (the
> role will already be created, avoiding possible delay during the
> authentication and setup work of the role; more clear what roles have
> access on the system; ability to GRANT access to roles which haven't
> logged in yet or to set other attributes on the role prior to login).

Another advantage to asynchronous schemes is that you can use group
role names in pg_hba.conf (whereas that is checked before the auth
module is invoked, so won't work with my synchronous just-in-time
scheme; you need to use "all" in that case, though there is an easy
workaround using LDAP search_filter).

A disadvantage of your specific scheme is that it doesn't exist yet
and probably needs to be written in C, and my scheme (despite its
disadvantages) probably only requires slinging a few lines of
Python/fooscript/barscript around.

[1] https://docs.microsoft.com/en-us/windows/desktop/AD/overview-of-change-tracking-techniques
[2] https://datatracker.ietf.org/doc/draft-dawkins-ldapext-subnot/

-- 
Thomas Munro
http://www.enterprisedb.com

Re: Cache invalidation after authentication (on-the-fly role creation)

From

Tom Lane

Date:

12 July 2018, 21:52:09

Thomas Munro <thomas.munro@enterprisedb.com> writes:
> On Thu, Jul 5, 2018 at 9:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That seems like a *really* ad-hoc place to put it.  Why should it be
>> there, and not (say) somewhere inside InitializeSessionUserId, or maybe
>> (also?) inside PerformAuthentication?  Why do the existing call sites for
>> AcceptInvalidationMessages --- in particular, the ones associated with
>> lock acquisition --- not solve the problem already?

> There is no lock acquisition involved here.  The sequence is:
>
> 1.  ClientAuthentication()->hba_getauthmethod()->check_hba()->get_role_oid()
> tries to look up user "fred" and doesn't find it (that lookup is used
> for group membership checks for hba line matching purposes, and not
> finding it here is not fatal, assuming you match with "all"); the
> cache now has a negative entry.
>
> 2.  The configured authentication method runs, and via a side
> connection it creates the role "fred".
>
> 3.  InitializeSessionUserId() looks up "fred", and finds the stale
> negative entry.

[ thinks about that for awhile... ]  So really this is an instance of
a generic problem, which is that the result of a syscache lookup could
be stale: it's not guaranteed to reflect changes that committed since
the last AcceptInvalidationMessages call, which in the worst case is
the start of the current transaction.  We have workarounds in place
that (mostly?) guarantee that relation-related lookups will get
sufficiently up-to-date data, but there's nothing much protecting other
catalogs such as pg_proc or pg_authid.

I can't help thinking we're going to need to fix that eventually.  But
I can't think of any easy way to do so.  Adding AcceptInvalidationMessages
to SearchSysCache itself is right out for performance reasons, I'm afraid,
unless we can somehow find a way to make the no-op path through it much
cheaper than it is now.

> I suppose the call to AcceptInvalidationMessages() could go at the end
> of ClientAuthentication().  That'd be closer to the code that creates
> the negative entry and immediately after the code that might modify
> the catalogs.  Or in PeformAuthentication(), its caller.  Or in
> IntializeSessionUserId() immediately before we try to look up the
> name, but that's also called by background workers that don't need it.

I think my preference would be to put it in InitializeSessionUserId
so that it's clearly associated with the syscache lookups we're trying
to protect.  I don't entirely buy your argument that background workers
don't need up-to-date info for this.

            regards, tom lane

Re: Cache invalidation after authentication (on-the-fly role creation)

From

Thomas Munro

Date:

13 July 2018, 07:23:02

On Fri, Jul 13, 2018 at 6:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> I suppose the call to AcceptInvalidationMessages() could go at the end
>> of ClientAuthentication().  That'd be closer to the code that creates
>> the negative entry and immediately after the code that might modify
>> the catalogs.  Or in PeformAuthentication(), its caller.  Or in
>> IntializeSessionUserId() immediately before we try to look up the
>> name, but that's also called by background workers that don't need it.
>
> I think my preference would be to put it in InitializeSessionUserId
> so that it's clearly associated with the syscache lookups we're trying
> to protect.

Thanks.  Pushed to master that way.

With this change, I can successfully use a little PAM module that
authenticates with an LDAP server and then creates, grants and revokes
roles as necessary based on LDAP groups.  It's not as good as
Stephen's grand plan, but it's dead simple.  Without the change, it
still works, but the first login attempt for a given user fails.  I
can live with that in the back branches.

> I don't entirely buy your argument that background workers
> don't need up-to-date info for this.

Yeah, a hypothetical background worker that starts up and then waits
to be told the name of the role to use could suffer from this problem.

-- 
Thomas Munro
http://www.enterprisedb.com