Re: Built-in connection pooling - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Built-in connection pooling |
Date | |
Msg-id | CA+TgmoZEhwHW1aJYE-MMUrT8yBshQODfKiTQTQbhv87fT5gxYQ@mail.gmail.com |
In response to | Re: Built-in connection pooling (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: Built-in connection pooling, Re: Built-in connection pooling |
List | pgsql-hackers |
On Wed, Apr 18, 2018 at 9:41 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Well, maybe I missed something, but I do not know how to efficiently
>> support
>> 1. Temporary tables
>> 2. Prepared statements
>> 3. Session GUCs
>> with any external connection pooler (with pooling level other than
>> session).
>
> Me neither. What makes it easier to do these things in an internal
> connection pooler? What could the backend do differently, to make these
> easier to implement in an external pooler?

I think you and Konstantin are possibly failing to see the big picture here. Temporary tables, prepared statements, and GUC settings are examples of session state that users expect will be preserved for the lifetime of a connection and not beyond; all session state, of whatever kind, has the same set of problems. A transparent connection pooling experience means guaranteeing that no such state vanishes before the user ends the current session, and also that no such state established by some other session becomes visible in the current session. And we really need to account for *all* such state, not just really big things like temporary tables and prepared statements and GUCs but also much subtler things such as the state of the PRNG established by srandom().

This is really very similar to the problem that parallel query has when spinning up new worker backends. As far as possible, we want the worker backends to have the same state as the original backend. However, there's no systematic way of being sure that every relevant backend-private global, including perhaps globals added by loadable modules, is in exactly the same state. For parallel query, we solved that problem by copying a bunch of things that we knew were commonly-used (cf. parallel.c) and by requiring functions to be labeled as parallel-restricted if they rely on any other state. The problem for connection pooling is much harder.
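To make the hidden-state problem concrete, here is a minimal sketch in plain C. It is not PostgreSQL code: `ext_next_id()`, `ext_call_count`, and `pooler_reuse_process_for_new_session()` are hypothetical names standing in for a loadable module, its private global, and a pooler that hands the same process to a new session. The only real library calls are POSIX `srandom()` and `random()`.

```c
#define _XOPEN_SOURCE 600       /* exposes srandom() and random() */
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical extension module: its state lives in a process-private
 * global that the core system has no way to enumerate.
 */
static int ext_call_count = 0;

int
ext_next_id(void)
{
    return ++ext_call_count;
}

/*
 * Hypothetical pooler step: hand the same process to a new session.
 * It can reset only the state it knows about, which here is nothing.
 */
void
pooler_reuse_process_for_new_session(void)
{
    /* no knowledge of ext_call_count or of the PRNG state */
}

/*
 * Returns the first id a new session sees after an earlier session made
 * n calls in the same process.  A genuinely fresh session would see 1.
 */
int
first_id_after_reuse(int n)
{
    ext_call_count = 0;         /* simulate process start */
    for (int i = 0; i < n; i++)
        ext_next_id();          /* earlier session's activity */
    pooler_reuse_process_for_new_session();
    return ext_next_id();       /* later session's first call */
}

/*
 * Returns 1 when the later session's first random() draw is fully
 * determined by the seed the earlier session chose.
 */
int
prng_seed_leaks(unsigned int seed)
{
    long    a, b;

    srandom(seed);              /* earlier session seeds the PRNG */
    pooler_reuse_process_for_new_session();
    a = random();               /* later session draws */

    srandom(seed);              /* repeat the experiment */
    pooler_reuse_process_for_new_session();
    b = random();

    return a == b;
}
```

Under these assumptions, `first_id_after_reuse(3)` yields 4 rather than 1, and `prng_seed_leaks()` reports that the later session inherits the earlier session's seed. Neither leak is visible to a pooler that does not know the state exists.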
If you only ever ran parallel-safe functions throughout the lifetime of a session, then you would know that the session has no "hidden state" other than what parallel.c already knows about (except for any functions that are mislabeled, but we can say that's the user's fault for mislabeling them). But as soon as you run even one parallel-restricted or parallel-unsafe function, there might be a global variable someplace that holds arbitrary state which the core system won't know anything about. If you want to have some other process take over that session, you need to copy that state to the new process; if you want to reuse the current process for a new session, you need to clear that state. Since you don't know it exists or where to find it, and since the code to copy and/or clear it might not even exist, you can't.

In other words, transparent connection pooling is going to require some new mechanism, which third-party code will have to know about, for tracking every last bit of session state that might need to be preserved or cleared. That's going to be a big project. Maybe some of that can piggyback on existing infrastructure like InvalidateSystemCaches(), but there's probably still a ton of ad-hoc state to deal with. And no out-of-core pooler has a chance of handling all that stuff correctly; an in-core pooler will be able to do so only with a lot of work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
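As a rough illustration of what such a tracking mechanism might look like, the sketch below has every module register each piece of session state along with a way to reset it. Everything here is hypothetical: `register_session_state()`, `save_session_state()`, `restore_session_state()`, `reset_session_state()`, and the `ext_*` module are invented for this example and do not exist in PostgreSQL. It handles only flat, fixed-size state; pointers, catalog entries, and things like temporary tables would need real serialization.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 16

/* Hypothetical registry entry: one piece of session state. */
typedef struct SessionStateItem
{
    void   *addr;                   /* where the state lives */
    size_t  size;                   /* bytes to copy on save/restore */
    void  (*reset) (void *addr);    /* return it to its fresh-session value */
} SessionStateItem;

static SessionStateItem items[MAX_ITEMS];
static int n_items = 0;

/* Called by core code and by every module that keeps session state. */
void
register_session_state(void *addr, size_t size, void (*reset) (void *))
{
    assert(n_items < MAX_ITEMS);
    items[n_items].addr = addr;
    items[n_items].size = size;
    items[n_items].reset = reset;
    n_items++;
}

/* Snapshot all registered state, e.g. before parking a session. */
void
save_session_state(char *buf)
{
    for (int i = 0; i < n_items; i++, buf += items[i - 1].size)
        memcpy(buf, items[i].addr, items[i].size);
}

/* Install a previously saved snapshot into this process. */
void
restore_session_state(const char *buf)
{
    for (int i = 0; i < n_items; i++, buf += items[i - 1].size)
        memcpy(items[i].addr, buf, items[i].size);
}

/* Clear all registered state before reusing the process. */
void
reset_session_state(void)
{
    for (int i = 0; i < n_items; i++)
        items[i].reset(items[i].addr);
}

/* Hypothetical module that opts in by registering its one global. */
static int ext_call_count = 0;

static void
ext_reset(void *addr)
{
    *(int *) addr = 0;
}

int
ext_next_id(void)
{
    return ++ext_call_count;
}

/* Walk-through: park a session, serve a fresh one, resume the first. */
int
demo_park_and_resume(void)
{
    char    snapshot[64];

    register_session_state(&ext_call_count, sizeof(ext_call_count), ext_reset);
    ext_next_id();
    ext_next_id();                  /* first session has used ids 1 and 2 */
    save_session_state(snapshot);
    reset_session_state();
    assert(ext_next_id() == 1);     /* new session starts clean */
    restore_session_state(snapshot);
    return ext_next_id();           /* resumed session continues at 3 */
}
```

The sketch only works because the module cooperates. Any global that is never registered is exactly the hidden state described above, which is why such a mechanism would have to be adopted by third-party code as well as by core.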