Re: Admission Control - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Admission Control
Msg-id: AANLkTingRgi8PoFRdM0eJVNJpXVosHNdqAK7yUnjM-b9@mail.gmail.com
In response to: Admission Control ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List: pgsql-hackers
On Fri, Jun 25, 2010 at 1:33 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Recent discussions involving the possible benefits of a connection
> pool for certain users have reminded me of a brief discussion at The
> Royal Oak last month, where I said I would post a reference to a
> concept which might alleviate the need for external connection
> pools. For those interested, check out section 2.4 of this
> document:
>
> Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007.
> Architecture of a Database System. Foundations and Trends(R) in
> Databases Vol. 1, No. 2 (2007) 141-259.
>
> http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf

I think a good admission control system for memory would be huge for us. There are innumerable threads on pgsql-performance where we tell people to set work_mem to a tiny value (like 4MB or 16MB), because any higher value risks driving the machine into swap in the event that they get an unusually large number of connections or those connections issue queries that require an unusual number of hashes or sorts. There are also innumerable postings from people complaining that external sorts are way slower than in-memory sorts, and of course a hash join frequently mops the floor with a nested loop with an inner index-scan.

A really trivial admission control system might let you set a system-wide limit on work_mem. As we build a plan, we could estimate the total amount of work_mem it will require by examining all the sort, hash, and hash aggregate nodes it contains. In shared memory, we keep a total of this value across all backends. Just before beginning to execute a plan that uses >0 work_mem, we bump this value up by the value for the current plan, unless that would make us exceed the system-wide limit. In that case, we sleep, and the next backend to finish executing (and decrease the value in shared memory) will wake us up to try again. (Ideally, we'd want to make maintenance_work_mem part of this accounting process also; otherwise, a system that was humming along nicely might suddenly start thrashing when a maintenance operation kicks off.) I suppose this would take a good deal of performance testing to see how well it actually works. A refinement might be to consider an inferior plan that uses less memory when the system is tight on memory, rather than waiting. But you'd have to be careful about that, because waiting might be better (it's worth waiting 15 s if it means the execution time will decrease by more than 15 s). A sketch of the reservation scheme follows below.

The idea of doling out queries to engine processes is an interesting one, but it seems very different from our current query execution model. I can't even begin to speculate as to whether there's anything interesting we could do in that area without reading some literature on the topic - got any pointers? But even if we can't or don't want to do that, we already know that limiting the number of backends and round-robining queries among them performs MUCH better than setting max_connections to a large value, and not just because of memory exhaustion. Our answer to that problem is "use a connection pooler", but that's not a very good solution even aside from the possible administrative nuisance, because it only solves half the problem.
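To make the accounting concrete, here is a standalone C model of the reservation scheme described above. This is a sketch only, not PostgreSQL code: ordinary threads stand in for backends, a mutex and condition variable stand in for the shared-memory counter and wakeup machinery, and the 64MB cap, function names, and per-plan estimates are all invented for illustration.

/*
 * Standalone model of a system-wide work_mem admission scheme.
 * NOT PostgreSQL code: threads stand in for backends; a mutex and
 * condition variable stand in for shared memory and wakeups.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define SYSTEM_WORK_MEM_KB (64 * 1024)   /* hypothetical system-wide cap, in kB */

static pthread_mutex_t admission_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  admission_cv = PTHREAD_COND_INITIALIZER;
static long total_reserved_kb = 0;       /* running total across all "backends" */

/* Sleep until the plan's estimated work_mem fits under the cap. */
static void
admission_acquire(long plan_kb)
{
    /* A real system would also have to handle plans whose estimate
     * exceeds the cap outright, or they would wait forever. */
    pthread_mutex_lock(&admission_lock);
    while (total_reserved_kb + plan_kb > SYSTEM_WORK_MEM_KB)
        pthread_cond_wait(&admission_cv, &admission_lock);
    total_reserved_kb += plan_kb;
    pthread_mutex_unlock(&admission_lock);
}

/* Give the reservation back and wake waiters so they can retry. */
static void
admission_release(long plan_kb)
{
    pthread_mutex_lock(&admission_lock);
    total_reserved_kb -= plan_kb;
    pthread_cond_broadcast(&admission_cv);
    pthread_mutex_unlock(&admission_lock);
}

/* One "backend": reserve, execute, release. */
static void *
run_query(void *arg)
{
    long plan_kb = (long) (intptr_t) arg; /* summed over sort/hash nodes */

    admission_acquire(plan_kb);
    printf("running plan that wants %ld kB\n", plan_kb);
    usleep(100 * 1000);                   /* pretend to execute the query */
    admission_release(plan_kb);
    return NULL;
}

int
main(void)
{
    pthread_t backends[8];
    int i;

    /* Eight concurrent 16MB plans against a 64MB cap. */
    for (i = 0; i < 8; i++)
        pthread_create(&backends[i], NULL, run_query,
                       (void *) (intptr_t) (16 * 1024));
    for (i = 0; i < 8; i++)
        pthread_join(backends[i], NULL);
    return 0;
}

With eight 16MB plans against a 64MB cap, at most four reservations are outstanding at once; everyone else sleeps in admission_acquire() until a release wakes them to retry, which is the wait-then-wake behavior described above.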
In the general case, the question is not whether we can currently afford to allocate 0 or 1 backends to a given query, but whether we can afford to allocate 0, 1, or >1; furthermore, if we can afford to allocate >1 backend, we'd ideally like to reuse an existing backend rather than starting a new one. I don't think anyone's going to be too happy with a parallel query implementation that depends on an external connection pooler.

One of the difficulties in reusing an existing backend for a new query, or in maintaining a pool of backends that could be used as workers for parallel queries, is that our current system does not require us to have, nor do we have, a way of resetting a backend to a completely pristine state. DISCARD ALL is probably pretty close, because AIUI that's what connection poolers are using today, and if it didn't work reasonably well, we'd have gotten complaints. But DISCARD ALL will not let you rebind to a different database, for example, which would be a pretty useful thing to do in a pooling environment, so that you needn't maintain separate pools for each database, and it doesn't let you restart the authentication cycle (with no backsies) either. (A minimal sketch of the reset step poolers use today appears below.)

Of course, addressing these problems wouldn't by itself give us a built-in connection pooler or parallel query execution, and there's some danger of destabilizing the code, but possibly it would be a good way to get off the ground. Not sure, though. Thoughts?
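For reference, the reset that today's poolers rely on boils down to issuing DISCARD ALL on the server connection between clients. Here's a minimal libpq sketch of that step; the function name and connection string are hypothetical, and it shows only the mechanism, not a real pooler:

/*
 * Minimal sketch of the reset a pooler performs before handing a
 * server connection to the next client, using only documented
 * libpq calls.  Illustrative only, not taken from any real pooler.
 */
#include <stdio.h>
#include <libpq-fe.h>

/* Reset session state on a pooled connection; returns 0 on success. */
static int
pool_reset_connection(PGconn *conn)
{
    PGresult *res = PQexec(conn, "DISCARD ALL");
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    PQclear(res);
    return ok ? 0 : -1;
}

int
main(void)
{
    /* Hypothetical connection string. */
    PGconn *conn = PQconnectdb("dbname=postgres");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* ... a client session runs its queries here ... */

    if (pool_reset_connection(conn) != 0)
        fprintf(stderr, "reset failed: %s", PQerrorMessage(conn));

    PQfinish(conn);
    return 0;
}

The bound database and the completed authentication exchange are precisely the things this call cannot undo, which is the gap described above.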
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company