Re: How to keep a table in memory? - Mailing list pgsql-hackers
From | Kevin Grittner
---|---
Subject | Re: How to keep a table in memory?
Date |
Msg-id | 4739C18E.EE98.0025.0@wicourts.gov
In response to | Re: How to keep a table in memory? (Andrew Sullivan <ajs@crankycanuck.ca>)
Responses | Re: How to keep a table in memory?
List | pgsql-hackers
>>> On Tue, Nov 13, 2007 at 2:05 PM, in message <20071113200508.GX11563@crankycanuck.ca>, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> On Tue, Nov 13, 2007 at 02:36:14PM -0500, Greg Smith wrote:
>
>> I see this as similar to the old optimizer hint argument, where there
>> certainly exist some edge cases where people know something the optimizer
>> doesn't which changes the optimal behavior.
>
> . . . the abuse of such hints in applications I have seen is so rampant as to
> make me doubt the utility of adding them anyway. It's true that by adding
> hints, you give a facility to a good, competent designer who has a really
> peculiar case that no general-purpose system is likely to solve well. In
> practice, however, it also seems to mean that every slack-jawed fool with
> access to the manual thinks that he or she is going to "fix" the "broken"
> query plan by forcing index scans where they're useless (has a week yet gone
> by where someone doesn't post to -performance with that problem?). So I'm
> divided on whether actually providing the facility is a good idea, even
> though I can think of a handful of cases where I doubt even the smartest
> planner will get it right. (By analogy, pinning in memory, and I'm
> similarly divided.)

I find it hard to disagree with any of the posts in this thread.

Under our old commercial database product, we had performance problems that we addressed with a "named caches" feature: you could declare a named cache of a particular size, tweak some of its characteristics, and then bind objects to it. We came up with several theories of how we could use named caches to improve on the default LRU logic, and tested them carefully. About half of the ideas made things worse; about half made things better. We kept only the ones that made things better under our usage patterns. Part of this involved using a cache small enough to fully contain all of the heavily referenced tables we bound to it. (A rough sketch of what such a configuration looks like follows this message.)

The proof of the benefit was that these settings occasionally got lost through errors in machine builds or upgrades. Users would start calling immediately to complain about the performance, and they were happy again once we restored the named cache configurations.

The lack of such tuning knobs made me more than a little nervous as we moved toward PostgreSQL, and I'm not sure we couldn't still benefit from them if they were available; but PostgreSQL performs so much better overall that any such gain would be minimal compared to the improvement we saw simply by switching.

That leaves me with sympathy for the concern in the original post, but feeling that I should join the crowd suggesting that it's best to assume such a tuning feature probably isn't needed: proceed without it and post any actual performance problems for advice. If you can kludge heavier caching for the objects in question and show an improvement in the metric that matters for your purposes, perhaps you can convince people it's a feature worth having -- but expect that people will want to see details and to explore alternative solutions.

-Kevin
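For readers who have not used such a feature, here is a rough sketch of the kind of named-cache setup described above. The original message does not name the commercial product, so the syntax below is modeled on Sybase ASE's sp_cacheconfig and sp_bindcache stored procedures; the cache, database, and table names are made up for illustration.

    -- Carve out a 200 MB named cache, separate from the default data cache.
    exec sp_cacheconfig 'hot_cache', '200M'
    go

    -- Bind a heavily referenced table to that cache.  Its pages then compete
    -- only with other objects bound to the same cache rather than with the
    -- whole default LRU pool, so a cache sized to hold the table keeps it
    -- memory-resident.
    exec sp_bindcache 'hot_cache', 'ourdb', 'control_table'
    go

On the PostgreSQL side, the kind of "kludge" suggested at the end of the message might be as simple as touching the table's pages often enough to keep them in shared buffers and the OS cache, for example by running a full scan from a periodic cron job (again, purely illustrative):

    -- Scheduled every few minutes to keep control_table's pages warm.
    SELECT count(*) FROM control_table;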