Re: Database Caching - Mailing list pgsql-hackers
From | Justin Clift |
---|---|
Subject | Re: Database Caching |
Date | |
Msg-id | 3C7FBEC6.7F540E09@postgresql.org |
In response to | Re: Database Caching (Stephan Szabo <sszabo@megazone23.bigpanda.com>) |
Responses | Re: Database Caching |
List | pgsql-hackers |
Hi guys,

Stephan Szabo wrote:
<snip>
> The question is, when it's invalidated, how does it become valid again?
> I don't see that there's a way to do it only by query string that doesn't
> result in meaning that the cache cannot cache a query again until any
> transactions that can see the prior state are finished since otherwise
> you'd be providing the incorrect results to that transaction. But I
> haven't spent much time thinking about it either.

It seems like a good idea to me, but only if it's optional. It could get in the way for systems that don't need it, but would be really beneficial for some types of systems which are read-only or mostly read-only (with consistent queries) in nature.

e.g. Let's take a web page where clients can look up which of 10,000 records are either .biz, .org, .info, or .com.

So, we have a database query of simply:

SELECT name FROM sometable WHERE tld = 'biz';

And let's say 2,000 records come back, which are cached.

Then the next query comes in, which is:

SELECT name FROM sometable WHERE tld = 'info';

And let's say 3,000 records come back, which are also cached.

Now, both of these queries are FULLY cached. So, if either query happens again, it's a straight memory read and dump, no disk activity involved, etc. (very fast in comparison).

Now, let's say a transaction which involves a change of "sometable" COMMITs. This should invalidate these results in the cache, as the cached view of the table could now be incorrect (there might now be fewer, more, or different results for .info or .biz).

The next queries will be cached too, and will stay cached until the next transaction involving a change to "sometable" COMMITs.

In this type of database access, this looks like a win.

But caching results in this manner could be a memory killer for those applications which aren't so predictable in their queries, or are not so read-only.
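To make the scheme above concrete, here is a minimal Python sketch. It is not PostgreSQL code and not a claim about how the backend would implement this. It assumes results are keyed by the exact query string and that each cached entry records which tables it read. The names `QueryResultCache`, `execute`, `on_commit`, and `fake_executor` are all hypothetical, invented for this illustration, and the row data is made up. It deliberately ignores Stephan's point about transactions that can still see the prior state.

```python
class QueryResultCache:
    """Toy result cache: exact query string -> rows, dropped on COMMIT."""

    def __init__(self, run_query):
        self._run_query = run_query   # callable that really executes a query
        self._results = {}            # query string -> (tables read, rows)
        self.hits = 0
        self.misses = 0

    def execute(self, query, tables):
        """Return rows for `query`, which reads from `tables`."""
        entry = self._results.get(query)
        if entry is not None:
            self.hits += 1            # straight memory read, no disk activity
            return entry[1]
        self.misses += 1
        rows = self._run_query(query)
        self._results[query] = (frozenset(tables), rows)
        return rows

    def on_commit(self, changed_tables):
        """Drop every cached result that read a table changed by the commit."""
        changed = set(changed_tables)
        stale = [q for q, (tables, _) in self._results.items()
                 if tables & changed]
        for q in stale:
            del self._results[q]


# Hypothetical stand-in for the real executor, with made-up rows.
FAKE_DATA = {
    "SELECT name FROM sometable WHERE tld = 'biz';": ["a.biz", "b.biz"],
    "SELECT name FROM sometable WHERE tld = 'info';": ["c.info"],
}


def fake_executor(query):
    return list(FAKE_DATA[query])


cache = QueryResultCache(fake_executor)
biz = "SELECT name FROM sometable WHERE tld = 'biz';"

cache.execute(biz, ["sometable"])   # miss: goes to the executor, then cached
cache.execute(biz, ["sometable"])   # hit: served from memory
cache.on_commit(["sometable"])      # a writer COMMITs, so the entry is dropped
cache.execute(biz, ["sometable"])   # miss again, then re-cached
```

Making the feature optional would amount to bypassing `execute` and calling the executor directly, so systems that don't need the cache pay nothing for it. Note the sketch has no size limit, which is exactly the memory concern for unpredictable query mixes.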
That's why I feel it should be optional, but I also feel it should be added due to what looks like massive wins without data integrity or reliability issues.

Hope this helps.

:-)

Regards and best wishes,

Justin Clift

> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

--
"My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there."
- Indira Gandhi