Re: Again, sorry, caching. - Mailing list pgsql-hackers
From: mlw
Subject: Re: Again, sorry, caching.
Msg-id: 3C93585C.9E2E62B1@mohawksoft.com
In response to: Again, sorry, caching. (mlw <markw@mohawksoft.com>)
Responses: Re: Again, sorry, caching.
List: pgsql-hackers
Triggers and asynchronous notification are not substitutes for real, hard, ACID-compliant caching. The way you suggest implies only one access model. Take the notion of a library: it has both web and application access, and both should be able to use the cache. Also, your suggestion does not address the sub-select case, which I think is a much bigger performance win, and more efficient than MySQL's cache.

This whole discussion could be moot, and this could be developed as an extension, if there were a function API that could return sets of whole rows.

Greg Copeland wrote:
>
> I previously replied to you, vaguely describing a way you could implement
> this by using a combination of client-side caching and database tables
> and triggers to allow you to determine if your cache is still valid.
> Someone came right behind me, Tom maybe??, and indicated that the
> proper/ideal way to do this would be to use postgres' asynchronous
> database notification mechanisms (listen/notify, I believe, were the
> semantics) to alert your application that your cache has become
> invalid. Basically, a couple of triggers and the use of the listen/notify
> model, and you should be all set.
>
> Done properly, a client-side cache which is asynchronously notified by
> the database when its contents become invalid should be faster than
> relying on MySQL's database caching scheme. Basically, a strong client-side
> cache is going to prevent your database from even having to return
> a cached result set, while a database-side cache is always going to
> return a result set. Of course, one of the extra cool things you can do
> is cache a gzip'd copy of the data contents, which would further act
> as an optimization, preventing the client or web server (in case they are
> different) from having to recompress every result set.
>
> In the long run, again, if properly done, you should be able to beat
> MySQL's implementation without too much extra effort. Why?
> Because a
> client-side cache can be much smarter in the way it uses its
> cached contents, much in the same way an application is able to better
> cache its data than the file system is able to do. This is why a
> client-side cache should be preferred over a database result-set
> cache.
>
> Greg
>
> References:
> http://www.postgresql.org/idocs/index.php?sql-notify.html
> http://www.postgresql.org/idocs/index.php?sql-listen.html
>
> On Sat, 2002-03-16 at 08:01, mlw wrote:
> > I traded a couple of emails with a guy using one of my open source projects. To
> > make a long story short, he is going to the new version of MySQL for his
> > website because of the new caching feature. He is convinced that it will speed
> > up his web site, and he is probably right.
> >
> > On a web site, a few specific queries get executed, unchanged, repeatedly.
> > Think about an ecommerce site: most of the time it is just a handful of basic
> > queries. These basic queries are usually against pretty large product tables. A
> > caching mechanism would make these queries pretty lightweight.
> >
> > The arguments against caching:
> >
> > "It is an application issue"
> > This is completely wrong. Caching cannot be done against a database without
> > knowledge of the database, i.e. when the data changes.
> >
> > "If it is mostly static data, why not just make it a static page?"
> > Because a static page is a maintenance nightmare. One uses a database in a web
> > site to allow content to be changed and upgraded dynamically and with a minimum
> > of work.
> >
> > "It isn't very useful"
> > I disagree completely. A cache of most frequently used queries, or specific
> > ones, could make for REALLY good performance in some very specific, but very
> > common, applications. Any system that has a hierarchical "drill down" interface
> > to a data set (ecommerce, libraries, document management systems, etc.) will
> > greatly benefit from a query cache.
> >
> > I was thinking that it could be implemented as a keyword or comment in a query,
> > such as:
> >
> > select * from table where column = 'foo' cacheable
> > or
> > select * from table where column = 'bar' /* cacheable */
> >
> > Either way, it would speed up a lot of common application types. It would even
> > be very cool if you could just cache the results of subqueries, such as:
> >
> > select * from (select * from table where col1 = 'foo' cacheable) as subset
> > where subset.col2 = 'bar' ;
> >
> > Which would mean that the subquery gets cached, but the greater select need not
> > be. The cache could be like a global temp table. Perhaps the user could even
> > name the cache entry:
> >
> > select * from table where column = 'foo' cache on foo
> >
> > Where one could also do:
> >
> > select * from cache_foo
> >
> > Using a keyword is probably a better idea; it can be picked up by the parser
> > and instruct PostgreSQL to use the cache, otherwise there will be no additional
> > overhead.
> >
> > Having caching within PostgreSQL will be good for data integrity. Application
> > caches can't tell when an update/delete/insert happens; they often have to use
> > a time-out mechanism.
> >
> > OK, let me have it, tell me how terrible an idea this is. Tell me how wrong I
> > am.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly
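The invalidation-driven client-side cache Greg describes (and the staleness problem mlw raises with time-out caches) can be sketched in a few lines. This is an illustrative sketch, not PostgreSQL's API: the LISTEN/NOTIFY delivery is stubbed out as a plain method call, and all names here (`QueryCache`, `products_changed`, `run_query`) are hypothetical. It also shows the gzip'd-payload optimization mentioned above by storing results compressed.

```python
import gzip
import json

class QueryCache:
    """Client-side result cache invalidated by NOTIFY events, in the spirit
    of the listen/notify scheme discussed in this thread. Results are stored
    gzip'd, so a web tier can serve the bytes without recompressing."""

    def __init__(self):
        self._store = {}   # sql text -> gzip'd JSON-encoded rows
        self._watch = {}   # notify channel -> set of sql texts it invalidates

    def get(self, sql, channel, fetch):
        """Return gzip'd rows; run `fetch` (the real query) only on a miss."""
        if sql not in self._store:
            rows = fetch()
            self._store[sql] = gzip.compress(json.dumps(rows).encode())
            self._watch.setdefault(channel, set()).add(sql)
        return self._store[sql]

    def on_notify(self, channel):
        """Hook for the NOTIFY listener: a trigger fired, drop stale entries."""
        for sql in self._watch.pop(channel, set()):
            self._store.pop(sql, None)

# Hypothetical usage: imagine a trigger on `products` that executes
# NOTIFY products_changed after INSERT/UPDATE/DELETE, with the client
# LISTENing on that channel and calling on_notify() when it arrives.
cache = QueryCache()
trips = []

def run_query():
    trips.append(1)  # counts real database round trips
    return [["widget", 9.99]]

a = cache.get("SELECT * FROM products", "products_changed", run_query)
b = cache.get("SELECT * FROM products", "products_changed", run_query)  # hit
cache.on_notify("products_changed")  # an UPDATE fired the trigger
c = cache.get("SELECT * FROM products", "products_changed", run_query)  # miss
```

Unlike the time-out mechanism mlw criticizes, nothing here goes stale silently: an entry lives until the database itself reports a change on its channel, so two identical reads cost one round trip while a write forces exactly one refetch.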