Re: Again, sorry, caching. - Mailing list pgsql-hackers
From: mlw
Subject: Re: Again, sorry, caching.
Msg-id: 3C93585C.9E2E62B1@mohawksoft.com
In response to: Again, sorry, caching. (mlw <markw@mohawksoft.com>)
Responses: Re: Again, sorry, caching.
List: pgsql-hackers
Triggers and asynchronous notification are not substitutes for real, hard, ACID-compliant caching. The way you suggest implies only one access model. Take the notion of a library: it has both web and application access, and both should be able to use the cache. Also, your suggestion does not address the sub-select case, which I think is a much bigger performance win, and more efficient than MySQL's cache.

This whole discussion could be moot, and this could be developed as an extension, if there were a function API that could return sets of whole rows.

Greg Copeland wrote:
>
> I previously replied to you, vaguely describing a way you could implement
> this by using a combination of client-side caching and database tables
> and triggers to allow you to determine if your cache is still valid.
> Someone came right behind me, Tom maybe??, and indicated that the
> proper/ideal way to do this would be to use postgres' asynchronous
> database notification mechanisms (listen/notify, I believe, were the
> semantics) to alert your application that your cache has become
> invalid. Basically, a couple of triggers and the use of the listen/notify
> model, and you should be all set.
>
> Done properly, a client-side cache which is asynchronously notified by
> the database when its contents become invalid should be faster than
> relying on MySQL's database caching scheme. Basically, a strong client-side
> cache is going to prevent your database from even having to return
> a cached result set, while a database-side cache is always going to
> return a result set. Of course, one of the extra cool things you can do
> is cache a gzip'd copy of the data contents, which would further act
> as an optimization, preventing the client or web server (in case they are
> different) from having to recompress every result set.
>
> In the long run, again, if properly done, you should be able to beat
> MySQL's implementation without too much extra effort. Why?
> Because a
> client-side cache can be much smarter in the way it uses its
> cached contents, much in the same way an application is able to better
> cache its data than the file system is able to do. This is why a
> client-side cache should be preferred over a database result-set
> cache.
>
> Greg
>
> References:
> http://www.postgresql.org/idocs/index.php?sql-notify.html
> http://www.postgresql.org/idocs/index.php?sql-listen.html
>
> On Sat, 2002-03-16 at 08:01, mlw wrote:
> > I traded a couple of emails with a guy using one of my open source projects. To
> > make a long story short, he is going to the new version of MySQL for his
> > website because of the new caching feature. He is convinced that it will speed
> > up his web site, and he is probably right.
> >
> > On a web site, a few specific queries get executed, unchanged, repeatedly.
> > Think about an ecommerce site: most of the time it is just a handful of basic
> > queries. These basic queries are usually against pretty large product tables. A
> > caching mechanism would make these queries pretty lightweight.
> >
> > The arguments against caching:
> >
> > "It is an application issue"
> > This is completely wrong. Caching cannot be done against a database without
> > knowledge of the database, i.e. when the data changes.
> >
> > "If it is mostly static data, why not just make it a static page?"
> > Because a static page is a maintenance nightmare. One uses a database in a web
> > site to allow content to be changed and upgraded dynamically and with a minimum
> > of work.
> >
> > "It isn't very useful"
> > I disagree completely. A cache of most frequently used queries, or specific
> > ones, could make for REALLY good performance in some very specific, but very
> > common, applications. Any system that has a hierarchical "drill down" interface
> > to a data set (ecommerce, libraries, document management systems, etc.) will
> > greatly benefit from a query cache.
> >
> > I was thinking that it could be implemented as a keyword or comment in a query,
> > such as:
> >
> > select * from table where column = 'foo' cacheable
> > or
> > select * from table where column = 'bar' /* cacheable */
> >
> > Either way, it would speed up a lot of common application types. It would even
> > be very cool if you could just cache the results of subqueries, such as:
> >
> > select * from (select * from table where col1 = 'foo' cacheable) as subset
> > where subset.col2 = 'bar' ;
> >
> > Which would mean that the subquery gets cached, but the greater select need not
> > be. The cache could be like a global temp table. Perhaps the user could even
> > name the cache entry:
> >
> > select * from table where column = 'foo' cache on foo
> >
> > Where one could also do:
> >
> > select * from cache_foo
> >
> > Using a keyword is probably a better idea; it can be picked up by the parser
> > and instruct PostgreSQL to use the cache, otherwise there will be no additional
> > overhead.
> >
> > Having caching within PostgreSQL will be good for data integrity. Application
> > caches can't tell when an update/delete/insert happens; they often have to use
> > a time-out mechanism.
> >
> > OK, let me have it, tell me how terrible an idea this is. Tell me how wrong I
> > am.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly
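The invalidation-driven client-side cache Greg describes (and the staleness problem mlw raises with time-out caches) can be sketched in a few lines. This is an illustrative sketch, not PostgreSQL's API: the LISTEN/NOTIFY delivery is stubbed out as a plain method call, and all names here (`QueryCache`, `products_changed`, `run_query`) are hypothetical. It also shows the gzip'd-payload optimization mentioned above by storing results compressed.

```python
import gzip
import json

class QueryCache:
    """Client-side result cache invalidated by NOTIFY events, in the spirit
    of the listen/notify scheme discussed in this thread. Results are stored
    gzip'd, so a web tier can serve the bytes without recompressing."""

    def __init__(self):
        self._store = {}   # sql text -> gzip'd JSON-encoded rows
        self._watch = {}   # notify channel -> set of sql texts it invalidates

    def get(self, sql, channel, fetch):
        """Return gzip'd rows; run `fetch` (the real query) only on a miss."""
        if sql not in self._store:
            rows = fetch()
            self._store[sql] = gzip.compress(json.dumps(rows).encode())
            self._watch.setdefault(channel, set()).add(sql)
        return self._store[sql]

    def on_notify(self, channel):
        """Hook for the NOTIFY listener: a trigger fired, drop stale entries."""
        for sql in self._watch.pop(channel, set()):
            self._store.pop(sql, None)

# Hypothetical usage: imagine a trigger on `products` that executes
# NOTIFY products_changed after INSERT/UPDATE/DELETE, with the client
# LISTENing on that channel and calling on_notify() when it arrives.
cache = QueryCache()
trips = []

def run_query():
    trips.append(1)  # counts real database round trips
    return [["widget", 9.99]]

a = cache.get("SELECT * FROM products", "products_changed", run_query)
b = cache.get("SELECT * FROM products", "products_changed", run_query)  # hit
cache.on_notify("products_changed")  # an UPDATE fired the trigger
c = cache.get("SELECT * FROM products", "products_changed", run_query)  # miss
```

Unlike the time-out mechanism mlw criticizes, nothing here goes stale silently: an entry lives until the database itself reports a change on its channel, so two identical reads cost one round trip while a write forces exactly one refetch.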