Re: Again, sorry, caching. - Mailing list pgsql-hackers
From: Ross J. Reedstrom
Subject: Re: Again, sorry, caching.
Msg-id: 20020319181252.GI31839@rice.edu
In response to: Re: Again, sorry, caching. (Neil Conway <nconway@klamath.dyndns.org>)
Responses: Re: Again, sorry, caching.
List: pgsql-hackers
On Mon, Mar 18, 2002 at 09:35:51PM -0500, Neil Conway wrote:
> > "It is an application issue"
> > This is completely wrong. Caching can not be done against a database
> > without knowledge of the database, i.e. when the data changes.
>
> But can't this be achieved by using a LISTEN/NOTIFY model, with
> user-created rules to NOTIFY the appropriate listener when a table
> changes? With a good notification scheme like this, you don't need to
> continually poll the DB for changes. You don't need to teach your cache
> a lot of things about the database, since most of that knowledge is
> encapsulated inside the rules and supporting tables.
>
> My impression (I could be wrong) is that LISTEN/NOTIFY doesn't get the
> press that it deserves. If this model isn't widely used because of some
> deficiencies in the LISTEN/NOTIFY implementation, IMHO our time would be
> better spent fixing those problems than implementing the proposed
> caching scheme.
>
> If we're looking to provide a "quick and easy" caching scheme for users
> attracted to MySQL's query cache, why not provide this functionality
> through another application? I'm thinking about a generic "caching
> layer" that would sit in between Postgres and the database client. It
> could speak the FE/BE protocol as necessary; it would use LISTEN/NOTIFY
> to allow it to efficiently be aware of database changes; it would create
> the necessary rules for the user, providing a simple interface to
> enabling query caching for a table or a set of tables.
>
> What does everyone think?

Neil, this sounds like exactly the approach to follow up on: the one part
of caching that _is_ the backend's domain is knowing about invalidation
events. And LISTEN/NOTIFY has _exactly_ the right behavior for that - you
don't get out-of-transaction NOTIFYs, for example. As it stands, though,
the application developer has to have intimate knowledge of the schema to
set up the correct NOTIFY triggers for any given query.
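Neil's rule-based scheme might look something like the following minimal sketch. The table and notification name here are invented for illustration; nothing in the thread specifies them:

```sql
-- Hypothetical table; "baz_changed" is a made-up notification name.
CREATE TABLE baz (foo integer, bar text);

-- User-created rules fire a NOTIFY whenever the table is modified.
CREATE RULE baz_notify_ins AS ON INSERT TO baz DO NOTIFY baz_changed;
CREATE RULE baz_notify_upd AS ON UPDATE TO baz DO NOTIFY baz_changed;
CREATE RULE baz_notify_del AS ON DELETE TO baz DO NOTIFY baz_changed;

-- The caching layer registers interest once per connection...
LISTEN baz_changed;

-- ...and any committed INSERT/UPDATE/DELETE on baz now wakes it up, at
-- which point it discards its cached results for queries touching baz.
```

Note that because NOTIFY is delivered only at commit, the cache never sees invalidations from transactions that roll back - which is exactly the out-of-transaction behavior praised above.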
This works against developing a generic middleware solution, since one
would have to parse the SQL to guess at the affected tables. How about an
extension that autocreates INSERT/UPDATE/DELETE triggers that send
NOTIFYs, based on all tables accessed by a given SELECT?

As an example, I could see extending the Zope PsycoPG database adaptor
(which already tries some simple timeout-based caching) to tack on
something like:

    SELECT foo,bar FROM baz CACHENOTIFY <notifyname>

whenever it creates a cache for a given query, then setting up the
correct LISTEN to invalidate that cache. Alternatively, the LISTEN could
be automatic. The name might be autogenerated, as well, to avoid
collision problems. Or perhaps _allow_ collisions to extend the
notification set? (I could see setting _all_ the queries that generate
one web page to NOTIFY together, since the entire page needs to be
regenerated on cache invalidation.)

Then, the existing interface to SQL queries would allow the app developer
to set logical caching policies for each query, independently. The
backend does only the part that it alone can do: determine all the tables
touched by a given query. The middleware and app developer are then free
to cache at the appropriate level (SQL result set, fully formatted web
page, etc.)

This clearly is only useful in a connection-pooling environment, so the
long-lived backends are around to receive the NOTIFYs. Hmm, no, I think
it would be possible with this to have a separate process do the LISTEN
and cache invalidation, while a pool of other backends are used for
general access, no? Seems like a win all around.

Anyone else have comments? How insane would the auto trigger creation
get? It seems to me that this would be similar in spirit to the
referential integrity work, but more dynamic, since simple SELECTs would
be creating backend triggers. There's potential for DOS attacks, for
example - but not much worse, I suppose, than firing off big nasty
cartesian cross-product queries.

Ross
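To make the collisions-as-a-feature idea concrete, here is one guess at what the proposed CACHENOTIFY syntax might expand to behind the scenes. All table, rule, and notification names below are invented; the thread proposes the syntax but not the expansion:

```sql
-- Suppose one web page is built from queries over two tables, and the
-- middleware issued:
--     SELECT foo, bar FROM baz  CACHENOTIFY page_cache
--     SELECT qux      FROM quux CACHENOTIFY page_cache
-- The backend could auto-create rules on every table each query touches,
-- all sharing the one notification name, so a change to either table
-- invalidates the whole cached page:
CREATE RULE page_cache_baz_ins  AS ON INSERT TO baz  DO NOTIFY page_cache;
CREATE RULE page_cache_baz_upd  AS ON UPDATE TO baz  DO NOTIFY page_cache;
CREATE RULE page_cache_baz_del  AS ON DELETE TO baz  DO NOTIFY page_cache;
CREATE RULE page_cache_quux_ins AS ON INSERT TO quux DO NOTIFY page_cache;
CREATE RULE page_cache_quux_upd AS ON UPDATE TO quux DO NOTIFY page_cache;
CREATE RULE page_cache_quux_del AS ON DELETE TO quux DO NOTIFY page_cache;

-- A dedicated invalidation process (separate from the pooled backends
-- serving queries) just runs:
LISTEN page_cache;
```

The backend's unique contribution here is only the table list; whether the middleware then drops an SQL result set or a fully rendered page on notification is entirely its own policy.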