Re: Postgresql Caching - Mailing list pgsql-hackers
From | Anon Mous |
---|---|
Subject | Re: Postgresql Caching |
Date | |
Msg-id | 20061017235158.11524.qmail@web90305.mail.mud.yahoo.com Whole thread Raw |
In response to | Postgresql Caching (Anon Mous <soundami@yahoo.com>) |
Responses |
Re: Postgresql Caching
|
List | pgsql-hackers |
Hi<br /><br /> I've loved reading all of your thoughts and comments.<br /><br /> Yet, I'm left with the question:<br /><br/> Can we can brainstorm a caching solution that is workable...<br /><br /> I've seen some posts talking about someof the challenges.<br /><br /> 1.) Only good for static data<br /><br /> As it was proposed that is largely true. Thisdoesn't mean, however that the idea as stated isn't worth persuing because I think you'll find most applications havea great deal of "static enough" data to benefit greatly from this type of caching.<br /><br /> However, I think somesimple changes to the idea may make it useful for busy tables...<br /><br /> These changes, would probably require directcommunication between the caching controller and the the postmaster.<br /><br /> a.) Rather than table locking, trackchanges at the row level.<br /> b.) Rather than requiring a complete reseeding of a table after an update, just invalidate,or repopulate the affected rows. <br /> c.) Rather than destroying popular query results, try to update them ifpossible. For example, it's easy to remove one entry from the cache if that row was just deleted. (It's probably cheaperto rerun the query on just the few changed rows than to rerun the whole query.)<br /> d.) Any other ideas?<br /><br/> 2.) If any caching were possible, we'd already be doing it.<br /><br /> I don't think this statement will stand thetest of time! Agreed, caching quickly becomes a head hurting complex topic, but the benefits are well worth the effort!<br/><br /> 3.) Consistency and memcached, *are* mutually exclusive. Memcached provides no mechanisms for consistency.<br /><br /> "You can never have perfect consistency across different systems (memcache / <br /> postgresql)and especially not when their visibility rules differ. What is <br /> visible to something via memcache is alwayslatest uncommitted. What is <br /> visible in PostgreSQL is something less than that. Consistency is not <br /> possible.Correct caching is therefore also not possible unless you define <br /> correct as 'latest', and even then, youhave problems if memcache expires <br /> the record, before the real record has been commited into PostgreSQL."<br /><br/> I completely agree. I'm not talking about synchronizing memcached data to be consistent, I'm saying lets create acaching layer that works something like memcache and preserves consistency!<br /><br /> and, very related to this, <br />4.) Memcached Caching is exactly opposite to Postgres consistency.<br /><br /> Specifically:<br /> Memcache is serialized<br/> Memcache can loose data at any time<br /> Memcache has only 2 fields<br /> Memcache has no synchronization<br/><br /> Postgres needs consistency.<br /><br /> Memcache doesn't do any synchronization, and that meansconsistency is impossible. However, a special version of memcache that is embedded into the postgresql system or apithat does talk with the postmaster could be able to provide guaranteed consistency?<br /><br /> 5.) This idea won't saveany time with SQL parsing.<br /><br /> I believe it can... Because, as memcache has illustrated, you can avoid any sqlparsing by using the sql and user's table permissions (if different users are enabled) as the key to the cached data.<br /><br /> 6.) Postgresql is consistency. If an application needs speed let the application figure out how to cachethe data<br /><br /> I appreciate that Postgres is all about data consistency. Actually, that's why I'm here and noton the Mysql board... However, I believe that we can provide caching without losing consistency, and developers willlove the extra speed.<br /><br /> If we do implement the caching once, everyone will be able to use it without complicatingtheir end application!!! (Read: It will help the world and make PostgreSQL very popular!)<br /><br /> ---<br/><br /> So, would it work to create a caching pre-processor for Postgresql that would work serially on every request,and pass all uncached queries to the database?<br /><br /> - If it's a cache hit, and the data is currently availableand active, pass the data back. <br /><br /> - If it's a miss, pass the query along to the database, and populatethe cache with the results. <br /><br /> - If the query changes data, invalidate the cached queries that touch anytable rows that could be affected. After the update, re-enable the unaffected table rows, and repopulate the cache withthe updates. <br /><br /> - Cached queries using an affected table would normally be deleted except in special simplecases that could be updated. <br /><br /> A related interesting thought... It might help if the cache stored the datain separate table rows rather than in whatever format the query requested the way memcached does. <br /><br /> - Eachcached table row could be stored as a separate entity. <br /> - Complicated joins, rather than caching all of the datain whatever organization the user specified, would instead store a matrix of pointers to the exact table fields in mostcases.<br /><br /> Will it work? Am I missing anything?<br /><br /> Thanks<br /><br /> Daniel<br /><p><hr size="1" />Yahoo!Messenger with Voice. <a href="http://us.rd.yahoo.com/mail_us/taglines/postman1/*http://us.rd.yahoo.com/evt=39663/*http://voice.yahoo.com">Make PC-to-PhoneCalls</a> to the US (and 30+ countries) for 2¢/min or less.
pgsql-hackers by date: