Database Caching - Mailing list pgsql-hackers
From: Greg Sabino Mullane
Subject: Database Caching
Date:
Msg-id: E16gYpD-0007KY-00@mclean.mail.mindspring.net
List: pgsql-hackers
I had to make a relatively long drive yesterday, so I had lots of free time to do some thinking...and my thoughts were turning to caching and databases. The following is what I came up with: forgive me if it seems to be just an obvious ramble...

Why does a database need caching? Normally, when one thinks of a database (or, to be precise, an RDBMS) the ACID acronym comes up. This is concerned with having a stable database that can reliably be used by many users at the same time. Caching a query is unintuitive because it involves sharing information from transactions that may be separated by a great amount of time and/or by different users. However, from the standpoint of the database server, caching increases efficiency enormously. If 800 users all make the same query, then caching can help the database server backend (hereafter simply "database") save part or all of the work it performs, so it doesn't have to repeat the entire sequence of steps 800 times.

What is caching? Caching basically means that we want to save frequently used information in an easy-to-get-to place; usually, this means storing it in memory. Caching has three main goals: reducing disk access, reducing computation (i.e. CPU utilization), and reducing the time it takes for a user to see a result. It does all this at the expense of RAM, and the tradeoff is almost always worth it.

In a database, there are three basic types of caching: query results, query plans, and relations.

The first, query result caching, simply means that we store in memory the exact output of a SELECT query for the next time that somebody performs that exact same SELECT query. Thus, if 800 people do a "SELECT * FROM foo", the database runs it for the first person, saves the results, and simply reads the cache for the next 799 requests. This saves the database from doing any disk access, practically removes CPU usage, and speeds up the query.

The second, query plan caching, involves saving the results of the optimizer, which is responsible for figuring out exactly "how" the database is going to fetch the requested data. This type of caching usually involves a "prepared" query, which has almost all of the information needed to run the query with the exception of one or more "placeholders" (spots that are populated with values at a later time). It could also cover non-prepared statements as well. Thus, if someone prepares the query "SELECT flavor FROM foo WHERE size=?" and then executes it by sending in 300 different values for "size", the prepared statement is run through the optimizer, the resulting path is stored in the query plan cache, and the stored path is used for the 300 execute requests. Because the path is already known, the optimizer does not need to be called, which saves the database CPU and time.

The third, relation caching, simply involves putting an entire relation (usually a table or index) into memory so that it can be read quickly. This saves disk access, which basically means that it saves time. (This type of caching can also occur at the OS level, which caches files, but that will not be discussed here.)

Those are the three basic types of caching; ways of implementing each are discussed below. Each one should complement the others, and a query may be able to use one, two, or all three of the caches.

I. Query result caching

A query result cache is only used for SELECT queries that involve a relation (i.e. not for "SELECT version()"). Each cache entry has the following fields: the query itself, the actual results, a status, an access time, an access count, and a list of all included columns. (The column list actually carries as much information as needed to uniquely identify each column, i.e. schema, database, table, and column.) The status is merely an indicator of whether or not this cached query is valid. It may not be, because it may be invalidated for a user within a transaction but still be of use to others.

When a select query is processed, it is first parsed into a basic common form, stripping whitespace, standardizing case, etc., in order to facilitate an accurate match. Note that no other pre-processing is really required, since we are only interested in exact matches that produce the exact same output. An advanced version of this would ideally be able to use the cached output of "SELECT bar,baz FROM foo" when it receives the query "SELECT baz,bar FROM foo", but that will require some advanced parsing. Possible, but probably not something to attempt in the first iteration of a query caching function. :)

If there *is* a match (via a simple strcmp at first), and the status is marked as "valid", then the database simply uses the stored output, updates the access time and count, and exits. This should be extremely fast, as no disk access is needed and almost no CPU is used. The complexity of the query will not matter either: a simple query will run just as fast as something with 12 sorts and 28 joins.

If a query is *not* already in the cache, then after the results are found and delivered to the user, the database will try to store them for the next appearance of that query. First, the size of the cache will be compared to the size of the query+output, to see if there is room for it. If there is, the query will be saved with a status of valid, a time of 'now', a count of 1, a list of all affected columns found by parsing the query, and the total size of the query+output. If there is no room, then it will try to delete one or more entries to make room. Deleting can be done based on the oldest access time, the smallest access count, or the size of the query+output. Some balance of the first two would probably work best, with the access time being the most important. Everything will be configurable, of course.

Whenever a table is changed, the cache must be checked as well. A list of all columns that were actually changed is computed and compared against the list of columns for each query. At the first sign of a match, the query is marked as "invalid." This should happen before the changes are made to the table itself. We do not delete the query immediately, since this may be inside of a transaction and subject to rollback. However, we do need to mark it as invalid for the current user inside the current transaction: thus, the status flag. When the transaction is committed, all queries that have an "invalid" flag are deleted, then the tables are changed. Since the only time a query can be flagged as "invalid" is inside your own transaction, the deletion can be done very quickly.
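To make the bookkeeping concrete, here is a very rough sketch of the result cache (in Python purely for readability; the real thing would of course live inside the backend, and the names, the size accounting, and the eviction rule here are all just placeholders):

import time

class ResultCacheEntry:
    def __init__(self, query, rows, columns):
        self.query = query                # normalized query text
        self.rows = rows                  # the cached output
        self.columns = columns            # set of (schema, table, column) used
        self.valid = True                 # the status flag
        self.last_access = time.time()
        self.access_count = 1
        self.size = len(query) + sum(len(str(row)) for row in rows)

def normalize(query):
    # strip extra whitespace and standardize case so that identical
    # queries produce identical keys (a plain strcmp-style match)
    return " ".join(query.split()).lower()

class ResultCache:
    def __init__(self, max_bytes=1000000):
        self.entries = {}                 # normalized query -> entry
        self.max_bytes = max_bytes
        self.used = 0

    def lookup(self, query):
        entry = self.entries.get(normalize(query))
        if entry is not None and entry.valid:
            entry.last_access = time.time()
            entry.access_count += 1
            return entry.rows             # no disk, no planner, no executor
        return None

    def store(self, query, rows, columns):
        entry = ResultCacheEntry(normalize(query), rows, columns)
        while self.used + entry.size > self.max_bytes and self.entries:
            # evict by oldest access time; a better rule would also
            # weigh the access count and the entry size
            victim = min(self.entries.values(), key=lambda e: e.last_access)
            self.used -= victim.size
            del self.entries[victim.query]
        if self.used + entry.size <= self.max_bytes:
            self.entries[entry.query] = entry
            self.used += entry.size

    def invalidate(self, changed_columns):
        # called before a table change; actual deletion waits for commit
        for entry in self.entries.values():
            if entry.columns & changed_columns:
                entry.valid = False

    def end_transaction(self):
        # at commit, throw away everything flagged invalid
        for key in [k for k, e in self.entries.items() if not e.valid]:
            self.used -= self.entries[key].size
            del self.entries[key]

A real version would track the invalid flag per transaction rather than globally, but the lookup, store, and invalidate paths above are the important part.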
II. Query plan caching

If a query is not in the result cache, then it "falls through" to the next level of caching, the query plan. This can either be automatic or strictly at the user's request (i.e. through the prepare/execute paradigm). The latter is probably better, but it also would not hurt much to store non-explicitly prepared queries in this cache as long as there is room.

This cache has a field for the query itself, the plan to be followed (i.e. scan this table, that index, sort the results, then group them), the columns used, the access time, the access count, and the total size. It may also want a simple "prepared or non-prepared" flag, where prepared indicates an explicitly prepared statement that has placeholders for future values. A good optimizer will actually change the plan based on the values plugged into a prepared query, so that information should become part of the query itself as needed, and multiple cached entries may exist to handle different inputs. In general, most of the inputs will be similar enough to use the same path (e.g. "SELECT flavor FROM foo WHERE size=?" will usually see a simple numeric value for the executes).

If a match *is* found, then the database can use the stored path and does not have to bother calling the optimizer to figure it out. It then updates the access time and the access count, and continues as normal.

If a match was *not* found, then it might possibly want to be cached. Certainly, explicit prepares should always be cached. Non-explicitly prepared queries (those without placeholders) can also be cached. In theory, some of these will also be in the result cache, so that should be checked as well: if it is there, there is no reason to put it here. Prepared queries should always have priority over non-prepared, and the rest of the rules above for the result cache should also apply, with the caveat that things that would affect the output of the optimizer (e.g. vacuuming) should also be taken into account when deleting entries.
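As before, a throwaway Python sketch with invented names; the only point is that the optimizer runs exactly once per distinct query:

import time

class PlanCacheEntry:
    def __init__(self, query, plan, prepared):
        self.query = query            # e.g. "select flavor from foo where size=?"
        self.plan = plan              # whatever the optimizer produced
        self.prepared = prepared      # explicitly prepared statement or not
        self.last_access = time.time()
        self.access_count = 0

class PlanCache:
    def __init__(self):
        self.entries = {}             # normalized query -> entry

    def get_plan(self, query, optimizer):
        key = " ".join(query.split()).lower()
        entry = self.entries.get(key)
        if entry is None:
            # cache miss: this is the only place the optimizer is called
            entry = PlanCacheEntry(key, optimizer(key), prepared="?" in key)
            self.entries[key] = entry
        entry.last_access = time.time()
        entry.access_count += 1
        return entry.plan

# 300 executes of the same prepared statement get planned exactly once
plans = PlanCache()
pretend_optimizer = lambda q: ("index scan on foo_size_idx", "fetch flavor")
for size in range(300):
    path = plans.get_plan("SELECT flavor FROM foo WHERE size=?", pretend_optimizer)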
III. Relation caching

The final cache holds the relation itself, and simply involves putting an entire relation into memory. This cache has a field for the name of the relation, the relation data itself, the type (indexes should ideally be cached more readily than tables, for example), the access time, and the access count. Loading could be done automatically, but most likely should be done according to a flag on the table itself or as an explicit command by the user.
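And a matching sketch for this one, where "data" simply stands in for however the relation would actually be held in memory:

import time

class RelationCacheEntry:
    def __init__(self, name, data, kind):
        self.name = name              # e.g. "public.foo" or "public.foo_pkey"
        self.data = data              # the relation held in memory
        self.kind = kind              # "index" or "table"; indexes get priority
        self.last_access = time.time()
        self.access_count = 0

class RelationCache:
    def __init__(self):
        self.entries = {}

    def load(self, name, data, kind):
        # loading is driven by a flag on the relation or an explicit command
        self.entries[name] = RelationCacheEntry(name, data, kind)

    def read(self, name):
        entry = self.entries.get(name)
        if entry is None:
            return None               # fall back to normal disk access
        entry.last_access = time.time()
        entry.access_count += 1
        return entry.data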
Notes:

The "strcmp" used may seem rather crude, as it misses all but the exact same query, but it does cover most of the everyday cases. Queries are usually called through an application that keeps them in the same format, time after time, so the queries are very often exactly identical. A better parser would help, of course, but it would get complicated quickly. Two quick examples: a web page that is built from a database issues the same query many times with exactly the same syntax; a user doing a "refresh" constantly re-runs the same query to check whether something has changed since they last looked.

Sometimes a query may jump back to a previous type of cache, especially for things like subselects. The entire query may not match, but the inner query should also be checked against the query result cache.

Each cache should have some configurable parameters, including the size in RAM, the maximum number of entries, and rules for adding and deleting. They should also be directly viewable through a system table, so a DBA can quickly see exactly which queries are being cached and how often they are being used. There should be commands to quickly flush a cache, remove "old" entries, and populate the query plan cache via a prepare statement. It should also be possible to make table changes without stopping to check the cache: perhaps flushing the cache and setting a global "cache is empty" flag would suffice.

Another problem is prepare and execute: you are not always guaranteed to get a cached prepare when you do an execute, as it may have expired or there may simply be no more room. Prepared statements inside a transaction should probably be flagged as "non-deletable" until the transaction has ended.

Storing the results of an execute in the query result cache is another problem. When a prepare is made, the database returns a link to that exact prepared statement in the cache, so that all the client has to say is "run the query at 0x4AB3 with the value 12". Ideally, the database should be able to check these against the query result cache as well. It can do this by reconstructing the resulting query (by plugging the value into the prepared statement), or it can store the execute request as a type of query itself; instead of "SELECT baz FROM bar WHERE size=12" it would store "p0x4AB3:12".
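To illustrate that last idea, the two options look something like this (another disposable Python sketch; quoting and type handling are ignored):

def execute_cache_key(handle, values):
    # option one: store the execute itself as a pseudo-query, e.g. "p0x4AB3:12"
    return "p%s:%s" % (handle, ",".join(str(v) for v in values))

def reconstructed_query(statement_text, values):
    # option two: plug the values back into the placeholders and use the
    # normal result-cache key (real code would have to quote strings, etc.)
    for v in values:
        statement_text = statement_text.replace("?", str(v), 1)
    return " ".join(statement_text.split()).lower()

# execute_cache_key("0x4AB3", [12])                          -> "p0x4AB3:12"
# reconstructed_query("SELECT baz FROM bar WHERE size=?", [12])
#                                    -> "select baz from bar where size=12"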
The result cache would probably be the easiest to implement, and also gives the most "bang for the buck." The path cache may be a bit harder, but is a very necessary feature. I don't know about the relation caching: it looks to be fairly easy, and I don't trust that, so I am guessing it is actually quite difficult.

Greg Sabino Mullane  greg@turnstep.com
PGP Key: 0x14964AC8 200202281132