Re: Very busy 24x7 databases and VACUUM - Mailing list pgsql-admin
From | Christopher Browne |
---|---|
Subject | Re: Very busy 24x7 databases and VACUUM |
Date | |
Msg-id | m3k6rwm53s.fsf@knuth.knuth.cbbrowne.com Whole thread Raw |
In response to | Very busy 24x7 databases and VACUUM ("David F. Skoll" <dfs@roaringpenguin.com>) |
Responses |
Re: Very busy 24x7 databases and VACUUM
|
List | pgsql-admin |
In the last exciting episode, dfs@roaringpenguin.com ("David F. Skoll") wrote: > Does anyone run a very busy PostgreSQL datatabase, with lots of read > and write operations that run 24x7? (We're talking on the > neighbourhood of 40 to 60 queries/second, with probably 5% to 10% of > them being INSERT or UPDATE.) Yup... [Hand goes up...] > Some of our clients run such a DB, and the nightly VACUUM slows > things down tremendously while it is running. I see that in 8.0, > you can reduce the VACUUM's I/O impact, but from reading the code, > it also looks like that means the VACUUM will hold locks for longer, > which is probably bad news. Yes, there's a trade-off there. The "lazier" vacuum will indeed hold its locks longer. > Doing VACUUM more often than nightly is not a good idea; the tables > tend to be pretty large and it looks like VACUUM has to scan all the > tuples each time. (The nightly VACUUM is already taking several > hours in some cases.) Yes, vacuum does need to scan all the tuples. There's no shortcut at this point. > How do we handle this situation? Are there any plans for some kind > of "incremental" vacuum that recovers a few pages here and there in > the background? Is such a thing even possible? There has been some talk of a "VACUUM CACHE" idea, where the idea would be to walk through the shared buffer cache and vacuum just those pages. Recently updated pages ought to be in the cache, so we might expect this to be reasonably fruitful, as well as being rather quick. If that were to work out well, I would think it potentially fruitful to have a perhaps longer list of "pages of interest" whereby UPDATE/DELETE operations might throw pages that they touch into a queue for later re-examination. In a table that contains both "active regions" and large, seldom-updated "inactive regions," it would be nice to have a way to focus on the "active" bits. > If we defer some write operations until after the VACUUM has > finished, will that speed up the VACUUM? There are some things we > can save up until after VACUUM is finished. One thing that would be somewhat helpful would be to be sure that each of the tables that you are vacuuming is handled in a separate transaction. Supposing it's six tables that each take 1/2h to vacuum, if you do each in a separate transaction, some locks may get established over and over, but the "lease" will be dropped and renewed each half hour, which ought to be helpful. Any MVCC-relevant logic will just have 1/2h periods for which tuples are held onto instead of there being a big 3h "lock" put on them. That's not an explicit lock, but rather an inability to purge entries out... -- (format nil "~S@~S" "cbbrowne" "gmail.com") http://www.ntlug.org/~cbbrowne/rdbms.html If two people love each other, there can be no happy end to it. -- Hemingway
pgsql-admin by date: