From 8ea9aae6e5e1c7482cacf19eb1930ec0bf916ab8 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan
Date: Sat, 22 Apr 2023 12:33:42 -0700
Subject: [PATCH v2 9/9] Overhaul "Recovering Disk Space" vacuuming docs.

Say a lot more about the possible impact of long-running transactions on
VACUUM. Remove all talk of administrators getting by without autovacuum;
at most administrators might want to schedule manual VACUUM operations to
supplement autovacuum (this documentation was written at a time when the
visibility map didn't exist, even in its most basic form). Also describe
VACUUM FULL as an entirely different kind of operation to conventional
lazy vacuum.
---
 doc/src/sgml/maintenance.sgml | 173 ++++++++++++++++++----------------
 1 file changed, 93 insertions(+), 80 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 675f6945d..0920855ae 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -342,100 +342,113 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
  This approach is necessary to gain the benefits of multiversion
  concurrency control (MVCC, see ): the row version
  must not be deleted while it is still potentially visible to other
- transactions. But eventually, an outdated or deleted row version is no
- longer of interest to any transaction. The space it occupies must then be
- reclaimed for reuse by new rows, to avoid unbounded growth of disk
- space requirements. This is done by running VACUUM.
+ transactions. A deleted row version (whether from an
+ UPDATE or DELETE) will
+ usually cease to be of interest to any still running transaction
+ shortly after the original deleting transaction commits.
- The standard form of VACUUM removes dead row
- versions in tables and indexes and marks the space available for
- future reuse. However, it will not return the space to the operating
- system, except in the special case where one or more pages at the
- end of a table become entirely free and an exclusive table lock can be
- easily obtained. In contrast, VACUUM FULL actively compacts
- tables by writing a complete new version of the table file with no dead
- space. This minimizes the size of the table, but can take a long time.
- It also requires extra disk space for the new copy of the table, until
- the operation completes.
+ The space dead tuples occupy must eventually be reclaimed for
+ reuse by new rows, to avoid unbounded growth of disk space
+ requirements. Reclaiming space from dead rows is
+ VACUUM's main responsibility.
- The usual goal of routine vacuuming is to do standard VACUUMs
- often enough to avoid needing VACUUM FULL. The
- autovacuum daemon attempts to work this way, and in fact will
- never issue VACUUM FULL. In this approach, the idea
- is not to keep tables at their minimum size, but to maintain steady-state
- usage of disk space: each table occupies space equivalent to its
- minimum size plus however much space gets used up between vacuum runs.
- Although VACUUM FULL can be used to shrink a table back
- to its minimum size and return the disk space to the operating system,
- there is not much point in this if the table will just grow again in the
- future. Thus, moderately-frequent standard VACUUM runs are a
- better approach than infrequent VACUUM FULL runs for
- maintaining heavily-updated tables.
-
-
-
- Some administrators prefer to schedule vacuuming themselves, for example
- doing all the work at night when load is low.
- The difficulty with doing vacuuming according to a fixed schedule
- is that if a table has an unexpected spike in update activity, it may
- get bloated to the point that VACUUM FULL is really necessary
- to reclaim space. Using the autovacuum daemon alleviates this problem,
- since the daemon schedules vacuuming dynamically in response to update
- activity. It is unwise to disable the daemon completely unless you
- have an extremely predictable workload. One possible compromise is
- to set the daemon's parameters so that it will only react to unusually
- heavy update activity, thus keeping things from getting out of hand,
- while scheduled VACUUMs are expected to do the bulk of the
- work when the load is typical.
-
-
-
- For those not using autovacuum, a typical approach is to schedule a
- database-wide VACUUM once a day during a low-usage period,
- supplemented by more frequent vacuuming of heavily-updated tables as
- necessary. (Some installations with extremely high update rates vacuum
- their busiest tables as often as once every few minutes.) If you have
- multiple databases in a cluster, don't forget to
- VACUUM each one; the program might be helpful.
+ The XID cutoff point that VACUUM uses to
+ determine whether or not a deleted tuple is safe to physically
+ remove is reported under removable cutoff in
+ the server log when autovacuum logging (controlled by ) reports on a
+ VACUUM operation executed by autovacuum.
+ Tuples that are not yet safe to remove are counted as
+ dead but not yet removable tuples in the log
+ report. VACUUM establishes its
+ removable cutoff once, at the start of the
+ operation. Any older snapshot (or transaction that allocates an
+ XID) that's still running when the cutoff is established may hold
+ it back.
-
- Plain VACUUM may not be satisfactory when
- a table contains large numbers of dead row versions as a result of
- massive update or delete activity. If you have such a table and
- you need to reclaim the excess disk space it occupies, you will need
- to use VACUUM FULL, or alternatively
- CLUSTER
- or one of the table-rewriting variants of
- ALTER TABLE.
- These commands rewrite an entire new copy of the table and build
- new indexes for it. All these options require an
- ACCESS EXCLUSIVE lock. Note that
- they also temporarily use extra disk space approximately equal to the size
- of the table, since the old copies of the table and indexes can't be
- released until the new ones are complete.
-
+
+ It's important that no long running transactions ever be allowed
+ to hold back every VACUUM operation's cutoff
+ for an extended period. You may wish to monitor this.
+
-
+
+
+ Tuples inserted by aborted transactions can be removed by
+ VACUUM immediately
+
+
+
- If you have a table whose entire contents are deleted on a periodic
- basis, consider doing it with
- TRUNCATE rather
- than using DELETE followed by
- VACUUM. TRUNCATE removes the
- entire content of the table immediately, without requiring a
- subsequent VACUUM or VACUUM
- FULL to reclaim the now-unused disk space.
- The disadvantage is that strict MVCC semantics are violated.
+ VACUUM will not return space to the operating
+ system, except in the special case where a group of contiguous
+ pages at the end of a table become entirely free and an exclusive
+ table lock can be easily obtained. This relation truncation
+ behavior can be disabled in tables where the exclusive lock is
+ disruptive by setting the table's vacuum_truncate
+ storage parameter to off.
+
+
+
+ If you have a table whose entire contents are deleted on a
+ periodic basis, consider doing it with TRUNCATE rather
+ than relying on VACUUM.
+ TRUNCATE removes the entire contents of the
+ table immediately, avoiding the need to set
+ xmax to the deleting transaction's XID.
+ One disadvantage is that strict MVCC semantics are violated.
+
+
+
+ VACUUM FULL or CLUSTER can
+ be useful when dealing with extreme amounts of dead tuples. It
+ can reclaim more disk space, but runs much more slowly. It
+ rewrites an entire new copy of the table and rebuilds all indexes.
+ This typically has much higher overhead than
+ VACUUM. Generally, therefore, administrators
+ should avoid using VACUUM FULL except in the
+ most extreme cases.
+
+
+
+
+ Although VACUUM FULL is technically an option
+ of the VACUUM command, VACUUM
+ FULL uses a completely different implementation.
+ VACUUM FULL is essentially a variant of
+ CLUSTER. (The name VACUUM
+ FULL is historical; the original implementation was
+ somewhat closer to standard VACUUM.)
+
+
+
+
+ TRUNCATE, VACUUM FULL, and
+ CLUSTER all require an ACCESS
+ EXCLUSIVE lock, which can be highly disruptive
+ (SELECT, INSERT,
+ UPDATE, and DELETE commands
+ will all be blocked).
+
+
+
+
+ VACUUM FULL (and CLUSTER)
+ temporarily uses extra disk space approximately equal to the size
+ of the table, since the old copies of the table and indexes can't
+ be released until the new ones are complete.
+
+
--
2.40.0