From 8f497df592b6d0123c4a17e0e5ccb9cbd403cb0c Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Sat, 22 Apr 2023 13:04:13 -0700 Subject: [PATCH v2 8/9] Overhaul freezing and wraparound docs. This is almost a complete rewrite. "Preventing Transaction ID Wraparound Failures" becomes "Freezing to manage the transaction ID space". This is follow-up work to commit 1de58df4, which added page-level freezing to VACUUM. The emphasis is now on the physical work of freezing pages. This flows a little better than it otherwise would due to recent structural cleanups to maintenance.sgml; discussion about freezing now immediately follows discussion of cleanup of dead tuples. We still talk about the problem of the system activating xidStopLimit protections in the same section, but we use much less alarmist language about data corruption, and are no longer overly concerned about the very worst case. We don't rescind the recommendation that users recover from an xidStopLimit outage by using single user mode, though that seems like something we should aim to do in the near future. There is no longer a separate sect3 to discuss MultiXactId related issues. VACUUM now performs exactly the same processing steps when it freezes a page, independent of the trigger condition. Also describe the page-level freezing FPI optimization added by commit 1de58df4. This is expected to trigger the majority of all freezing with many types of workloads. 
--- doc/src/sgml/config.sgml | 20 +- doc/src/sgml/logicaldecoding.sgml | 2 +- doc/src/sgml/maintenance.sgml | 738 ++++++++++++++-------- doc/src/sgml/ref/create_table.sgml | 2 +- doc/src/sgml/ref/prepare_transaction.sgml | 2 +- doc/src/sgml/ref/vacuum.sgml | 6 +- doc/src/sgml/ref/vacuumdb.sgml | 4 +- doc/src/sgml/xact.sgml | 4 +- 8 files changed, 514 insertions(+), 264 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index b56f073a9..fa825b5f1 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -8359,7 +8359,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; Note that even when this parameter is disabled, the system will launch autovacuum processes if necessary to prevent transaction ID wraparound. See for more information. + linkend="freezing-xid-space"/> for more information. @@ -8548,7 +8548,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; This parameter can only be set at server start, but the setting can be reduced for individual tables by changing table storage parameters. - For more information see . + For more information see . @@ -8577,7 +8577,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; 400 million multixacts. This parameter can only be set at server start, but the setting can be reduced for individual tables by changing table storage parameters. - For more information see . + For more information see . @@ -9284,7 +9284,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; periodic manual VACUUM has a chance to run before an anti-wraparound autovacuum is launched for the table. For more information see - . + . @@ -9306,7 +9306,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; the value of , so that there is not an unreasonably short time between forced autovacuums. For more information see . + linkend="freezing-xid-space"/>. 
@@ -9343,7 +9343,8 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; set this value anywhere from zero to 2.1 billion, VACUUM will silently adjust the effective value to no less than 105% of . + linkend="guc-autovacuum-freeze-max-age"/>. For more + information see . @@ -9367,7 +9368,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; , so that a periodic manual VACUUM has a chance to run before an anti-wraparound is launched for the table. - For more information see . + For more information see . @@ -9388,7 +9389,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; the value of , so that there is not an unreasonably short time between forced autovacuums. - For more information see . + For more information see . @@ -9421,7 +9422,8 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; this value anywhere from zero to 2.1 billion, VACUUM will silently adjust the effective value to no less than 105% of . + linkend="guc-autovacuum-multixact-freeze-max-age"/>. For more + information see . diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml index cbd3aa804..80dade3be 100644 --- a/doc/src/sgml/logicaldecoding.sgml +++ b/doc/src/sgml/logicaldecoding.sgml @@ -353,7 +353,7 @@ postgres=# select * from pg_logical_slot_get_changes('regression_slot', NULL, NU because neither required WAL nor required rows from the system catalogs can be removed by VACUUM as long as they are required by a replication slot. In extreme cases this could cause the database to shut down to prevent - transaction ID wraparound (see ). + transaction ID wraparound (see ). So if a slot is no longer required it should be dropped. 
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index 7476e5922..675f6945d 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -275,15 +275,21 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu - To protect against loss of very old data due to - transaction ID wraparound or - multixact ID wraparound. + To maintain the system's ability to allocate new + transaction IDs (and new multixact IDs) through freezing. To update the visibility map, which speeds up index-only - scans. + scans, and helps the next VACUUM + operation avoid needlessly scanning pages that are already + frozen + + + + To truncate obsolescent transaction status information, + when possible @@ -432,303 +438,491 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu - - Preventing Transaction ID Wraparound Failures - - - transaction ID - wraparound - + + Freezing to manage the transaction ID space - wraparound - of transaction IDs + Freezing + of transaction IDs and multixact IDs - PostgreSQL's MVCC transaction semantics depend on - being able to compare transaction - ID numbers (XID) to determine - whether or not the row is visible to each query's MVCC snapshot - (see - interpreting XID stamps from tuple headers). But since - on-disk storage of transaction IDs in heap pages uses a truncated - 32-bit representation to save space (rather than the full 64-bit - representation), it is necessary to vacuum every table in every - database at least once every two billion - transactions (though far more frequent vacuuming is typical). + VACUUM often marks certain pages + frozen, indicating that all eligible rows on + the page were inserted by a transaction that committed + sufficiently far in the past that the effects of the inserting + transaction are certain to be visible to all current and future + transactions. 
The specific transaction ID number + (XID) stored in a frozen heap + row's xmin field is no longer needed to + determine anything about the row's visibility. + Furthermore, when a row undergoing freezing happens to have an XID + set in its xmax field (possibly an XID + left behind by an earlier SELECT FOR UPDATE row + locker), the xmax field's XID is + typically also removed (actually, xmax + is set to the special XID value 0, also known + as InvalidTransactionId). See for further background + information. - - controls how old an XID value has to be before rows bearing that XID will be - frozen. Increasing this setting may avoid unnecessary work if the - rows that would otherwise be frozen will soon be modified again, - but decreasing this setting increases - the number of transactions that can elapse before the table must be - vacuumed again. + Once frozen, heap pages are self-contained. All of + the page's rows can be read by every transaction, without any + transaction ever needing to consult externally stored transaction + status metadata (most notably, transaction commit/abort status + information from pg_xact won't ever be + required). - VACUUM uses the visibility map - to determine which pages of a table must be scanned. Normally, it - will skip pages that don't have any dead row versions even if those pages - might still have row versions with old XID values. Therefore, normal - VACUUMs won't always freeze every old row version in the table. - When that happens, VACUUM will eventually need to perform an - aggressive vacuum, which will freeze all eligible unfrozen - XID and MXID values, including those from all-visible but not all-frozen pages. - In practice most tables require periodic aggressive vacuuming. - - controls when VACUUM does that: all-visible but not all-frozen - pages are scanned if the number of transactions that have passed since the - last such scan is greater than vacuum_freeze_table_age minus - vacuum_freeze_min_age. 
Setting - vacuum_freeze_table_age to 0 forces VACUUM to - always use its aggressive strategy. + It can be useful for VACUUM to put off some of + the work of freezing, but freezing cannot be put off indefinitely. + Since on-disk storage of transaction IDs in heap row headers uses + a truncated 32-bit representation to save space (rather than the + full 64-bit representation), freezing plays a crucial role in + enabling management of the XID + address space by VACUUM. If freezing + by VACUUM is somehow impeded (in a database + that continues to allocate new transaction IDs), the system will + eventually refuse to allocate new + transaction IDs. This generally only happens in extreme + cases where the system has been misconfigured. - The maximum time that a table can go unvacuumed is two billion - transactions minus the vacuum_freeze_min_age value at - the time of the last aggressive vacuum. If it were to go - unvacuumed for longer than - that, data loss could result. To ensure that this does not happen, - autovacuum is invoked on any table that might contain unfrozen rows with - XIDs older than the age specified by the configuration parameter . (This will happen even if - autovacuum is disabled.) + can be used to control + when freezing takes place. When VACUUM scans a + heap page containing even one XID that already attained an age + exceeding this value, the page is frozen. + + + + MultiXactId + Freezing of + + + + Multixact IDs are used to support row + locking by multiple transactions. Since there is only limited + space in a tuple header to store lock information, that + information is encoded as a multiple transaction + ID, or multixact ID for short, whenever there is more + than one transaction concurrently locking a row. Information + about which transaction IDs are included in any particular + multixact ID is stored separately, and only the multixact ID + appears in the xmax field in the tuple + header. 
Like transaction IDs, multixact IDs are implemented as a + 32-bit counter and corresponding storage. Since MultiXact IDs are + stored in the xmax field of heap rows + (and have an analogous dependency on external transaction status + information), they may also need to be removed during freezing. - This implies that if a table is not otherwise vacuumed, - autovacuum will be invoked on it approximately once every - autovacuum_freeze_max_age minus - vacuum_freeze_min_age transactions. - For tables that are regularly vacuumed for space reclamation purposes, - this is of little importance. However, for static tables - (including tables that receive inserts, but no updates or deletes), - there is no need to vacuum for space reclamation, so it can - be useful to try to maximize the interval between forced autovacuums - on very large static tables. Obviously one can do this either by - increasing autovacuum_freeze_max_age or decreasing - vacuum_freeze_min_age. + also + controls when freezing takes place. It is analogous to + vacuum_freeze_min_age, but age + is expressed in units of Multixact ID (not in units of XID). + vacuum_multixact_freeze_min_age typically has + only a minimal impact on how many pages are frozen, partly because + VACUUM usually prefers to remove MultiXact IDs + proactively based on low-level considerations around the cost of + freezing. vacuum_multixact_freeze_min_age + forces VACUUM to process + MultiXact IDs in certain rare cases where the implementation would + not ordinarily do so. - The effective maximum for vacuum_freeze_table_age is 0.95 * - autovacuum_freeze_max_age; a setting higher than that will be - capped to the maximum. A value higher than - autovacuum_freeze_max_age wouldn't make sense because an - anti-wraparound autovacuum would be triggered at that point anyway, and - the 0.95 multiplier leaves some breathing room to run a manual - VACUUM before that happens. 
As a rule of thumb, - vacuum_freeze_table_age should be set to a value somewhat - below autovacuum_freeze_max_age, leaving enough gap so that - a regularly scheduled VACUUM or an autovacuum triggered by - normal delete and update activity is run in that window. Setting it too - close could lead to anti-wraparound autovacuums, even though the table - was recently vacuumed to reclaim space, whereas lower values lead to more - frequent aggressive vacuuming. + Managing the added WAL volume from freezing + over time is an important consideration for + VACUUM. This is why VACUUM + doesn't just freeze every eligible tuple at the earliest + opportunity: the WAL written to freeze a page's + tuples goes to waste in cases where the resulting + frozen tuples are soon deleted or updated anyway. It's also why + VACUUM will freeze all + eligible tuples from a heap page once the decision to freeze at + least one tuple is taken: at that point the added cost to freeze + all eligible tuples eagerly (measured in extra bytes of + WAL written) is far lower than the + probable cost of deferring freezing until a future + VACUUM operation against the same table. + Furthermore, once the page is frozen it can be marked all-frozen + in the visibility map right away. + + + + + In PostgreSQL versions before 16, + freezing was triggered at the level of individual + xmin and + xmax fields. Freezing only affected + the exact XIDs that had already attained an age at or exceeding + vacuum_freeze_min_age, regardless of costs. + + + + + VACUUM also triggers freezing of pages in cases + where it already proved necessary to write out an + FPI (full page image) alongside a + WAL record generated while removing dead tuples + (see for background information + about how FPIs provide torn page protection). + This freeze on an FPI write + mechanism is designed to lower the absolute volume of + WAL written over time by + VACUUM, across multiple + VACUUM operations against the same table. 
The + mechanism often prevents future VACUUM + operations from having to write a second FPI + for the same page much later on. In effect, + VACUUM writes slightly more + WAL in the short term with the aim of + ultimately needing to write much less WAL in + the long term. - The sole disadvantage of increasing autovacuum_freeze_max_age - (and vacuum_freeze_table_age along with it) is that - the pg_xact and pg_commit_ts - subdirectories of the database cluster will take more space, because it - must store the commit status and (if track_commit_timestamp is - enabled) timestamp of all transactions back to - the autovacuum_freeze_max_age horizon. The commit status uses - two bits per transaction, so if - autovacuum_freeze_max_age is set to its maximum allowed value - of two billion, pg_xact can be expected to grow to about half - a gigabyte and pg_commit_ts to about 20GB. If this - is trivial compared to your total database size, - setting autovacuum_freeze_max_age to its maximum allowed value - is recommended. Otherwise, set it depending on what you are willing to - allow for pg_xact and pg_commit_ts storage. - (The default, 200 million transactions, translates to about 50MB - of pg_xact storage and about 2GB of pg_commit_ts - storage.) + VACUUM may not be able to freeze every tuple's + xmin in relatively rare cases. The + criterion that determines basic eligibility for freezing is exactly + the same as the one that determines if a deleted tuple should be + considered removable or merely dead + but not yet removable (namely, the XID-based + removable cutoff). In extreme cases a + long-running transaction can hold back every + VACUUM's removable cutoff for so long that the + system is eventually forced to activate xidStopLimit mode + protections. - - One disadvantage of decreasing vacuum_freeze_min_age is that - it might cause VACUUM to do useless work: freezing a row - version is a waste of time if the row is modified - soon thereafter (causing it to acquire a new XID). 
So the setting should - be large enough that rows are not frozen until they are unlikely to change - any more. - + + <command>VACUUM</command>'s aggressive strategy - - To track the age of the oldest unfrozen XIDs in a database, - VACUUM stores XID - statistics in the system tables pg_class and - pg_database. In particular, - the relfrozenxid column of a table's - pg_class row contains the oldest remaining unfrozen - XID at the end of the most recent VACUUM that successfully - advanced relfrozenxid (typically the most recent - aggressive VACUUM). Similarly, the - datfrozenxid column of a database's - pg_database row is a lower bound on the unfrozen XIDs - appearing in that database — it is just the minimum of the - per-table relfrozenxid values within the database. - A convenient way to - examine this information is to execute queries such as: + + transaction ID + wraparound + + + + wraparound + of transaction IDs and multixact IDs + + + + As already noted briefly in the introductory section, freezing + doesn't just allow queries to avoid lookups of subsidiary + transaction status information in structures such as + pg_xact. Freezing also plays a crucial role + in enabling management of the XID address space by + VACUUM. + + + + VACUUM maintains information about the oldest + unfrozen XID that remains in the table when it uses its + aggressive strategy. This information is + stored in the pg_class system table at + the end of each aggressive VACUUM: the table + processed by aggressive VACUUM has its + pg_class.relfrozenxid + updated (relfrozenxid + advances by a certain number of XIDs). Similarly, + the datfrozenxid column of a + database's pg_database row is a lower + bound on the unfrozen XIDs appearing in that database — it + is just the minimum of the per-table + relfrozenxid values within the + database. The system also maintains + pg_class.relminmxid and + pg_database.datminmxid + fields to track the oldest MultiXact ID, while following + analogous rules. 
+ + + + + When the VACUUM command's + VERBOSE parameter is specified, + VACUUM prints various statistics about the + table. This includes information about how + relfrozenxid and + relminmxid advanced, and the number + of newly frozen pages. The same details appear in the server + log when autovacuum logging (controlled by ) reports on a + VACUUM operation executed by autovacuum. + + + + + This process is intended to reliably prevent the entire database + from ever having a transaction ID that is excessively far in the + past. The maximum distance that the system can + tolerate between the oldest unfrozen transaction ID and the next + (unallocated) transaction ID is about 2.1 billion transaction + IDs. That is an upper limit; the greatest + age(relfrozenxid)/age(datfrozenxid) + in the system should ideally never exceed a fraction of that + upper limit. If that upper limit is ever reached, then the + system will activate xidStopLimit mode + protections. These protections will remain in force + until VACUUM (typically autovacuum) manages to + advance the oldest datfrozenxid in the cluster + (by advancing that database's oldest relfrozenxid via an + aggressive VACUUM). + + + + The 2.1 billion XIDs distance invariant is a + consequence of the fact that on-disk storage of transaction IDs + in heap row headers uses a truncated 32-bit representation to + save space (rather than the full 64-bit representation). Since + all unfrozen transaction IDs from heap tuple headers + must be from the same transaction ID epoch + (which is what the invariant actually assures), there isn't any + need to store a separate epoch field in each tuple header. The + downside is that the system depends on freezing (and + relfrozenxid advancement during + aggressive VACUUMs) to make sure that the + demand for transaction IDs never exceeds + the available supply. + + + + + In practice most tables require periodic aggressive vacuuming. 
+ However, some individual non-aggressive + VACUUM operations may be able to advance + relfrozenxid and/or + relminmxid. This is most common in + small, frequently modified tables, where + VACUUM happens to scan all pages (or at least + all pages not marked all-frozen in the visibility map) in the + course of removing dead tuples. + + + + + VACUUM/autovacuum also use and settings as + independent Multixact ID oriented controls for aggressive mode + VACUUM and anti-wraparound autovacuum. + These work analogously to the XID-based + vacuum_freeze_table_age and + autovacuum_freeze_max_age, respectively. + Note, however, that if the multixact members storage + area exceeds 2GB, then the effective value of + autovacuum_multixact_freeze_max_age will be + lower, resulting in more frequent aggressive mode VACUUMs. + + + + There is only one major runtime behavioral difference between + aggressive mode VACUUM and non-aggressive + (standard) VACUUM. Both kinds of + VACUUM use the visibility map to determine which + pages of a table must be scanned, and which can be skipped. + However, only non-aggressive VACUUM will skip + pages that don't have any dead row versions even if those pages + might still have row versions with old XID values; aggressive + VACUUMs are limited to skipping pages already + marked all-frozen (and all-visible). + + + + As a consequence of all this, non-aggressive + VACUUMs usually won't freeze + every page with an old row version in the + table. Most individual tables will eventually need an aggressive + VACUUM, which will reliably freeze all pages + with XID and MXID values older than + vacuum_freeze_min_age, including those from + all-visible but not all-frozen pages (and then update + pg_class). controls when + VACUUM must use its aggressive strategy. + Since the setting is applied against + age(relfrozenxid), settings like + vacuum_freeze_min_age may influence the exact + cadence of aggressive vacuuming. 
Setting + vacuum_freeze_table_age to 0 forces + VACUUM to always use its aggressive strategy. + + + + + Aggressive VACUUMs apply the same rules for + freezing as non-aggressive VACUUMs. You may + nevertheless notice that aggressive VACUUMs + perform a disproportionately large amount of the total required + freezing in larger tables. + + + This is an indirect consequence of the fact that non-aggressive + VACUUMs won't scan pages that are marked + all-visible but not also marked all-frozen in the visibility + map. VACUUM can only consider freezing those + pages that it actually gets to scan. + + + Note in particular that vacuum_freeze_min_age + isn't very likely to trigger freezing in non-aggressive + VACUUMs, at least with default settings. The + freeze on an FPI write + mechanism is somewhat more likely to trigger in non-aggressive + VACUUMs in practice, though. Much depends on + workload characteristics. + + + + + To ensure that every table has its + relfrozenxid advanced at somewhat + regular intervals, including totally static tables, autovacuum is + invoked on any table that might contain unfrozen rows with XIDs + older than the age specified by the configuration parameter . This will happen + even if autovacuum is disabled. + + + + + In practice all anti-wraparound autovacuums will use + VACUUM's aggressive strategy. This is assured + because the effective value of + vacuum_freeze_table_age is + clamped to a value no greater than 95% of the + current value of autovacuum_freeze_max_age. + As a rule of thumb, vacuum_freeze_table_age + should be set to a value somewhat below + autovacuum_freeze_max_age, leaving enough gap + so that a regularly scheduled VACUUM or an + autovacuum triggered by inserts, updates and deletes is run in + that window. Anti-wraparound autovacuums can be avoided + altogether in tables that reliably receive + some VACUUMs that use the + aggressive strategy. 
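The clamping rule described above, and the resulting choice of strategy, can be sketched as follows. This is a simplified model rather than VACUUM's actual code: the function names are hypothetical, the strict comparison against age(relfrozenxid) is an assumption based on the description in the text, and the multixact-based triggers and the FREEZE option are ignored.

```python
# Simplified model of how the effective vacuum_freeze_table_age is derived.
# All function names here are hypothetical, not PostgreSQL internals.


def effective_freeze_table_age(vacuum_freeze_table_age: int,
                               autovacuum_freeze_max_age: int) -> int:
    """Clamp vacuum_freeze_table_age to 95% of autovacuum_freeze_max_age."""
    cap = autovacuum_freeze_max_age * 95 // 100
    return min(vacuum_freeze_table_age, cap)


def uses_aggressive_strategy(relfrozenxid_age: int,
                             vacuum_freeze_table_age: int,
                             autovacuum_freeze_max_age: int) -> bool:
    """Assumed rule: aggressive once age(relfrozenxid) passes the clamped value."""
    limit = effective_freeze_table_age(vacuum_freeze_table_age,
                                       autovacuum_freeze_max_age)
    return relfrozenxid_age > limit


if __name__ == "__main__":
    # An oversized vacuum_freeze_table_age is clamped to 190 million when
    # autovacuum_freeze_max_age is 200 million.
    print(effective_freeze_table_age(500_000_000, 200_000_000))
    # A table old enough to need an anti-wraparound autovacuum is therefore
    # always past the clamped threshold as well.
    print(uses_aggressive_strategy(200_000_001, 500_000_000, 200_000_000))
```

Because the clamped value always sits below autovacuum_freeze_max_age, any table old enough to trigger an anti-wraparound autovacuum has, in this model, already crossed the threshold for the aggressive strategy.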
+ + + + A convenient way to examine information about + relfrozenxid and + relminmxid is to execute queries such as: SELECT c.oid::regclass as table_name, - greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age +greatest(age(c.relfrozenxid), + age(t.relfrozenxid)) as xid_age, +mxid_age(c.relminmxid) FROM pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid WHERE c.relkind IN ('r', 'm'); -SELECT datname, age(datfrozenxid) FROM pg_database; +SELECT datname, +age(datfrozenxid) as xid_age, +mxid_age(datminmxid) +FROM pg_database; - The age column measures the number of transactions from the - cutoff XID to the current transaction's XID. - - - - - When the VACUUM command's VERBOSE - parameter is specified, VACUUM prints various - statistics about the table. This includes information about how - relfrozenxid and - relminmxid advanced, and the number of - newly frozen pages. The same details appear in the server log when - autovacuum logging (controlled by ) reports on a - VACUUM operation executed by autovacuum. + The age column measures the number of + transactions from the cutoff XID to the next unallocated + transaction ID. The mxid_age column + measures the number of MultiXactIds from the cutoff MultiXactId + to the next unallocated multixact ID. - + - - VACUUM normally only scans pages that have been modified - since the last vacuum, but relfrozenxid can only be - advanced when every page of the table - that might contain unfrozen XIDs is scanned. This happens when - relfrozenxid is more than - vacuum_freeze_table_age transactions old, when - VACUUM's FREEZE option is used, or when all - pages that are not already all-frozen happen to - require vacuuming to remove dead row versions. When VACUUM - scans every page in the table that is not already all-frozen, it should - set age(relfrozenxid) to a value just a little more than the - vacuum_freeze_min_age setting - that was used (more by the number of transactions started since the - VACUUM started). 
VACUUM - will set relfrozenxid to the oldest XID - that remains in the table, so it's possible that the final value - will be much more recent than strictly required. - If no relfrozenxid-advancing - VACUUM is issued on the table until - autovacuum_freeze_max_age is reached, an autovacuum will soon - be forced for the table. - - - - If for some reason autovacuum fails to clear old XIDs from a table, the - system will begin to emit warning messages like this when the database's - oldest XIDs reach forty million transactions from the wraparound point: + + <literal>xidStopLimit</literal> mode + + If for some reason autovacuum utterly fails to advance any + table's relfrozenxid or + relminmxid for an extended period, and + if XIDs and/or MultiXactIds continue to be allocated, the system + will begin to emit warning messages like this when the database's + oldest XIDs reach forty million transactions from the wraparound + point: WARNING: database "mydb" must be vacuumed within 39985967 transactions HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database. - (A manual VACUUM should fix the problem, as suggested by the - hint; but note that the VACUUM must be performed by a - superuser, else it will fail to process system catalogs and thus not - be able to advance the database's datfrozenxid.) - If these warnings are - ignored, the system will shut down and refuse to start any new - transactions once there are fewer than three million transactions left - until wraparound: + (A manual VACUUM should fix the problem, as suggested by the + hint; but note that the VACUUM must be performed by a + superuser, else it will fail to process system catalogs and thus not + be able to advance the database's datfrozenxid.) + If these warnings are ignored, the system will eventually refuse + to start any new transactions. 
This happens at the point that + there are fewer than three million transactions left: ERROR: database is not accepting commands to avoid wraparound data loss in database "mydb" HINT: Stop the postmaster and vacuum that database in single-user mode. - The three-million-transaction safety margin exists to let the - administrator recover without data loss, by manually executing the - required VACUUM commands. However, since the system will not - execute commands once it has gone into the safety shutdown mode, - the only way to do this is to stop the server and start the server in single-user - mode to execute VACUUM. The shutdown mode is not enforced - in single-user mode. See the reference - page for details about using single-user mode. - - - - Multixacts and Wraparound - - - MultiXactId - - - - wraparound - of multixact IDs - - - - Multixact IDs are used to support row locking by - multiple transactions. Since there is only limited space in a tuple - header to store lock information, that information is encoded as - a multiple transaction ID, or multixact ID for short, - whenever there is more than one transaction concurrently locking a - row. Information about which transaction IDs are included in any - particular multixact ID is stored separately in - the pg_multixact subdirectory, and only the multixact ID - appears in the xmax field in the tuple header. - Like transaction IDs, multixact IDs are implemented as a - 32-bit counter and corresponding storage, all of which requires - careful aging management, storage cleanup, and wraparound handling. - There is a separate storage area which holds the list of members in - each multixact, which also uses a 32-bit counter and which must also - be managed. + The three-million-transaction safety margin exists to let the + administrator recover without data loss, by manually executing the + required VACUUM commands. 
However, since the system will not + execute commands once it has gone into the safety shutdown mode, + the only way to do this is to stop the server and start the server in single-user + mode to execute VACUUM. The shutdown mode is not enforced + in single-user mode. See the reference + page for details about using single-user mode. - Whenever VACUUM scans any part of a table, it will replace - any multixact ID it encounters which is older than - - by a different value, which can be the zero value, a single - transaction ID, or a newer multixact ID. For each table, - pg_class.relminmxid stores the oldest - possible multixact ID still appearing in any tuple of that table. - If this value is older than - , an aggressive - vacuum is forced. As discussed in the previous section, an aggressive - vacuum means that only those pages which are known to be all-frozen will - be skipped. mxid_age() can be used on - pg_class.relminmxid to find its age. + Anything that influences when and how + relfrozenxid and + relminmxid advance will also directly + affect the high watermark storage overhead from storing a great + deal of historical transaction status information. The + additional space + overhead is usually of fairly minimal concern. It is + noted as an additional downside of allowing the system to get + close to xidStopLimit for the sake of + completeness. - - Aggressive VACUUMs, regardless of what causes - them, are guaranteed to be able to advance - the table's relminmxid. - Eventually, as all tables in all databases are scanned and their - oldest multixact values are advanced, on-disk storage for older - multixacts can be removed. - + + Historical Note + + The term wraparound is inaccurate. Also, there + is no data loss here — the message is + simply wrong. + + + XXX: We really need to fix the situation with single user mode + to put things on a good footing. 
+ + - As a safety device, an aggressive vacuum scan will - occur for any table whose multixact-age is greater than . Also, if the - storage occupied by multixacts members exceeds 2GB, aggressive vacuum - scans will occur more often for all tables, starting with those that - have the oldest multixact-age. Both of these kinds of aggressive - scans will occur even if autovacuum is nominally disabled. + In emergencies, VACUUM will take extraordinary + measures to avoid xidStopLimit mode. A + failsafe mechanism is triggered when thresholds controlled by + and are reached. The + failsafe prioritizes advancing + relfrozenxid and/or + relminmxid as quickly as possible. + Once the failsafe triggers, VACUUM bypasses + all remaining non-essential maintenance tasks, and stops applying + any cost-based delay that was in effect. Any Buffer Access + Strategy in use will also be disabled. @@ -766,6 +960,58 @@ HINT: Stop the postmaster and vacuum that database in single-user mode. + + Truncating transaction status information + + As noted in , anything that + influences when and how relfrozenxid and + relminmxid advance will also directly + affect the high watermark storage overhead needed to store + historical transaction status information. For example, + increasing autovacuum_freeze_max_age (and + vacuum_freeze_table_age along with it) will + make the pg_xact and + pg_commit_ts subdirectories of the database + cluster take more space, because they store the commit status and + (if track_commit_timestamp is enabled) + timestamp of all transactions back to the + datfrozenxid horizon (the earliest + datfrozenxid in the entire cluster). + + + The commit status uses two bits per transaction. The default + autovacuum_freeze_max_age setting of 200 + million transactions translates to about 50MB of + pg_xact storage. When + track_commit_timestamp is enabled, about 2GB of + pg_commit_ts storage will also be required. 
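The storage figures quoted above follow from simple arithmetic, which the following minimal sketch reproduces. The two bits per transaction come from the text; the 10-byte size of a pg_commit_ts entry (an 8-byte timestamp plus a 2-byte origin identifier) is an assumption chosen because it matches the "about 2GB" figure, and the function names are hypothetical.

```python
# Back-of-the-envelope estimate of transaction status storage.
# Sizes are decimal (1 MB = 10**6 bytes), matching the rounded figures in the text.

BITS_PER_XACT_STATUS = 2      # stated in the text: two bits of commit status per transaction
COMMIT_TS_ENTRY_BYTES = 10    # ASSUMPTION: 8-byte timestamp + 2-byte origin identifier


def pg_xact_bytes(xid_horizon: int) -> int:
    """Bytes of pg_xact needed to cover `xid_horizon` transactions."""
    return xid_horizon * BITS_PER_XACT_STATUS // 8


def pg_commit_ts_bytes(xid_horizon: int) -> int:
    """Bytes of pg_commit_ts needed when track_commit_timestamp is enabled."""
    return xid_horizon * COMMIT_TS_ENTRY_BYTES


if __name__ == "__main__":
    horizon = 200_000_000  # default autovacuum_freeze_max_age
    print(f"pg_xact:      {pg_xact_bytes(horizon) / 10**6:.0f} MB")
    print(f"pg_commit_ts: {pg_commit_ts_bytes(horizon) / 10**9:.0f} GB")
```

Both estimates scale linearly with the horizon, so doubling autovacuum_freeze_max_age roughly doubles the high watermark for both directories.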
+ + + MultiXactId status information is implemented as two separate + SLRU storage areas: + pg_multixact/members and + pg_multixact/offsets. There is no simple + formula to determine the storage overhead per MultiXactId, since + MultiXactIds have a variable number of member XIDs. + + + Truncation of transaction status information is only possible at + the end of VACUUMs that advance + relfrozenxid (in the case of + pg_xact and + pg_commit_ts) or + relminmxid (in the case of + pg_multixact/members and + pg_multixact/offsets) of whatever table + happened to have the oldest value in the cluster when the + VACUUM began. This typically happens very + infrequently, often during aggressive strategy + VACUUMs of one of the database's largest + tables. + + + Updating Planner Statistics @@ -881,7 +1127,7 @@ HINT: Stop the postmaster and vacuum that database in single-user mode. - + diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml index 10ef699fa..8aa332fcf 100644 --- a/doc/src/sgml/ref/create_table.sgml +++ b/doc/src/sgml/ref/create_table.sgml @@ -1515,7 +1515,7 @@ WITH ( MODULUS numeric_literal, REM and/or ANALYZE operations on this table following the rules discussed in . If false, this table will not be autovacuumed, except to prevent - transaction ID wraparound. See for + transaction ID wraparound. See for more about wraparound prevention. Note that the autovacuum daemon does not run at all (except to prevent transaction ID wraparound) if the diff --git a/doc/src/sgml/ref/prepare_transaction.sgml b/doc/src/sgml/ref/prepare_transaction.sgml index f4f6118ac..ede50d6f7 100644 --- a/doc/src/sgml/ref/prepare_transaction.sgml +++ b/doc/src/sgml/ref/prepare_transaction.sgml @@ -128,7 +128,7 @@ PREPARE TRANSACTION transaction_id This will interfere with the ability of VACUUM to reclaim storage, and in extreme cases could cause the database to shut down to prevent transaction ID wraparound (see ).
Keep in mind also that the transaction + linkend="freezing-xid-space"/>). Keep in mind also that the transaction continues to hold whatever locks it held. The intended usage of the feature is that a prepared transaction will normally be committed or rolled back as soon as an external transaction manager has verified that diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml index 57bc4c23e..0c28604a6 100644 --- a/doc/src/sgml/ref/vacuum.sgml +++ b/doc/src/sgml/ref/vacuum.sgml @@ -123,7 +123,9 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ aggressive strategy. Specifying FREEZE is equivalent to performing VACUUM with the and @@ -219,7 +221,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ ). However, the + (see ). However, the wraparound failsafe mechanism controlled by will generally trigger automatically to avoid transaction ID wraparound failure, and diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml index da2393783..b61d523c2 100644 --- a/doc/src/sgml/ref/vacuumdb.sgml +++ b/doc/src/sgml/ref/vacuumdb.sgml @@ -233,7 +233,7 @@ PostgreSQL documentation ID age of at least mxid_age. This setting is useful for prioritizing tables to process to prevent multixact ID wraparound (see - ). + ). For the purposes of this option, the multixact ID age of a relation is @@ -254,7 +254,7 @@ PostgreSQL documentation transaction ID age of at least xid_age. This setting is useful for prioritizing tables to process to prevent transaction - ID wraparound (see ). + ID wraparound (see ). For the purposes of this option, the transaction ID age of a relation diff --git a/doc/src/sgml/xact.sgml b/doc/src/sgml/xact.sgml index b467660ee..e18ad8fd3 100644 --- a/doc/src/sgml/xact.sgml +++ b/doc/src/sgml/xact.sgml @@ -49,7 +49,7 @@ The internal transaction ID type xid is 32 bits wide - and wraps around every + and wraps around every 4 billion transactions. A 32-bit epoch is incremented during each wraparound. 
There is also a 64-bit type xid8 which includes this epoch and therefore does not wrap around during the @@ -100,7 +100,7 @@ rows and can be inspected using the extension. Row-level read locks might also require the assignment of multixact IDs (mxid; see ). + linkend="freezing-xid-space"/>). -- 2.40.0