Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation |
Date | |
Msg-id | 20230113215956.nx2orfnlcixjyqia@awork3.anarazel.de Whole thread Raw |
In response to | Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
|
List | pgsql-hackers |
Hi, On 2023-01-12 16:08:06 -0500, Robert Haas wrote: > Normally, the XID age of a table never reaches autovacuum_freeze_max_age in > the first place. That's not at all my experience. I often see it being the primary reason for autovacuum vacuuming large tables on busy OLTP systems. Even without any longrunning transactions or such, with available autovac workers and without earlier autovacuums getting interrupted by locks. Once a table is large, reasonable scale factors require a lot of changes to accumulate to trigger an autovacuum, and during a vacuum a lot of transactions complete, leading to large tables having a significant age by the time autovac finishes. The most common "bad" reason for reaching autovacuum_freeze_max_age that I see is cost limits not allowing vacuum to complete on time. Perhaps we should track how often autovacuum was triggered by what reason in a relation's pgstats? That'd make it a lot easier to collect data, both for tuning the thresholds on a system, and for hacking on postgres. Tracking the number of times autovacuum was interruped due to a lock request might be a good idea as well? I think it'd be a good idea to split off the part of the patch that introduces AutoVacType / adds logging for what triggered. That's independently useful, likely uncontroversial and makes the remaining patch smaller. I'd also add the trigger to the pg_stat_activity entry for the autovac worker. Additionally I think we should add information about using failsafe mode to the p_s_a entry. I've wished for a non-wraparound, xid age based, "autovacuum trigger" many times, FWIW. And I've seen plenty of places write their own userspace version of it, because without it they run into trouble. However, I don't like that the patch infers the various thresholds using magic constants / multipliers. autovacuum_freeze_max_age is really a fairly random collection of things: 1) triggers autovacuum on tables based on age, in addition to the dead tuple / inserted tuples triggers 2) prevents auto-cancellation of autovacuum 3) starts autovacuum, even if autovacuum is disabled IME hitting 1) isn't a reason for concern, it's perfectly normal. Needing 2) to make progress is a bit more concerning. 3) should rarely be needed, but is a good safety mechanism. I doubt that controlling all of them via one GUC is sensible. If I understand the patch correctly, we now have the following age based thresholds for av: - force-enable autovacuum: oldest_datfrozenxid + autovacuum_freeze_max_age < nextXid - autovacuum based on age: freeze_max_age = Min(autovacuum_freeze_max_age, table_freeze_max_age) tableagevac = relfrozenxid < recentXid - freeze_max_age - prevent auto-cancellation: freeze_max_age = Min(autovacuum_freeze_max_age, table_freeze_max_age) prevent_auto_cancel_age = Min(freeze_max_age * 2, 1 billion) prevent_auto_cancel = reflrozenxid < recentXid - prevent_auto_cancel_age Is that right? One thing I just noticed: Isn't it completely bonkers that we compute recentXid/recentMulti once at the start of a worker in relation_needs_vacanalyze()? That's fine for the calls in do_autovacuum()'s initial loops over all tables. But seems completely wrong for the later calls via table_recheck_autovac() -> recheck_relation_needs_vacanalyze() -> relation_needs_vacanalyze()? These variables really shouldn't be globals. It makes sense to cache them locally in do_autovacuum(), but reusing them recheck_relation_needs_vacanalyze() and sharing it between launcher and worker is bad. Greetings, Andres Freund
pgsql-hackers by date: