Re: Berserk Autovacuum (let's save next Mandrill) - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Berserk Autovacuum (let's save next Mandrill) |
Date | |
Msg-id | 20200313213851.ejrk5gptnmp65uoo@alap3.anarazel.de Whole thread Raw |
In response to | Re: Berserk Autovacuum (let's save next Mandrill) (Justin Pryzby <pryzby@telsasoft.com>) |
Responses |
Re: Berserk Autovacuum (let's save next Mandrill)
Re: Berserk Autovacuum (let's save next Mandrill) Re: Berserk Autovacuum (let's save next Mandrill) |
List | pgsql-hackers |
Hi, On 2020-03-13 13:44:42 -0500, Justin Pryzby wrote: > As I understand, the initial motivation of this patch was to avoid disruptive > anti-wraparound vacuums on insert-only table. But if vacuum were triggered at > all, it would freeze the oldest tuples, which is all that's needed; especially > since fd31cd2651 "Don't vacuum all-frozen pages.", those pages would never need > to be vacuumed again. Recently written tuples wouldn't be frozen, which is ok, > they're handled next time. > > Another motivation of the patch is to allow indexonly scan, for which the > planner looks at pages' "relallvisible" fraction (and at execution if a page > isn't allvisible, visits the heap). Again, that happens if vacuum were run at > all. Again, some pages won't be marked allvisible, which is fine, they're > handled next time. > > I think freeze_min_age=0 could negatively affect people who have insert-mostly > tables (I'm not concerned, but that includes us). If they consistently hit the > autovacuum insert threshold before the cleanup threshold for updated/deleted > tuples, any updated/deleted tuples would be frozen, which would be > wasteful: I think that's a valid concern. > |One disadvantage of decreasing vacuum_freeze_min_age is that it might cause > |VACUUM to do useless work: freezing a row version is a waste of time if the row > |is modified soon thereafter (causing it to acquire a new XID). So the setting > |should be large enough that rows are not frozen until they are unlikely to > |change any more. I think the overhead here might be a bit overstated. Once a page is dirtied (or already dirty) during vacuum, and we freeze a single row (necessating WAL logging), there's not really a good reason to not also freeze the rest of the row on that page. The added cost for freezing another row is miniscule compared to the "constant" cost of freezing anything on the page. It's of course different if there are otherwise no tuples worth freezing on the page (not uncommon). But there's really no reason for that to be the case: Afaict the only problem with more aggressively freezing when we touch (beyond hint bits) the page anyway is that we commonly end up with multiple WAL records for the same page: 1) lazy_scan_heap()->heap_page_prune() will log a XLOG_HEAP2_CLEAN record, but leave itemids in place most of the time 2) lazy_scan_heap()->log_heap_freeze() will log a XLOG_HEAP2_FREEZE_PAGE record 3a) if no indexes exist/index cleanup is disabled: lazy_vacuum_page()->lazy_vacuum_page() will log a XLOG_HEAP2_CLEAN record, removing dead tuples (including itemids) 3b) if indexes need to be cleaned up, lazy_vacuum_heap()->lazy_vacuum_page() will log a XLOG_HEAP2_CLEAN which is not nice. It likely is worth merging xl_heap_freeze_page into xl_heap_clean, and having heap pruning always freeze once it decides to dirty a page. We could probably always prune dead tuples as part of heap_prune_chain() if there's no indexes - but I'm doubtful it's worth it, since there'll be few tables with lots of dead tuples that don't have indexes. Merging 3b's WAL record would be harder, I think. There's also a significant source of additional WAL records here, one that I think should really not have been introduced: 4) HeapTupleSatisfiesVacuum() called both by heap_prune_chain(), and lazy_scan_heap() will often trigger a WAL record when the checksums or wal_log_hint_bits are enabled. If the page hasn't been modified in the current checkpoint window (extremely common for VACUUM, reasonably common for opportunistic pruning), we will log a full page write. Imo this really should have been avoided when checksums were added, that's a pretty substantial and unnecessary increase in overhead. It's probably overkill to tie fixing the 'insert only' case to improving the WAL logging for vacuuming / pruning. But it'd certainly would largely remove the tradeoff discussed here, by removing additional overhead of freezing in tables that are also updated. > Also, there was a discussion about index cleanup with the conclusion that it > was safer not to skip it, since otherwise indexes might bloat. I think that's > right, since vacuum for cleanup is triggered by the number of dead heap tuples. > To skip index cleanup, I think you'd want a metric for > n_dead_since_index_cleanup. (Or maybe analyze could track dead index tuples > and trigger vacuum of each index separately). > > Having now played with the patch, I'll suggest that 10000000 is too high a > threshold. If autovacuum runs without FREEZE, I don't see why it couldn't be > much lower (100000?) or use (0.2 * n_ins + 50) like the other autovacuum GUC. ISTM that the danger of regressing workloads due to suddenly repeatedly scanning huge indexes that previously were never / rarely scanned is significant (if there's a few dead tuples, otherwise most indexes will be able to skip the scan since the vacuum_cleanup_index_scale_factor introduction)). Greetings, Andres Freund
pgsql-hackers by date: