Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers
From: Matthias van de Meent
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Msg-id: CAEze2WjuQ3xpGAd7+VwsmqDU3RjFMcGZP3WRRFSiA9Yn=pxsrw@mail.gmail.com
In response to: Re: New strategies for freezing, advancing relfrozenxid early (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Tue, 3 Jan 2023 at 21:30, Peter Geoghegan <pg@bowt.ie> wrote:
>
> Attached is v14.

Some reviews (untested; only code review so far) on these versions of
the patches:

> [PATCH v14 1/3] Add eager and lazy freezing strategies to VACUUM.

> + /*
> +  * Threshold cutoff point (expressed in # of physical heap rel blocks in
> +  * rel's main fork) that triggers VACUUM's eager freezing strategy
> +  */

I don't think the mention of 'cutoff point' is necessary when it has
'Threshold'.

> + int      freeze_strategy_threshold; /* threshold to use eager
> [...]
> + BlockNumber freeze_strategy_threshold;

Is there a way to disable the 'eager' freezing strategy? `int` cannot
hold the maximum BlockNumber...

> + lazy_scan_strategy(vacrel);
>   if (verbose)

I'm slightly surprised you didn't update the message for verbose
vacuum to indicate whether we used the eager strategy: there are
several GUCs for tuning this behaviour, so you'd expect to want direct
confirmation that the configuration is effective. (looks at further
patches) I see that the message for verbose vacuum sees significant
changes in patch 2 instead.

---

> [PATCH v14 2/3] Add eager and lazy VM strategies to VACUUM.

General comments:

I don't see anything regarding scan synchronization in the vmsnap scan
system. I understand that VACUUM is a load that is significantly
different from normal SEQSCANs, but are there good reasons to _not_
synchronize the start of VACUUM?

Right now, we don't use syncscan to determine a startpoint. I can't
find the reason why in the available documentation: [0] discusses the
issue, but doesn't clearly explain why it wouldn't be worthwhile from
a 'nothing lost' perspective.

In addition, I noticed that progress reporting of blocks scanned
("heap_blocks_scanned", duh) includes skipped pages.
Now that we have a solid grasp of how many blocks we're planning to
scan, we can update the reported stats to the number of blocks we're
planning to scan (and have scanned), increasing the user value of that
progress view.

[0] https://www.postgresql.org/message-id/flat/19398.1212328662%40sss.pgh.pa.us#17b2feb0fde6a41779290632d8c70ef1

> + double   tableagefrac;

I think this can use some extra info on the field itself: that it is
the fraction of how "old" the relfrozenxid and relminmxid fields are,
as a fraction between 0 (latest values; nextXID and nextMXID) and 1
(values that are old by at least freeze_table_age and
multixact_freeze_table_age (multi)transaction ids, respectively).

> -#define VACOPT_DISABLE_PAGE_SKIPPING 0x80 /* don't skip any pages */
> +#define VACOPT_DISABLE_PAGE_SKIPPING 0x80 /* don't skip using VM */

I'm not super happy with this change. I don't think we should touch
the VM using snapshots _at all_ when disable_page_skipping is set:

> +  * Decide vmsnap scanning strategy.
>    *
> -  * This test also enables more frequent relfrozenxid advancement during
> -  * non-aggressive VACUUMs. If the range has any all-visible pages then
> -  * skipping makes updating relfrozenxid unsafe, which is a real downside.
> +  * First acquire a visibility map snapshot, which determines the number of
> +  * pages that each vmsnap scanning strategy is required to scan for us in
> +  * passing.

I think we should not take disk-backed vm snapshots when
force_scan_all is set. We need VACUUM to be able to run in very
resource-constrained environments, and this does not allow that: it
adds a disk space requirement for the VM snapshot for all but the
smallest relation sizes, which is bad when you realize that we need
VACUUM to clean up things like CLOG.

Additionally, it took me several reads of the code and comments to
understand what the decision-making process for lazy vs eager is, and
why.
The comments are interspersed with the code, with no single place that
describes it from a bird's-eye view. I think something like the
following would be appreciated by other readers of the code:

+ We determine whether we choose the eager or lazy scanning strategy
  based on how many extra pages the eager strategy would take over the
  lazy strategy, and how "old" the table is (as determined in
  tableagefrac):
+ When a table is still "young" (tableagefrac < TABLEAGEFRAC_MIDPOINT),
  the eager strategy is accepted if we need to scan at most 5%
  (MAX_PAGES_YOUNG_TABLEAGE) more of the table.
+ As the table gets "older" (tableagefrac between MIDPOINT and
  HIGHPOINT), the threshold for eager scanning is relaxed linearly from
  this 5% to 70% (MAX_PAGES_OLD_TABLEAGE) of the table being scanned
  extra (over what would be scanned by the lazy strategy).
+ Once tableagefrac passes HIGHPOINT, we stop considering the lazy
  strategy, and always eagerly scan the table.

> @@ -1885,6 +1902,30 @@ retry:
>   tuples_frozen = 0; /* avoid miscounts in instrumentation */
> }
>
> /*
> + * There should never be dead or deleted tuples when PD_ALL_VISIBLE is
> + * set. Check that here in passing.
> + *
> [...]

I'm not sure this patch is the appropriate place for this added check.
I don't disagree with the change, I just think that it's unrelated to
the rest of the patch. Same with some of the changes in
lazy_scan_heap.

> +vm_snap_stage_blocks

Doesn't this waste a lot of cycles on skipping frozen blocks if most
of the relation is frozen? I'd expect something more like byte- or
word-wise processing of skippable blocks, as opposed to this per-block
loop. I don't think it's strictly necessary to patch, but I think it
would be a very useful addition for those with larger tables.

> + XIDFrac = (double) (nextXID - cutoffs->relfrozenxid) /
> +   ((double) freeze_table_age + 0.5);

I don't quite understand what this `+ 0.5` is used for; could you
explain?

> + [...]
> +     Freezing and advancing
> +     <structname>pg_class</structname>.<structfield>relfrozenxid</structfield>
> +     now take place more proactively, in every
> +     <command>VACUUM</command> operation.

This claim that it happens more proactively in "every" VACUUM
operation is false, so I think removing "every" would be better.

---

> [PATCH v14 3/3] Finish removing aggressive mode VACUUM.

I've not completed a review for this patch - I'll continue on that
tomorrow - but here's a first look:

I don't quite enjoy the refactoring+rewriting of the docs section:
it's difficult to determine what changed when so many things changed
line lengths and were moved around. Tomorrow I'll take a closer look,
but separating moved content from changed content would be useful for
review.

> + /*
> +  * Earliest permissible NewRelfrozenXid/NewRelminMxid values that can be
> +  * set in pg_class at the end of VACUUM.
> +  */
> + TransactionId MinXid;
> + MultiXactId MinMulti;

I don't quite like this wording, but I'm not sure what would be
better.

> + cutoffs->MinXid = nextXID - (freeze_table_age * 0.95);
> [...]
> + cutoffs->MinMulti = nextMXID - (multixact_freeze_table_age * 0.95);

Why are these values adjusted down (up?) by 5%? If I configure this
GUC, I'd expect it to be used effectively verbatim, not adjusted by an
arbitrary factor.

---

That's it for now; thanks for working on this.

Kind regards,

Matthias van de Meent