Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: New strategies for freezing, advancing relfrozenxid early |
Date | |
Msg-id | CAH2-WzkbuGaW61LzAfj=Ge7YcEBYmgyZ6dwTdVDYXGXPt3c2pQ@mail.gmail.com |
In response to | Re: New strategies for freezing, advancing relfrozenxid early (Andres Freund <andres@anarazel.de>) |
Responses |
Re: New strategies for freezing, advancing relfrozenxid early
Re: New strategies for freezing, advancing relfrozenxid early |
List | pgsql-hackers |
On Thu, Jan 26, 2023 at 9:53 AM Andres Freund <andres@anarazel.de> wrote:
> I assume the case you're thinking of is that pruning did *not* do any changes,
> but in the process of figuring out that nothing needed to be pruned, we did a
> MarkBufferDirtyHint(), and as part of that emitted an FPI?

Yes.

> > That's going to be very significantly more aggressive. For example
> > it'll impact small tables very differently.
>
> Maybe it would be too aggressive, not sure. The cost of a freeze WAL record is
> relatively small, with one important exception below, if we are 99.99% sure
> that it's not going to require an FPI and isn't going to dirty the page.
>
> The exception is that a newer LSN on the page can cause the ringbuffer
> replacement to trigger more aggressive WAL flushing. No meaningful
> difference if we modified the page during pruning, or if the page was already
> in s_b (since it likely won't be written out via the ringbuffer in that case),
> but if checksums are off and we just hint-dirtied the page, it could be a
> significant issue.

Most of the overhead of FREEZE WAL records (with freeze plan
deduplication and page-level freezing in) is generic WAL record header
overhead. Your recent adversarial test case is going to choke on that,
too. At least if you set checkpoint_timeout to 1 minute again.

> Thus a modification of the above logic could be to opportunistically freeze if
> a) it won't cause an FPI and either
> b1) the page was already dirty before pruning, as we'll not do a ringbuffer
>     replacement in that case
> or
> b2) We wrote a WAL record during pruning, as the difference in flush position
>     is marginal
>
> An even more aggressive version would be to replace b1) with logic that'd
> allow newly dirtying the page if it wasn't read through the ringbuffer. But
> newly dirtying the page feels like it'd be more dangerous.

In many cases we'll have to dirty the page anyway, just to set
PD_ALL_VISIBLE. The whole way the logic works is conditioned (whether
triggered by an FPI or triggered by my now-reverted GUC) on being able
to set the whole page all-frozen in the VM.

> A less aggressive version would be to check if any WAL records were emitted
> during heap_page_prune() (instead of FPIs) and whether we'd emit an FPI if we
> modified the page again. Similar to what we do now, except not requiring an
> FPI to have been emitted.

Also way more aggressive. Not nearly enough on its own.

> But to me it seems a bit odd that VACUUM now is more aggressive if checksums /
> wal_log_hint bits is on, than without them. Which I think is how using either
> of pgWalUsage.wal_fpi, pgWalUsage.wal_records ends up working?

Which part is the odd part? Is it odd that page-level freezing works
that way, or is it odd that page-level checksums work that way?

In any case this seems like an odd thing for you to say, having
eviscerated a patch that really just made the same behavior trigger
independently of FPIs in some tables, controlled via a GUC.

--
Peter Geoghegan
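
For readers following the thread, the rule proposed in the quoted text (freeze if a, and either b1 or b2), combined with the precondition from the reply that the whole page can be set all-frozen in the VM, might be sketched as a standalone predicate. This is only an illustration of the conditions as stated in the message, not code from PostgreSQL or from any patch. The struct, its field names, and the function name are all hypothetical, and how each input would actually be computed inside VACUUM is left out.

```c
#include <stdbool.h>

/*
 * Hypothetical per-page inputs. None of these names exist in PostgreSQL;
 * they only label the conditions discussed in the message.
 */
typedef struct PageFreezeInputs
{
    bool would_emit_fpi;      /* (a)  freezing now would require a full-page image */
    bool dirty_before_prune;  /* (b1) page was already dirty before pruning */
    bool prune_wrote_wal;     /* (b2) pruning emitted a WAL record for this page */
    bool can_set_all_frozen;  /* freezing would let the page be marked all-frozen in the VM */
} PageFreezeInputs;

/*
 * Returns true when the sketched rule would freeze opportunistically:
 * the page can become all-frozen, no FPI is needed, and either the page
 * was already dirty (b1) or pruning already wrote WAL (b2).
 */
static bool
should_opportunistically_freeze(const PageFreezeInputs *in)
{
    if (!in->can_set_all_frozen)
        return false;           /* precondition stated in the reply */
    if (in->would_emit_fpi)
        return false;           /* condition (a) fails */
    return in->dirty_before_prune || in->prune_wrote_wal;
}
```

Under this sketch, a page that was only hint-dirtied during pruning (not dirty beforehand, no prune WAL record) would not be frozen, which is the case the quoted text flags as potentially expensive when checksums are off.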