Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) |
Date | |
Msg-id | CA+TgmobYY2URHKBMh1NHo1zF3Z28TiS_+0aSyRYyBfvauCPZzA@mail.gmail.com Whole thread Raw |
In response to | Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) (Melanie Plageman <melanieplageman@gmail.com>) |
List | pgsql-hackers |
On Tue, Sep 9, 2025 at 7:08 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > Fair. I've introduced new XLHP flags in attached v13. Hopefully it > puts an end to the horror. I suggest not renumbering all of the existing flags and just adding these new ones at the end. Less code churn and more likely to break in an obvious way if you mix up the two sets of flags. More on 0002: + set_heap_lsn = XLogHintBitIsNeeded() ? true : set_heap_lsn; Maybe just if (XLogHintBitIsNeeded) set_heap_lsn = true? I don't feel super-strongly that what you've done is bad but it looks weird to my eyes. + * If we released any space or line pointers or will be setting a page in + * the visibility map, measure the page's freespace to later update the "setting a page in the visibility map" seems a little muddled to me. You set bits, not pages. + * all-visible (or all-frozen, depending on the vacuum mode,) which is This comma placement is questionable. /* + * Note that the heap relation may have been dropped or truncated, leading + * us to skip updating the heap block due to the LSN interlock. However, + * even in that case, it's still safe to update the visibility map. Any + * WAL record that clears the visibility map bit does so before checking + * the page LSN, so any bits that need to be cleared will still be + * cleared. + * + * Note that the lock on the heap page was dropped above. In normal + * operation this would never be safe because a concurrent query could + * modify the heap page and clear PD_ALL_VISIBLE -- violating the + * invariant that PD_ALL_VISIBLE must be set if the corresponding bit in + * the VM is set. + * + * In recovery, we expect no other writers, so writing to the VM page + * without holding a lock on the heap page is considered safe enough. It + * is done this way when replaying xl_heap_visible records (see */ How many copies of this comment do you plan to end up with? The comment for log_heap_prune_and_freeze seems to be anticipating future work. > > 0004. It is not clear to me why you need to get > > log_heap_prune_and_freeze to do the work here. Why can't > > log_newpage_buffer get the job done already? > > Well, I need something to emit the changes to the VM. I'm eliminating > all users of xl_heap_visible. Empty pages are the ones that benefit > the least from switching from xl_heap_visible -> xl_heap_prune. But, > if I don't transition them, we have to maintain all the > xl_heap_visible code (including visibilitymap_set() in its long form). > > As for log_newpage_buffer(), I could keep it if you think it is too > confusing to change log_heap_prune_and_freeze()'s API (by passing > force_heap_fpi) to handle this case, I can leave log_newpage_buffer() > there and then call log_heap_prune_and_freeze(). > > I just thought it seemed simple to avoid emitting the new page record > and the VM update record, so why not -- but I don't have strong > feelings. Yeah, I'm not sure what the right thing to do here is. I think I was again experiencing brain fade by forgetting that there is a heap page and a VM page and, of course, log_heap_newpage() probably isn't going to touch the latter. So that makes sense. On the other hand, we could only have one type of WAL record for every single operation in the system if we gave it enough flags, and force_heap_fpi seems suspiciously like a flag that turns this into a whole different kind of WAL record. > > 0005. It looks a little curious that you delete the > > identify-corruption logic from the end of the if-nest and add it to > > the beginning. Ceteris paribus, you'd expect that to be worse, since > > corruption is a rare case. > > On master, the two corruption cases are sandwiched between the normal > VM set cases. And I actually think doing it this way is brittle. If > you put the cases which set the VM first, you have to have completely > bulletproof the if statements guarding them to foreclose any possible > corruption case from entering because otherwise you will overwrite the > corruption you then try to detect. Hmm. In the current code, we first test (!all_visible_according_to_vm && presult.all_visible), then (all_visible_according_to_vm && !PageIsAllVisible(page) && visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0), and then (presult.lpdead_items > 0 && PageIsAllVisible(page)). The first and second can never coexist, because they require opposite values of all_visible_according_to_vm. The second and third cannot coexist because they require opposite values of PageIsAllVisible(page). It is not entirely obvious that the first and third tests couldn't both pass, but you'd have to have presult.all_visible and presult.lpdead_items > 0, and it's a bit hard to see how heap_page_prune_and_freeze() could ever allow that. Consider: if (prstate.all_visible && prstate.lpdead_items == 0) { presult->all_visible = prstate.all_visible; presult->all_frozen = prstate.all_frozen; } else { presult->all_visible = false; presult->all_frozen = false; } ... presult->lpdead_items = prstate.lpdead_items; So I don't really think I'm persuaded that the current way is brittle. But that having been said, I agree with you that the order of the checks is kind of random, and I don't think it really matters that much for performance. What does matter is clarity. I feel like what I'd ideally like this logic to do is say: do we want the VM bit for the page to be set to all-frozen, just all-visible, or neither? Then push the VM bit to the correct state, dragging the page-level bit along behind. And the current logic sort of does that. It's roughly: 1. Should we go from not-all-visible to either all-visible or all-frozen? If yes, do so. 2. Should we go from either all-visible or all-frozen to not-all-visible? If yes, do so. 3. Should we go from either all-visible or all-frozen to not-all-visible for a different reason? If yes, do so. 4. Should we go from all-visible to all-frozen? If yes, do so. But what's weird is that all the tests are written differently, and we have two different reasons for going to not-all-visible, namely PD_ALL_VISIBLE-not-set and dead-items-on-page, whereas there's only one test for each of the other state-transitions, because the decision-making for those cases is fully completed at an earlier stage. I would kind of like to see this expressed in a way that first decides which state transition to make (forward-to-all-frozen, forward-to-all-visible, backward-to-all-visible, backward-to-not-all-visible, nothing) and then does the corresponding work. What you're doing instead is splitting half of those functions off into a helper function while keeping the other half where they are without cleaning up any of the logic. Now, maybe that's OK: I'm far from having grokked the whole patch set. But it is not any more clear than what we have now, IMHO, and perhaps even a bit less so. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: