Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date	September 9 22:26:08
Msg-id	CA+TgmoZef8XqRujP1NN=wJdV4SxOtu7rxRozsyAtaEvuVMZhEw@mail.gmail.com
In response to	Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) (Melanie Plageman <melanieplageman@gmail.com>)
Responses	Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
List	pgsql-hackers

On Tue, Sep 9, 2025 at 12:24 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> For heap_xlog_visible() the LSN interlock comment is easier to parse
> because of an earlier comment before reading the heap page:
>
>     /*
>      * Read the heap page, if it still exists. If the heap file has dropped or
>      * truncated later in recovery, we don't need to update the page, but we'd
>      * better still update the visibility map.
>      */
>
> I've gone with the direct copy-paste of the LSN interlock paragraph in
> attached v12. I think referring to the other comment is too confusing
> in context here. However, I also added a line about what could cause
> the LSN interlock -- but above it, so as to retain grep-ability of the
> other comment.

I think that reads a little strangely. I would consolidate: Note that
the heap relation may have been dropped or truncated, leading us to
skip updating the heap block due to the LSN interlock. However, even
in that case, it's still safe to update the visibility map, etc. The
rest of the comment is perhaps a tad more explicit than our usual
practice, but that might be a good thing, because sometimes we're a
little too terse about these critical details.

I just realized that I don't like this:

+ /*
+ * If we're only adding already frozen rows to a previously empty
+ * page, mark it as all-frozen and update the visibility map. We're
+ * already holding a pin on the vmbuffer.
+ */

The thing is, we rarely position a block comment just before an "else
if". There are probably instances, but it's not typical. That's why
the existing comment contains two "if blah then blah" statements of
which you deleted the second -- because it needed to cover both the
"if" and the "else if". An alternative style is to move the comment
down a nesting level and rephrase without the conditional, i.e. "We're
only adding frozen rows to a previously empty page, so mark it as
all-frozen etc." But I don't know that I like doing that for one
branch of the "if" and not the other.
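To make the two comment placements concrete, here is a minimal sketch. It is not the patch's code: SketchPage, insert_style_a, insert_style_b, and check_styles_agree are hypothetical names, and the page state is reduced to two booleans so the example compiles on its own. Style A keeps one block comment ahead of the whole if/else-if with one sentence per branch; style B moves each comment down a nesting level and drops the conditional wording.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the page state; not a PostgreSQL type. */
typedef struct
{
    bool all_visible;   /* page-level all-visible flag */
    bool vm_cleared;    /* set when the sketch "clears" the VM bit */
} SketchPage;

/* Style A: a single comment before the "if" covers both branches. */
static void
insert_style_a(SketchPage *page, bool insert_frozen, bool all_frozen_set)
{
    /*
     * If the page is all-visible, clear that, unless we're only adding
     * frozen rows. If we're only adding already frozen rows to a
     * previously empty page, mark it as all-frozen.
     */
    if (page->all_visible && !insert_frozen)
    {
        page->all_visible = false;
        page->vm_cleared = true;
    }
    else if (all_frozen_set)
        page->all_visible = true;
}

/* Style B: each branch carries its own comment, phrased as a statement. */
static void
insert_style_b(SketchPage *page, bool insert_frozen, bool all_frozen_set)
{
    if (page->all_visible && !insert_frozen)
    {
        /* We're adding unfrozen rows to an all-visible page, so clear it. */
        page->all_visible = false;
        page->vm_cleared = true;
    }
    else if (all_frozen_set)
    {
        /*
         * We're only adding frozen rows to a previously empty page, so
         * mark it as all-frozen.
         */
        page->all_visible = true;
    }
}

/* The two styles differ only in comment placement, never in behavior. */
static void
check_styles_agree(void)
{
    SketchPage a = {false, false};
    SketchPage b = {false, false};

    insert_style_a(&a, true, true);
    insert_style_b(&b, true, true);
    assert(a.all_visible == b.all_visible);
    assert(a.vm_cleared == b.vm_cleared);
}
```

Style B only reads consistently when both branches are commented that way, which is the hesitation expressed above.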

The rest of what's now 0001 looks OK to me now, although you might
want to wait for a review from somebody more knowledgeable about this
area.

Some very quick comments on the next few patches -- far from a full review:

0002. Looks boring, probably unobjectionable provided the payoff patch is OK.

0003. What you've done here with xl_heap_prune.flags is kind of
horrifying. The problem is that, while you've added code explaining
that VISIBILITYMAP_ALL_{VISIBLE,FROZEN} are honorary XLHP flags,
nobody who isn't looking directly at that comment is going to
understand the muddling of the two namespaces. I would suggest not
doing this, even if it means defining redundant constants and writing
technically-unnecessary code to translate between them.
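As a rough illustration of what "redundant constants plus translation code" could look like, here is a self-contained sketch. The XLHP_VM_* names and their bit positions are invented for this example and are not taken from the patch or from PostgreSQL; the VISIBILITYMAP_* values are redefined locally so the block compiles on its own. The helper names are likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Visibility map bits, redefined locally for this sketch. */
#define VISIBILITYMAP_ALL_VISIBLE   0x01
#define VISIBILITYMAP_ALL_FROZEN    0x02

/*
 * Hypothetical WAL-record flags living in their own namespace. Names and
 * bit positions are assumptions made up for illustration.
 */
#define XLHP_VM_ALL_VISIBLE         0x40
#define XLHP_VM_ALL_FROZEN          0x80

/* Translate VM bits into record flags when building the WAL record. */
static uint8_t
vmbits_to_xlhp_flags(uint8_t vmbits)
{
    uint8_t     flags = 0;

    /* Reject anything that is not a known VM bit. */
    assert((vmbits & ~(VISIBILITYMAP_ALL_VISIBLE |
                       VISIBILITYMAP_ALL_FROZEN)) == 0);

    if (vmbits & VISIBILITYMAP_ALL_VISIBLE)
        flags |= XLHP_VM_ALL_VISIBLE;
    if (vmbits & VISIBILITYMAP_ALL_FROZEN)
        flags |= XLHP_VM_ALL_FROZEN;
    return flags;
}

/* Translate record flags back into VM bits during replay. */
static uint8_t
xlhp_flags_to_vmbits(uint8_t flags)
{
    uint8_t     vmbits = 0;

    if (flags & XLHP_VM_ALL_VISIBLE)
        vmbits |= VISIBILITYMAP_ALL_VISIBLE;
    if (flags & XLHP_VM_ALL_FROZEN)
        vmbits |= VISIBILITYMAP_ALL_FROZEN;
    return vmbits;
}
```

The translation is technically unnecessary work, but a reader of either header sees only names from that header's own namespace.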

0004. It is not clear to me why you need to get
log_heap_prune_and_freeze to do the work here. Why can't
log_newpage_buffer get the job done already?

0005. It looks a little curious that you delete the
identify-corruption logic from the end of the if-nest and add it to
the beginning. Ceteris paribus, you'd expect that to be worse, since
corruption is a rare case.

0006. "to me marked" -> "to be marked".

+                * If the heap page is all-visible but the VM bit is not set, we don't
+                * need to dirty the heap page.  However, if checksums are enabled, we
+                * do need to make sure that the heap page is dirtied before passing
+                * it to visibilitymap_set(), because it may be logged.
                 */
-               PageSetAllVisible(page);
-               MarkBufferDirty(buf);
+               if (!PageIsAllVisible(page) || XLogHintBitIsNeeded())
+               {
+                       PageSetAllVisible(page);
+                       MarkBufferDirty(buf);
+               }

I really hate this. Maybe you're going to argue that it's not the job
of this patch to fix the awfulness here, but surely marking a buffer
dirty in case some other function decides to WAL-log it is a
ridiculous plan.
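For reference, the condition in that hunk can be modeled as a pure function. This is only a sketch: heap_page_gets_dirtied and check_dirty_cases are hypothetical names, and the second parameter stands in for the result of XLogHintBitIsNeeded(), which is true when data checksums or wal_log_hints are enabled.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Models "!PageIsAllVisible(page) || XLogHintBitIsNeeded()" from the hunk.
 * Returns true when the heap page would be marked dirty.
 */
static bool
heap_page_gets_dirtied(bool page_is_all_visible, bool hint_bit_wal_needed)
{
    return !page_is_all_visible || hint_bit_wal_needed;
}

/* Walk through the cases the new comment in the hunk describes. */
static void
check_dirty_cases(void)
{
    /* Page not yet all-visible: always dirtied, as before the patch. */
    assert(heap_page_gets_dirtied(false, false));
    assert(heap_page_gets_dirtied(false, true));

    /* Already all-visible, no hint-bit logging: left clean. */
    assert(!heap_page_gets_dirtied(true, false));

    /*
     * Already all-visible, hint-bit logging on: dirtied even though the
     * page itself does not change, only because a later callee may log it.
     */
    assert(heap_page_gets_dirtied(true, true));
}
```

The last case is the one being objected to: the caller dirties the buffer on behalf of a function that has not yet decided whether it will WAL-log anything.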

--
Robert Haas
EDB: http://www.enterprisedb.com
