Re: Fix some inconsistencies with open-coded visibilitymap_set() callers - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Fix some inconsistencies with open-coded visibilitymap_set() callers
Date
Msg-id CA+TgmoYVx-JvCFsGLRaNomXHv14i3A9KXgKGeMWYyunHNS=66g@mail.gmail.com
Whole thread Raw
List pgsql-hackers
On Thu, Jun 26, 2025 at 3:56 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> Here is a rebased version of this (it had some conflicts with recent commits).

One general complaint is that the commit message complains about the
status quo but isn't real clear about what the patch is actually
fixing.

I'm pretty concerned about this change:

         /*
          * If the page is all visible, need to clear that, unless we're only
          * going to add further frozen rows to it.
          *
          * If we're only adding already frozen rows to a previously empty
          * page, mark it as all-visible.
          */
         if (PageIsAllVisible(page) && !(options & HEAP_INSERT_FROZEN))
         {
             all_visible_cleared = true;
             PageClearAllVisible(page);
             visibilitymap_clear(relation,
                                 BufferGetBlockNumber(buffer),
                                 vmbuffer, VISIBILITYMAP_VALID_BITS);
         }
-        else if (all_frozen_set)
-            PageSetAllVisible(page);

One problem is that this falsifies the comment -- the second sentence
in the comment refers to code that the patch deletes. But also, the
all_frozen_set flag is used a bit further down when setting
xlrec->flags, and now we'll set XLH_INSERT_ALL_FROZEN_SET when in fact
we didn't set the all-frozen bit. It seems like the intention here was
that the existing XLOG_HEAP2_MULTI_INSERT record should cover setting
the all-frozen bit, if we do that, and it seems like you're changing
that somehow, but it's not clear to me why we'd want to change that,
and even if we do, and it doesn't appear to me that you've chased down
all the loose ends i.e. if that's the plan, then I'd expect the redo
routine to be modified and XLH_INSERT_ALL_FROZEN_SET to be deleted.

I think this might actually be an underlying design problem with the
patch -- heap_page_set_vm_and_log() seems to want to be in charge of
both what we do with the heap page and what we do with the
corresponding VM bit, but that runs contrary to the caller's need to
bundle the VM bit changes with some WAL record that is doing other
things to the heap page.

heap_page_set_vm_and_log() says that it shouldn't be called during
recovery but doesn't assert that it isn't.

-        /*
-         * It should never be the case that the visibility map page is set
-         * while the page-level bit is clear, but the reverse is allowed (if
-         * checksums are not enabled).  Regardless, set both bits so that we
-         * get back in sync.
-         *
-         * NB: If the heap page is all-visible but the VM bit is not set, we
-         * don't need to dirty the heap page.  However, if checksums are
-         * enabled, we do need to make sure that the heap page is dirtied
-         * before passing it to visibilitymap_set(), because it may be logged.
-         * Given that this situation should only happen in rare cases after a
-         * crash, it is not worth optimizing.
-         */
-        PageSetAllVisible(page);
-        MarkBufferDirty(buf);
-        old_vmbits = visibilitymap_set(vacrel->rel, blkno, buf,
-                                       InvalidXLogRecPtr,
-                                       vmbuffer, presult.vm_conflict_horizon,
-                                       flags);
+        old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buf,
+                                              vmbuffer,
presult.vm_conflict_horizon,
+                                              flags);

Why are we deleting a huge comment here explaining the intent of the
code when, AFAICS, the intent of the new code is the same? Even if the
intent of the new code isn't the same, why no comment?

-                /*
-                 * If data checksums are enabled (or wal_log_hints=on), we
-                 * need to protect the heap page from being torn.
-                 *
-                 * If not, then we must *not* update the heap page's LSN. In
-                 * this case, the FPI for the heap page was omitted from the
-                 * WAL record inserted above, so it would be incorrect to
-                 * update the heap page's LSN.
-                 */
-                if (XLogHintBitIsNeeded())
-                {
-                    Page        heapPage = BufferGetPage(heapBuf);
-
-                    PageSetLSN(heapPage, recptr);
-                }

This is another lengthy explanatory comment that goes away with this
patch, but this time the code goes, too. But why? Presumably, either
(1) this change is wrong, or (2) the preexisting code is wrong, or (3)
the patch is trying to change the contract between visibilitymap_set()
and its callers. If it's (3), then I think there should be more
explanation of that contract change. It is unclear to me why you've
rearranged the header comment for visibilitymap_set() in the way that
you have: some of the material that used to be at the top is now at
the bottom, but it's if anything a little less extensive than before,
and certainly not enough for me to understand why or whether this
change is correct.

+    /*
+     * We must never end up with the VM bit set and the page-level
+     * PD_ALL_VISIBLE bit clear. If that were to occur, a subsequent page
+     * modification would fail to clear the VM bit. Though it is possible for
+     * the page-level bit to be set and the VM bit to be clear if checksums
+     * and wal_log_hints are not enabled.
+     */

The last sentence is not great English and maybe not that relevant,
either. I would suggest deleting the whole thing or deleting the word
"Though" and putting the rest of the sentence in parentheses.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: [PoC] Federated Authn/z with OAUTHBEARER
Next
From: Jeff Davis
Date:
Subject: Re: Collation & ctype method table, and extension hooks