Thread: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.

Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.

From
Masahiko Sawada
Date:
Hi,

With commit c120550edb86, If we got the cleanup lock on the page,
lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the
following check with has_lpdead_items made the periodic FSM vacuum in
the one-pass strategy vacuum no longer being triggered:

            if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
                blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
            {
                FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
                                        blkno);
                next_fsm_block_to_vacuum = blkno;
            }

Before c120550edb86, since we marked dead item IDs to LP_DEAD once
even in the one-pass strategy vacuum, we used to call
lazy_vacuum_heap_page() to vacuum the page and to call
FreeSpaceMapVacuum() periodically, so we had that check.

I've attached a patch to fix it.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.

From
Melanie Plageman
Date:
On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> With commit c120550edb86, If we got the cleanup lock on the page,
> lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the
> following check with has_lpdead_items made the periodic FSM vacuum in
> the one-pass strategy vacuum no longer being triggered:
>
>             if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
>                 blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
>             {
>                 FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
>                                         blkno);
>                 next_fsm_block_to_vacuum = blkno;
>             }
>
> Before c120550edb86, since we marked dead item IDs to LP_DEAD once
> even in the one-pass strategy vacuum, we used to call
> lazy_vacuum_heap_page() to vacuum the page and to call
> FreeSpaceMapVacuum() periodically, so we had that check.

Whoops, yea, I had a feeling that something wasn't right here when I
saw that thread a couple weeks ago about FSM vacuuming taking forever
at the end of a vacuum and then I remember looking at the code and
thinking why is VACUUM_FSM_EVERY_PAGES named that if it only is used
when there are no indexes or a multi-pass vacuum.

I even made a comment in [1] that we would only do FSM_EVERY_PAGES
when we have multi-pass vacuum -- which is even less after 17.

Isn't it ironic that I actually broke it.

> I've attached a patch to fix it.

Looks like you forgot

- Melanie

[1] https://www.postgresql.org/message-id/CAAKRu_aXqoj2Vfqu3yzscsTX%3D2nPQ4y-aadCNz6nJP_o12GyxA%40mail.gmail.com



Re: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.

From
Masahiko Sawada
Date:
On Mon, Mar 31, 2025 at 3:12 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > With commit c120550edb86, If we got the cleanup lock on the page,
> > lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the
> > following check with has_lpdead_items made the periodic FSM vacuum in
> > the one-pass strategy vacuum no longer being triggered:
> >
> >             if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
> >                 blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
> >             {
> >                 FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
> >                                         blkno);
> >                 next_fsm_block_to_vacuum = blkno;
> >             }
> >
> > Before c120550edb86, since we marked dead item IDs to LP_DEAD once
> > even in the one-pass strategy vacuum, we used to call
> > lazy_vacuum_heap_page() to vacuum the page and to call
> > FreeSpaceMapVacuum() periodically, so we had that check.
>
> Whoops, yea, I had a feeling that something wasn't right here when I
> saw that thread a couple weeks ago about FSM vacuuming taking forever
> at the end of a vacuum and then I remember looking at the code and
> thinking why is VACUUM_FSM_EVERY_PAGES named that if it only is used
> when there are no indexes or a multi-pass vacuum.
>
> I even made a comment in [1] that we would only do FSM_EVERY_PAGES
> when we have multi-pass vacuum -- which is even less after 17.
>
> Isn't it ironic that I actually broke it.
>
> > I've attached a patch to fix it.
>
> Looks like you forgot

Oops, attached now.

Looking the places related to VACUUM_FSM_EVERY_PAGES further, the
comment atop of vacuumlazy.c seems incorrect:

* In between phases, vacuum updates the freespace map (every
* VACUUM_FSM_EVERY_PAGES).

IIUC the context is two-pass strategy vacuum (i.e., the table has
indexes) and VACUUM_FSM_EVERY_PAGES is ignored in this case.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.

From
Masahiko Sawada
Date:
On Mon, Mar 31, 2025 at 3:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Mar 31, 2025 at 3:12 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > With commit c120550edb86, If we got the cleanup lock on the page,
> > > lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the
> > > following check with has_lpdead_items made the periodic FSM vacuum in
> > > the one-pass strategy vacuum no longer being triggered:
> > >
> > >             if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
> > >                 blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
> > >             {
> > >                 FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
> > >                                         blkno);
> > >                 next_fsm_block_to_vacuum = blkno;
> > >             }
> > >
> > > Before c120550edb86, since we marked dead item IDs to LP_DEAD once
> > > even in the one-pass strategy vacuum, we used to call
> > > lazy_vacuum_heap_page() to vacuum the page and to call
> > > FreeSpaceMapVacuum() periodically, so we had that check.
> >
> > Whoops, yea, I had a feeling that something wasn't right here when I
> > saw that thread a couple weeks ago about FSM vacuuming taking forever
> > at the end of a vacuum and then I remember looking at the code and
> > thinking why is VACUUM_FSM_EVERY_PAGES named that if it only is used
> > when there are no indexes or a multi-pass vacuum.
> >
> > I even made a comment in [1] that we would only do FSM_EVERY_PAGES
> > when we have multi-pass vacuum -- which is even less after 17.
> >
> > Isn't it ironic that I actually broke it.
> >
> > > I've attached a patch to fix it.
> >
> > Looks like you forgot
>
> Oops, attached now.
>
> Looking the places related to VACUUM_FSM_EVERY_PAGES further, the
> comment atop of vacuumlazy.c seems incorrect:
>
> * In between phases, vacuum updates the freespace map (every
> * VACUUM_FSM_EVERY_PAGES).
>
> IIUC the context is two-pass strategy vacuum (i.e., the table has
> indexes) and VACUUM_FSM_EVERY_PAGES is ignored in this case.

I've attached the patch with the commit message. I didn't include the
changes to the comment above in this patch so this is a pure bug
fixing patch.

I'm going to push this fix up to HEAD and v17 early next week, unless
there is no objection.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.

From
Melanie Plageman
Date:
On Fri, Apr 4, 2025 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I'm going to push this fix up to HEAD and v17 early next week, unless
> there is no objection.

I started studying this again looking back at the thread associated
with commit c120550edb86. There was actually a long discussion about
how this commit interacted with how often the freespace map was
vacuumed. [1] is one of the emails addressing that issue. If you grep
for FreeSpaceMapVacuumRange() on the thread, you can see the replies
to this topic, as they are interspersed with replies about updating
the FSM (as opposed to vacuuming it).

What I'm wondering is won't the patch you propose: simply removing
has_lpdead_items from

            if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
                blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)

lead to vacuuming the FSM when there is nothing to vacuum and thus to
wasting IO (when we didn't set anything to LP_UNUSED). It seems like
the logic we would want is to replace has_lpdead_items with something
about having set items lpunused.

Rereading that thread, it seems we discussed what the right logic for
this was extensively, but I don't quite understand how we ended up
with

            if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&

- Melanie

[1] https://www.postgresql.org/message-id/CA%2BTgmoYV5LFUgXS_g4S9P1AhQ63thPg6UJj8EsmsmEahFxLY_A%40mail.gmail.com



Re: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.

From
Masahiko Sawada
Date:
On Mon, Apr 7, 2025 at 8:30 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Fri, Apr 4, 2025 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I'm going to push this fix up to HEAD and v17 early next week, unless
> > there is no objection.
>
> I started studying this again looking back at the thread associated
> with commit c120550edb86. There was actually a long discussion about
> how this commit interacted with how often the freespace map was
> vacuumed. [1] is one of the emails addressing that issue. If you grep
> for FreeSpaceMapVacuumRange() on the thread, you can see the replies
> to this topic, as they are interspersed with replies about updating
> the FSM (as opposed to vacuuming it).

Thank you for referring to the thread. I'll read through the discussion.

> What I'm wondering is won't the patch you propose: simply removing
> has_lpdead_items from
>
>             if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
>                 blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
>
> lead to vacuuming the FSM when there is nothing to vacuum and thus to
> wasting IO (when we didn't set anything to LP_UNUSED). It seems like
> the logic we would want is to replace has_lpdead_items with something
> about having set items lpunused.

You're right. It would be better to replace it with something as you
suggested instead of just removing it. On the other hand, IIUC even if
there is no garbage on the table, we do FSM vacuum anyway at the end
of lazy_scan_heap(). Doing FSM vacuum periodically regardless of
presence of garbage might help load the disk I/O for FSM vacuum. But
we might think it is rather a bug that we do FSM vacuum even on an
all-frozen table.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com