Thread: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.
Hi, With commit c120550edb86, If we got the cleanup lock on the page, lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the following check with has_lpdead_items made the periodic FSM vacuum in the one-pass strategy vacuum no longer being triggered: if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items && blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES) { FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno); next_fsm_block_to_vacuum = blkno; } Before c120550edb86, since we marked dead item IDs to LP_DEAD once even in the one-pass strategy vacuum, we used to call lazy_vacuum_heap_page() to vacuum the page and to call FreeSpaceMapVacuum() periodically, so we had that check. I've attached a patch to fix it. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > With commit c120550edb86, If we got the cleanup lock on the page, > lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the > following check with has_lpdead_items made the periodic FSM vacuum in > the one-pass strategy vacuum no longer being triggered: > > if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items && > blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES) > { > FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, > blkno); > next_fsm_block_to_vacuum = blkno; > } > > Before c120550edb86, since we marked dead item IDs to LP_DEAD once > even in the one-pass strategy vacuum, we used to call > lazy_vacuum_heap_page() to vacuum the page and to call > FreeSpaceMapVacuum() periodically, so we had that check. Whoops, yea, I had a feeling that something wasn't right here when I saw that thread a couple weeks ago about FSM vacuuming taking forever at the end of a vacuum and then I remember looking at the code and thinking why is VACUUM_FSM_EVERY_PAGES named that if it only is used when there are no indexes or a multi-pass vacuum. I even made a comment in [1] that we would only do FSM_EVERY_PAGES when we have multi-pass vacuum -- which is even less after 17. Isn't it ironic that I actually broke it. > I've attached a patch to fix it. Looks like you forgot - Melanie [1] https://www.postgresql.org/message-id/CAAKRu_aXqoj2Vfqu3yzscsTX%3D2nPQ4y-aadCNz6nJP_o12GyxA%40mail.gmail.com
On Mon, Mar 31, 2025 at 3:12 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > With commit c120550edb86, If we got the cleanup lock on the page, > > lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the > > following check with has_lpdead_items made the periodic FSM vacuum in > > the one-pass strategy vacuum no longer being triggered: > > > > if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items && > > blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES) > > { > > FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, > > blkno); > > next_fsm_block_to_vacuum = blkno; > > } > > > > Before c120550edb86, since we marked dead item IDs to LP_DEAD once > > even in the one-pass strategy vacuum, we used to call > > lazy_vacuum_heap_page() to vacuum the page and to call > > FreeSpaceMapVacuum() periodically, so we had that check. > > Whoops, yea, I had a feeling that something wasn't right here when I > saw that thread a couple weeks ago about FSM vacuuming taking forever > at the end of a vacuum and then I remember looking at the code and > thinking why is VACUUM_FSM_EVERY_PAGES named that if it only is used > when there are no indexes or a multi-pass vacuum. > > I even made a comment in [1] that we would only do FSM_EVERY_PAGES > when we have multi-pass vacuum -- which is even less after 17. > > Isn't it ironic that I actually broke it. > > > I've attached a patch to fix it. > > Looks like you forgot Oops, attached now. Looking the places related to VACUUM_FSM_EVERY_PAGES further, the comment atop of vacuumlazy.c seems incorrect: * In between phases, vacuum updates the freespace map (every * VACUUM_FSM_EVERY_PAGES). IIUC the context is two-pass strategy vacuum (i.e., the table has indexes) and VACUUM_FSM_EVERY_PAGES is ignored in this case. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
On Mon, Mar 31, 2025 at 3:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Mar 31, 2025 at 3:12 PM Melanie Plageman > <melanieplageman@gmail.com> wrote: > > > > On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > With commit c120550edb86, If we got the cleanup lock on the page, > > > lazy_scan_prune() marks dead item IDs directly to LP_UNUSED. So the > > > following check with has_lpdead_items made the periodic FSM vacuum in > > > the one-pass strategy vacuum no longer being triggered: > > > > > > if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items && > > > blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES) > > > { > > > FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, > > > blkno); > > > next_fsm_block_to_vacuum = blkno; > > > } > > > > > > Before c120550edb86, since we marked dead item IDs to LP_DEAD once > > > even in the one-pass strategy vacuum, we used to call > > > lazy_vacuum_heap_page() to vacuum the page and to call > > > FreeSpaceMapVacuum() periodically, so we had that check. > > > > Whoops, yea, I had a feeling that something wasn't right here when I > > saw that thread a couple weeks ago about FSM vacuuming taking forever > > at the end of a vacuum and then I remember looking at the code and > > thinking why is VACUUM_FSM_EVERY_PAGES named that if it only is used > > when there are no indexes or a multi-pass vacuum. > > > > I even made a comment in [1] that we would only do FSM_EVERY_PAGES > > when we have multi-pass vacuum -- which is even less after 17. > > > > Isn't it ironic that I actually broke it. > > > > > I've attached a patch to fix it. > > > > Looks like you forgot > > Oops, attached now. > > Looking the places related to VACUUM_FSM_EVERY_PAGES further, the > comment atop of vacuumlazy.c seems incorrect: > > * In between phases, vacuum updates the freespace map (every > * VACUUM_FSM_EVERY_PAGES). > > IIUC the context is two-pass strategy vacuum (i.e., the table has > indexes) and VACUUM_FSM_EVERY_PAGES is ignored in this case. I've attached the patch with the commit message. I didn't include the changes to the comment above in this patch so this is a pure bug fixing patch. I'm going to push this fix up to HEAD and v17 early next week, unless there is no objection. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Apr 4, 2025 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I'm going to push this fix up to HEAD and v17 early next week, unless > there is no objection. I started studying this again looking back at the thread associated with commit c120550edb86. There was actually a long discussion about how this commit interacted with how often the freespace map was vacuumed. [1] is one of the emails addressing that issue. If you grep for FreeSpaceMapVacuumRange() on the thread, you can see the replies to this topic, as they are interspersed with replies about updating the FSM (as opposed to vacuuming it). What I'm wondering is won't the patch you propose: simply removing has_lpdead_items from if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items && blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES) lead to vacuuming the FSM when there is nothing to vacuum and thus to wasting IO (when we didn't set anything to LP_UNUSED). It seems like the logic we would want is to replace has_lpdead_items with something about having set items lpunused. Rereading that thread, it seems we discussed what the right logic for this was extensively, but I don't quite understand how we ended up with if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items && - Melanie [1] https://www.postgresql.org/message-id/CA%2BTgmoYV5LFUgXS_g4S9P1AhQ63thPg6UJj8EsmsmEahFxLY_A%40mail.gmail.com
On Mon, Apr 7, 2025 at 8:30 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Apr 4, 2025 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I'm going to push this fix up to HEAD and v17 early next week, unless > > there is no objection. > > I started studying this again looking back at the thread associated > with commit c120550edb86. There was actually a long discussion about > how this commit interacted with how often the freespace map was > vacuumed. [1] is one of the emails addressing that issue. If you grep > for FreeSpaceMapVacuumRange() on the thread, you can see the replies > to this topic, as they are interspersed with replies about updating > the FSM (as opposed to vacuuming it). Thank you for referring to the thread. I'll read through the discussion. > What I'm wondering is won't the patch you propose: simply removing > has_lpdead_items from > > if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items && > blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES) > > lead to vacuuming the FSM when there is nothing to vacuum and thus to > wasting IO (when we didn't set anything to LP_UNUSED). It seems like > the logic we would want is to replace has_lpdead_items with something > about having set items lpunused. You're right. It would be better to replace it with something as you suggested instead of just removing it. On the other hand, IIUC even if there is no garbage on the table, we do FSM vacuum anyway at the end of lazy_scan_heap(). Doing FSM vacuum periodically regardless of presence of garbage might help load the disk I/O for FSM vacuum. But we might think it is rather a bug that we do FSM vacuum even on an all-frozen table. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com