Thread: Periodic FSM vacuum doesn't happen in one-pass strategy vacuum.
Hi,

With commit c120550edb86, if we get the cleanup lock on the page,
lazy_scan_prune() marks dead item IDs directly as LP_UNUSED. So the
following check on has_lpdead_items means the periodic FSM vacuum in
the one-pass strategy vacuum is no longer triggered:

    if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
        blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
    {
        FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
                                blkno);
        next_fsm_block_to_vacuum = blkno;
    }

Before c120550edb86, since we first marked dead item IDs as LP_DEAD
even in the one-pass strategy vacuum, we used to call
lazy_vacuum_heap_page() to vacuum the page and to call
FreeSpaceMapVacuum() periodically, which is why we had that check.

I've attached a patch to fix it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
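(The patch itself did not make it into the archive; per the discussion
downthread, the proposed fix simply drops has_lpdead_items from the
check, roughly like this -- a sketch, not the actual patch text:)

    if (got_cleanup_lock && vacrel->nindexes == 0 &&
        blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
    {
        /*
         * Vacuum the FSM for the range of blocks scanned since the last
         * periodic FSM vacuum, then remember where we left off.
         */
        FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
                                blkno);
        next_fsm_block_to_vacuum = blkno;
    }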
On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> With commit c120550edb86, if we get the cleanup lock on the page,
> lazy_scan_prune() marks dead item IDs directly as LP_UNUSED. So the
> following check on has_lpdead_items means the periodic FSM vacuum in
> the one-pass strategy vacuum is no longer triggered:
>
>     if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
>         blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
>     {
>         FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
>                                 blkno);
>         next_fsm_block_to_vacuum = blkno;
>     }
>
> Before c120550edb86, since we first marked dead item IDs as LP_DEAD
> even in the one-pass strategy vacuum, we used to call
> lazy_vacuum_heap_page() to vacuum the page and to call
> FreeSpaceMapVacuum() periodically, which is why we had that check.

Whoops, yeah. I had a feeling that something wasn't right here when I
saw that thread a couple of weeks ago about FSM vacuuming taking
forever at the end of a vacuum. I remember looking at the code then
and wondering why VACUUM_FSM_EVERY_PAGES is named that if it is only
used when there are no indexes or in a multi-pass vacuum.

I even made a comment in [1] that we would only apply
VACUUM_FSM_EVERY_PAGES when we have a multi-pass vacuum -- which is
even rarer after 17.

Isn't it ironic that I actually broke it.

> I've attached a patch to fix it.

Looks like you forgot the attachment.

- Melanie

[1] https://www.postgresql.org/message-id/CAAKRu_aXqoj2Vfqu3yzscsTX%3D2nPQ4y-aadCNz6nJP_o12GyxA%40mail.gmail.com
On Mon, Mar 31, 2025 at 3:12 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > With commit c120550edb86, if we get the cleanup lock on the page,
> > lazy_scan_prune() marks dead item IDs directly as LP_UNUSED. So the
> > following check on has_lpdead_items means the periodic FSM vacuum in
> > the one-pass strategy vacuum is no longer triggered:
> >
> >     if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
> >         blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
> >     {
> >         FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
> >                                 blkno);
> >         next_fsm_block_to_vacuum = blkno;
> >     }
> >
> > Before c120550edb86, since we first marked dead item IDs as LP_DEAD
> > even in the one-pass strategy vacuum, we used to call
> > lazy_vacuum_heap_page() to vacuum the page and to call
> > FreeSpaceMapVacuum() periodically, which is why we had that check.
>
> Whoops, yeah. I had a feeling that something wasn't right here when I
> saw that thread a couple of weeks ago about FSM vacuuming taking
> forever at the end of a vacuum. I remember looking at the code then
> and wondering why VACUUM_FSM_EVERY_PAGES is named that if it is only
> used when there are no indexes or in a multi-pass vacuum.
>
> I even made a comment in [1] that we would only apply
> VACUUM_FSM_EVERY_PAGES when we have a multi-pass vacuum -- which is
> even rarer after 17.
>
> Isn't it ironic that I actually broke it.
>
> > I've attached a patch to fix it.
>
> Looks like you forgot the attachment.

Oops, attached now.

Looking further at the places related to VACUUM_FSM_EVERY_PAGES, the
comment atop vacuumlazy.c seems incorrect:

 * In between phases, vacuum updates the freespace map (every
 * VACUUM_FSM_EVERY_PAGES).

IIUC the context there is the two-pass strategy vacuum (i.e., the
table has indexes), and VACUUM_FSM_EVERY_PAGES is ignored in that
case.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachment
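(A possible correction to that header comment -- hypothetical wording
based on the behavior described in this thread, not anything
committed:)

 * For tables with no indexes, vacuum also vacuums the freespace map
 * periodically during the heap scan (every VACUUM_FSM_EVERY_PAGES
 * pages); for tables with indexes, the FSM is instead vacuumed after
 * each round of index and heap vacuuming.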
On Mon, Mar 31, 2025 at 3:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Mar 31, 2025 at 3:12 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Mon, Mar 31, 2025 at 6:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > With commit c120550edb86, if we get the cleanup lock on the page,
> > > lazy_scan_prune() marks dead item IDs directly as LP_UNUSED. So the
> > > following check on has_lpdead_items means the periodic FSM vacuum in
> > > the one-pass strategy vacuum is no longer triggered:
> > >
> > >     if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
> > >         blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
> > >     {
> > >         FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
> > >                                 blkno);
> > >         next_fsm_block_to_vacuum = blkno;
> > >     }
> > >
> > > Before c120550edb86, since we first marked dead item IDs as LP_DEAD
> > > even in the one-pass strategy vacuum, we used to call
> > > lazy_vacuum_heap_page() to vacuum the page and to call
> > > FreeSpaceMapVacuum() periodically, which is why we had that check.
> >
> > Whoops, yeah. I had a feeling that something wasn't right here when I
> > saw that thread a couple of weeks ago about FSM vacuuming taking
> > forever at the end of a vacuum. I remember looking at the code then
> > and wondering why VACUUM_FSM_EVERY_PAGES is named that if it is only
> > used when there are no indexes or in a multi-pass vacuum.
> >
> > I even made a comment in [1] that we would only apply
> > VACUUM_FSM_EVERY_PAGES when we have a multi-pass vacuum -- which is
> > even rarer after 17.
> >
> > Isn't it ironic that I actually broke it.
> >
> > > I've attached a patch to fix it.
> >
> > Looks like you forgot the attachment.
>
> Oops, attached now.
>
> Looking further at the places related to VACUUM_FSM_EVERY_PAGES, the
> comment atop vacuumlazy.c seems incorrect:
>
>  * In between phases, vacuum updates the freespace map (every
>  * VACUUM_FSM_EVERY_PAGES).
>
> IIUC the context there is the two-pass strategy vacuum (i.e., the
> table has indexes), and VACUUM_FSM_EVERY_PAGES is ignored in that
> case.

I've attached the patch with the commit message. I didn't include the
changes to the comment above in this patch, so this is a pure bug-fix
patch.

I'm going to push this fix to HEAD and v17 early next week, unless
there are objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Apr 4, 2025 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I'm going to push this fix to HEAD and v17 early next week, unless
> there are objections.

I started studying this again, looking back at the thread associated
with commit c120550edb86. There was actually a long discussion there
about how this commit interacted with how often the freespace map is
vacuumed. [1] is one of the emails addressing that issue. If you grep
for FreeSpaceMapVacuumRange() on that thread, you can see the replies
on this topic, interspersed with replies about updating the FSM (as
opposed to vacuuming it).

What I'm wondering is: won't the patch you propose, which simply
removes has_lpdead_items from

    if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
        blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)

lead to vacuuming the FSM when there is nothing to vacuum, and thus
to wasted IO (when we didn't set anything to LP_UNUSED)? It seems like
the logic we would want is to replace has_lpdead_items with something
about having set items LP_UNUSED.

Rereading that thread, it seems we discussed what the right logic for
this was extensively, but I don't quite understand how we ended up
with

    if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&

- Melanie

[1] https://www.postgresql.org/message-id/CA%2BTgmoYV5LFUgXS_g4S9P1AhQ63thPg6UJj8EsmsmEahFxLY_A%40mail.gmail.com
On Mon, Apr 7, 2025 at 8:30 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Fri, Apr 4, 2025 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I'm going to push this fix to HEAD and v17 early next week, unless
> > there are objections.
>
> I started studying this again, looking back at the thread associated
> with commit c120550edb86. There was actually a long discussion there
> about how this commit interacted with how often the freespace map is
> vacuumed. [1] is one of the emails addressing that issue. If you grep
> for FreeSpaceMapVacuumRange() on that thread, you can see the replies
> on this topic, interspersed with replies about updating the FSM (as
> opposed to vacuuming it).

Thank you for referring to that thread. I'll read through the discussion.

> What I'm wondering is: won't the patch you propose, which simply
> removes has_lpdead_items from
>
>     if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
>         blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
>
> lead to vacuuming the FSM when there is nothing to vacuum, and thus
> to wasted IO (when we didn't set anything to LP_UNUSED)? It seems like
> the logic we would want is to replace has_lpdead_items with something
> about having set items LP_UNUSED.

You're right. It would be better to replace it with something like
that, as you suggested, instead of just removing it. On the other
hand, IIUC even if there is no garbage in the table, we do FSM vacuum
anyway at the end of lazy_scan_heap(). Doing FSM vacuum periodically,
regardless of the presence of garbage, might help spread out the disk
I/O load of FSM vacuuming. But we might instead consider it a bug that
we do FSM vacuum even on an all-frozen table.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Mon, Apr 7, 2025 at 11:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 7, 2025 at 8:30 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Fri, Apr 4, 2025 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I'm going to push this fix to HEAD and v17 early next week, unless
> > > there are objections.
> >
> > I started studying this again, looking back at the thread associated
> > with commit c120550edb86. There was actually a long discussion there
> > about how this commit interacted with how often the freespace map is
> > vacuumed. [1] is one of the emails addressing that issue. If you grep
> > for FreeSpaceMapVacuumRange() on that thread, you can see the replies
> > on this topic, interspersed with replies about updating the FSM (as
> > opposed to vacuuming it).
>
> Thank you for referring to that thread. I'll read through the discussion.
>
> > What I'm wondering is: won't the patch you propose, which simply
> > removes has_lpdead_items from
> >
> >     if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
> >         blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
> >
> > lead to vacuuming the FSM when there is nothing to vacuum, and thus
> > to wasted IO (when we didn't set anything to LP_UNUSED)? It seems like
> > the logic we would want is to replace has_lpdead_items with something
> > about having set items LP_UNUSED.
>
> You're right. It would be better to replace it with something like
> that, as you suggested, instead of just removing it. On the other
> hand, IIUC even if there is no garbage in the table, we do FSM vacuum
> anyway at the end of lazy_scan_heap(). Doing FSM vacuum periodically,
> regardless of the presence of garbage, might help spread out the disk
> I/O load of FSM vacuuming. But we might instead consider it a bug that
> we do FSM vacuum even on an all-frozen table.

I think this issue has been present since commit c120550edb86, but
that commit optimized the vacuum work for tables with no indexes and
wasn't intended to change the FSM vacuum behavior for such tables.
Therefore, I think we can restore the original FSM vacuum behavior on
HEAD and v18, leaving aside the fact that we might want to fix/improve
the VACUUM_FSM_EVERY_PAGES optimization. The original FSM vacuum
strategy for tables with no indexes was to call
FreeSpaceMapVacuumRange() every VACUUM_FSM_EVERY_PAGES pages, followed
by a final FreeSpaceMapVacuumRange() call for any remaining pages at
the end of lazy_scan_heap():

    /*
     * Vacuum the remainder of the Free Space Map.  We must do this whether or
     * not there were indexes, and whether or not we bypassed index vacuuming.
     * We can pass rel_pages here because we never skip scanning the last
     * block of the relation.
     */
    if (rel_pages > next_fsm_block_to_vacuum)
        FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
                                rel_pages);

Therefore, we end up doing FSM vacuum over the entire table anyway,
even if the table has no garbage. As for the VACUUM_FSM_EVERY_PAGES
optimization, we called FreeSpaceMapVacuumRange() only if we removed
at least one tuple from the page. I think these two points should have
been maintained post-commit.

I've attached the patch implementing this idea.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachment
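(This patch isn't reproduced in the archive either. Judging from the
review that follows, which notes that has_lpdead_items was replaced
with ndeleted, the periodic check presumably ends up looking roughly
like this -- a sketch, assuming ndeleted is the tuples-deleted count
reported by lazy_scan_prune():)

    /*
     * Periodically vacuum the FSM, but only if pruning on this page
     * actually deleted tuples and thereby reclaimed space.
     */
    if (got_cleanup_lock && vacrel->nindexes == 0 && ndeleted > 0 &&
        blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
    {
        FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
                                blkno);
        next_fsm_block_to_vacuum = blkno;
    }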
On Mon, Jun 2, 2025 at 8:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I think this issue has been present since commit c120550edb86, but
> that commit optimized the vacuum work for tables with no indexes and
> wasn't intended to change the FSM vacuum behavior for such tables.
> Therefore, I think we can restore the original FSM vacuum behavior on
> HEAD and v18, leaving aside the fact that we might want to fix/improve
> the VACUUM_FSM_EVERY_PAGES optimization. The original FSM vacuum
> strategy for tables with no indexes was to call
> FreeSpaceMapVacuumRange() every VACUUM_FSM_EVERY_PAGES pages, followed
> by a final FreeSpaceMapVacuumRange() call for any remaining pages at
> the end of lazy_scan_heap():

Agreed. We should treat this as a bug fix and could separately reduce
unneeded FSM vacuuming.

Reviewing your patch, I think there might still be an issue. You
replaced has_lpdead_items with ndeleted. While ndeleted will count
those items we set LP_UNUSED (which is what we want), it also counts
LP_NORMAL items that vacuum sets LP_REDIRECT (see
heap_prune_record_redirect()).

Looking at PageGetHeapFreeSpace(), it seems like it only considers
LP_UNUSED items as free space. So, if an item is set LP_REDIRECT,
there would be no need to update the FSM, I think.

I started to wonder why we count setting an LP_NORMAL item LP_REDIRECT
as a "deleted" tuple. I refactored the code in heap_prune_chain() in
17, adding heap_prune_record_redirect(), which increments
prstate->ndeleted if the root was LP_NORMAL. This was the equivalent
of the following heap_prune_chain() code in PG 16:

    if (OffsetNumberIsValid(latestdead))
    {
        ...

        /*
         * If the root entry had been a normal tuple, we are deleting it, so
         * count it in the result.  But changing a redirect (even to DEAD
         * state) doesn't count.
         */
        if (ItemIdIsNormal(rootlp))
            ndeleted++;

        /*
         * If the DEAD tuple is at the end of the chain, the entire chain is
         * dead and the root line pointer can be marked dead.  Otherwise just
         * redirect the root to the correct chain member.
         */
        if (i >= nchain)
            heap_prune_record_dead(prstate, rootoffnum);
        else
            heap_prune_record_redirect(prstate, rootoffnum, chainitems[i]);
    }

So, I don't think I changed the behavior in 17 with my refactor. But I
don't know why we would want to count items set LP_REDIRECT as
"reclaimed".

I couldn't actually come up with a case where an LP_NORMAL line
pointer was set LP_REDIRECT by vacuum without any intervening line
pointers in the chain being set LP_UNUSED. So maybe it is not a
problem for this patch in practice. I would still like to understand
why we count LP_NORMAL -> LP_REDIRECT items in the deleted count --
even though that isn't the responsibility of this patch.

- Melanie
On Mon, Jun 9, 2025 at 6:22 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Reviewing your patch, I think there might still be an issue. You
> replaced has_lpdead_items with ndeleted. While ndeleted will count
> those items we set LP_UNUSED (which is what we want), it also counts
> LP_NORMAL items that vacuum sets LP_REDIRECT (see
> heap_prune_record_redirect()).
>
> Looking at PageGetHeapFreeSpace(), it seems like it only considers
> LP_UNUSED items as free space. So, if an item is set LP_REDIRECT,
> there would be no need to update the FSM, I think.

I was wrong. Setting an item to LP_REDIRECT does free up space --
because there is no associated storage. PageGetFreeSpace() is
concerned with pd_upper and pd_lower. So, after setting an item
LP_REDIRECT, we'll compactify_tuples() in PageRepairFragmentation(),
altering pd_upper/pd_lower. Then future calls to
PageGetHeapFreeSpace() will report this updated number, and we'll put
that in the freespace map. So, we can expect setting items LP_REDIRECT
to result in new free space to report. (I got confused by some code in
PageGetHeapFreeSpace() that checks that, if there are >=
MaxHeapTuplesPerPage line pointers, at least one of them is
LP_UNUSED.)

So, I think we should commit the fix you proposed.

The only question I have left is implementation: should we have
ndeleted as an output parameter of lazy_scan_prune(), or have
lazy_scan_prune() return it (instead of void)?

In <= 16, heap_page_prune() returned the number of tuples deleted, so
I thought of maybe having lazy_scan_prune() do the same. Though maybe
it is confusing to have one result returned as the return value and
the others returned in output parameters, unless there is something
more special about ndeleted. With heap_page_prune(), I think it was
the return value because that was kind of what heap_page_prune()
"accomplished".

- Melanie
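(To illustrate the point about redirects freeing space -- a simplified
sketch of the sequence, not the actual pruning code; the calls shown
are the real bufpage.c/FSM entry points, but in reality the repair
happens inside the pruning machinery, with page, blkno, and vacrel
coming from the surrounding vacuum code:)

    Size        freespace;

    /*
     * Pruning a HOT chain leaves the root as a redirect (no storage)
     * and removes the intermediate tuples' storage, so repairing
     * fragmentation compacts the remaining tuples and raises pd_upper.
     * The page then reports more free space, which is recorded in the
     * FSM.
     */
    PageRepairFragmentation(page);          /* compactify_tuples(); updates pd_upper/pd_lower */
    freespace = PageGetHeapFreeSpace(page); /* now reflects the reclaimed space */
    RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);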
On Tue, Jun 24, 2025 at 6:59 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> So, I think we should commit the fix you proposed.
>
> The only question I have left is implementation: should we have
> ndeleted as an output parameter of lazy_scan_prune(), or have
> lazy_scan_prune() return it (instead of void)?
>
> In <= 16, heap_page_prune() returned the number of tuples deleted, so
> I thought of maybe having lazy_scan_prune() do the same. Though maybe
> it is confusing to have one result returned as the return value and
> the others returned in output parameters, unless there is something
> more special about ndeleted. With heap_page_prune(), I think it was
> the return value because that was kind of what heap_page_prune()
> "accomplished".

Hi Sawada-san,

Just checking what you thought about this. We probably want to get
this committed and backported relatively soon. I'm happy to help with
that if needed, but I just want to make sure we are on the same page
about the fix.

- Melanie
On Mon, Jun 30, 2025 at 10:20 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Tue, Jun 24, 2025 at 6:59 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > So, I think we should commit the fix you proposed.
> >
> > The only question I have left is implementation: should we have
> > ndeleted as an output parameter of lazy_scan_prune(), or have
> > lazy_scan_prune() return it (instead of void)?
> >
> > In <= 16, heap_page_prune() returned the number of tuples deleted, so
> > I thought of maybe having lazy_scan_prune() do the same. Though maybe
> > it is confusing to have one result returned as the return value and
> > the others returned in output parameters, unless there is something
> > more special about ndeleted. With heap_page_prune(), I think it was
> > the return value because that was kind of what heap_page_prune()
> > "accomplished".
>
> Hi Sawada-san,
>
> Just checking what you thought about this. We probably want to get
> this committed and backported relatively soon. I'm happy to help with
> that if needed, but I just want to make sure we are on the same page
> about the fix.

Sorry for the late response; I was unable to work on this last week.
I'll check your reply and the solution tomorrow, and will get back to
you with my thoughts.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, Jul 1, 2025 at 12:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached the updated patches for master and the backbranches
> (v17 and v18). Please review these patches.

All look good to me. One nitpick that is up to you if you want to
change it: the comment

 * Return the number of tuples deleted from the page during HOT pruning.

is at the top of the function block comment for lazy_scan_prune() in
your patch. Most of the function block comments in vacuumlazy.c list
the return value last, after describing the other parameters, and use
"Returns" as opposed to the imperative conjugation "Return".

Thanks so much for finding and fixing my bug!

- Melanie
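(For illustration, the suggested placement and wording -- a
hypothetical block comment following the vacuumlazy.c convention of
listing the return value last, not text from the actual patch:)

    /*
     *  lazy_scan_prune() -- lazy_scan_heap() pruning and freezing.
     *
     *  ... description of the function and its parameters ...
     *
     *  Returns the number of tuples deleted from the page during HOT
     *  pruning.
     */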