Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum - Mailing list pgsql-bugs
From: Andres Freund
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Msg-id: 20220311031351.sbge5m2bpvy2ttxg@alap3.anarazel.de
In response to: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-bugs
Hi,

On 2022-03-10 18:03:46 -0800, Peter Geoghegan wrote:
> Attached is v2 revision of the less critical vacuumlazy.c vistest
> patch. I will respond to your second email about the pruneheap.c patch
> on a new dedicated thread, per your suggestion.
>
> NB: I intend to commit this revision of the patch (or something pretty
> close to it) in the next few days, barring any objections.

WFM.

> On Wed, Mar 9, 2022 at 4:25 PM Andres Freund <andres@anarazel.de> wrote:
> > I think it'd be nicer if we did the horizon determination after allocating
> > space for dead tuples... But it's still better than today, so...
>
> Why would it be nicer?

A large allocation can take a bit. Especially since dead_item_alloc() sneakily initializes parallelism (which is darn ugly). Determining the horizon after doing the expensive setup work gives you a slightly better (more recent) horizon...

> /*
> + * Now actually update rel's pg_class entry.
> + *
> * Aggressive VACUUM must reliably advance relfrozenxid (and relminmxid).
> * We are able to advance relfrozenxid in a non-aggressive VACUUM too,
> * provided we didn't skip any all-visible (not all-frozen) pages using
>
> @@ -787,7 +832,7 @@ static void
> lazy_scan_heap(LVRelState *vacrel, int nworkers)
> {
> VacDeadItems *dead_items;
> - BlockNumber nblocks,
> + BlockNumber rel_pages = vacrel->rel_pages,
> blkno,
> next_unskippable_block,
> next_failsafe_block,
>
> @@ -843,7 +865,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
>
> /* Report that we're scanning the heap, advertising total # of blocks */
> initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
> - initprog_val[1] = nblocks;
> + initprog_val[1] = rel_pages;
> initprog_val[2] = dead_items->max_items;
> pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
>
> @@ -867,9 +889,9 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> * Before entering the main loop, establish the invariant that
> * next_unskippable_block is the next block number >= blkno that we can't
> * skip based on the visibility map, either all-visible for a regular scan
> - * or all-frozen for an aggressive scan. We set it to nblocks if there's
> - * no such block. We also set up the skipping_blocks flag correctly at
> - * this stage.
> + * or all-frozen for an aggressive scan. We set it to rel_pages when
> + * there's no such block. We also set up the skipping_blocks flag
> + * correctly at this stage.
> *
> * Note: The value returned by visibilitymap_get_status could be slightly
> * out-of-date, since we make this test before reading the corresponding
>
> @@ -887,7 +909,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> next_unskippable_block = 0;
> if (vacrel->skipwithvm)
> {
> - while (next_unskippable_block < nblocks)
> + while (next_unskippable_block < rel_pages)
> {
> uint8 vmstatus;
>
> @@ -914,7 +936,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> else
> skipping_blocks = false;
>
> - for (blkno = 0; blkno < nblocks; blkno++)
> + for (blkno = 0; blkno < rel_pages; blkno++)
> {
> Buffer buf;
> Page page;
>
> @@ -932,7 +954,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> next_unskippable_block++;
> if (vacrel->skipwithvm)
> {
> - while (next_unskippable_block < nblocks)
> + while (next_unskippable_block < rel_pages)
> {
> uint8 vmskipflags;
>
> @@ -977,7 +999,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> /*
> * The current page can be skipped if we've seen a long enough run
> * of skippable blocks to justify skipping it -- provided it's not
> - * the last page in the relation (according to rel_pages/nblocks).
> + * the last page in the relation (according to rel_pages).
> *
> * We always scan the table's last page to determine whether it
> * has tuples or not, even if it would otherwise be skipped. This
>
> @@ -985,7 +1007,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> * on the table to attempt a truncation that just fails
> * immediately because there are tuples on the last page.
> */
> - if (skipping_blocks && blkno < nblocks - 1)
> + if (skipping_blocks && blkno < rel_pages - 1)
> {
> /*
> * Tricky, tricky. If this is in aggressive vacuum, the page
>
> @@ -1153,7 +1175,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> * were pruned some time earlier. Also considers freezing XIDs in the
> * tuple headers of remaining items with storage.
> */
> - lazy_scan_prune(vacrel, buf, blkno, page, vistest, &prunestate);
> + lazy_scan_prune(vacrel, buf, blkno, page, &prunestate);
>
> Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
>
> @@ -1352,7 +1374,7 @@ lazy_scan_heap(LVRelState *vacrel, int nworkers)
> vacrel->blkno = InvalidBlockNumber;
>
> /* now we can compute the new value for pg_class.reltuples */
> - vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, nblocks,
> + vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
> vacrel->scanned_pages,
> vacrel->live_tuples);

The whole s/nblocks/rel_pages/ renaming seems like it should be done separately.

Greetings,

Andres Freund