Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: recovering from "found xmin ... from before relfrozenxid ..." |
Date | |
Msg-id | CA+fd4k6+JWq2MfQt5b7fSJ2wMvCes9TRfbDhVO_fQP9B8JJRAA@mail.gmail.com Whole thread Raw |
In response to | Re: recovering from "found xmin ... from before relfrozenxid ..." (Ashutosh Sharma <ashu.coek88@gmail.com>) |
Responses |
Re: recovering from "found xmin ... from before relfrozenxid ..."
Re: recovering from "found xmin ... from before relfrozenxid ..." |
List | pgsql-hackers |
On Fri, 21 Aug 2020 at 22:25, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi Masahiko-san, > > Please find the updated patch with the following new changes: > Thank you for updating the patch! > 1) It adds the code changes in heap_force_kill function to clear an > all-visible bit on the visibility map corresponding to the page that > is marked all-visible. Along the way it also clears PD_ALL_VISIBLE > flag on the page header. I think we need to clear all visibility map bits by using VISIBILITYMAP_VALID_BITS. Otherwise, the page has all-frozen bit but not all-visible bit, which is not a valid state. > > 2) It adds the code changes in heap_force_freeze function to reset the > ctid value in a tuple header if it is corrupted. > > 3) It adds several notes and examples in the documentation stating > when and how we need to use the functions provided by this module. > > Please have a look and let me know for any other concern. > And here are small comments on the heap_surgery.c: + /* + * Get the offset numbers from the tids belonging to one particular page + * and process them one by one. + */ + blkno = tids_same_page_fetch_offnums(tids, ntids, &next_start_ptr, + offnos); + + /* Calculate the number of offsets stored in offnos array. */ + noffs = next_start_ptr - curr_start_ptr; + + /* + * Update the current start pointer so that next time when + * tids_same_page_fetch_offnums() is called, we can calculate the number + * of offsets present in the offnos array. + */ + curr_start_ptr = next_start_ptr; + + /* Check whether the block number is valid. */ + if (blkno >= nblocks) + { + ereport(NOTICE, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("skipping block %u for relation \"%s\" because the block number is out of range", + blkno, RelationGetRelationName(rel)))); + continue; + } + + CHECK_FOR_INTERRUPTS(); I guess it would be better to call CHECK_FOR_INTERRUPTS() at the top of the do loop for safety. I think it's unlikely to happen but the user might mistakenly specify a lot of wrong block numbers. ---- + offnos = (OffsetNumber *) palloc(ntids * sizeof(OffsetNumber)); + noffs = curr_start_ptr = next_start_ptr = 0; + nblocks = RelationGetNumberOfBlocks(rel); + + do + { (snip) + + /* + * Get the offset numbers from the tids belonging to one particular page + * and process them one by one. + */ + blkno = tids_same_page_fetch_offnums(tids, ntids, &next_start_ptr, + offnos); + + /* Calculate the number of offsets stored in offnos array. */ + noffs = next_start_ptr - curr_start_ptr; + (snip) + /* No ereport(ERROR) from here until all the changes are logged. */ + START_CRIT_SECTION(); + + for (i = 0; i < noffs; i++) You copy all offset numbers belonging to the same page to palloc'd array, offnos, and iterate it while processing the tuples. I might be missing something but I think we can do that without allocating the space for offset numbers. Is there any reason for this? I guess we can do that by just iterating the sorted tids array. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: