Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date | |
Msg-id | CAAKRu_bEZPgAsMK49Kn6JP_+Zq-d6rQP7enP7nk1CYLbHKMv_A@mail.gmail.com Whole thread Raw |
In response to | Re: BitmapHeapScan streaming read user and prelim refactoring (Andres Freund <andres@anarazel.de>) |
Responses |
Re: BitmapHeapScan streaming read user and prelim refactoring
|
List | pgsql-hackers |
On Sat, Mar 22, 2025 at 5:04 PM Andres Freund <andres@anarazel.de> wrote: > > The problem is that sometimes recheck is performed for pending empty > tuples. The comment about empty tuples says: > > /* > * Bitmap is exhausted. Time to emit empty tuples if relevant. We emit > * all empty tuples at the end instead of emitting them per block we > * skip fetching. This is necessary because the streaming read API > * will only return TBMIterateResults for blocks actually fetched. > * When we skip fetching a block, we keep track of how many empty > * tuples to emit at the end of the BitmapHeapScan. We do not recheck > * all NULL tuples. > */ > *recheck = false; > return bscan->rs_empty_tuples_pending > 0; > > But we actually emit empty tuples at each page boundary: > > /* > * Out of range? If so, nothing more to look at on this page > */ > while (hscan->rs_cindex >= hscan->rs_ntuples) > { > /* > * Emit empty tuples before advancing to the next block > */ > if (bscan->rs_empty_tuples_pending > 0) > { > /* > * If we don't have to fetch the tuple, just return nulls. > */ > ExecStoreAllNullTuple(slot); > bscan->rs_empty_tuples_pending--; > return true; > } > > /* > * Returns false if the bitmap is exhausted and there are no further > * blocks we need to scan. > */ > if (!BitmapHeapScanNextBlock(scan, recheck, lossy_pages, exact_pages)) > return false; > } > > So we don't just process tuples at the end of the scan, but at each page > boundary. Whoops! I think an earlier version of the patch did emit all the empty tuples at the end, but I changed the control flow without moving the recheck = false and removing/modifying that comment. Quite embarrassing. > This leads to wrong results whenever there is a rs_empty_tuples_pending > 0 > after a previous page that needed rechecks, because nothing will reset > *recheck = false. > > And indeed, if I add a *recheck = false in that inside the > /* > * Emit empty tuples before advancing to the next block > */ > if (bscan->rs_empty_tuples_pending > 0) > { > the problem goes away. > > > A fix should do slightly more, given that the "at the end" comment is wrong. > > But I'm wondering if it's worth fixing it, or if we should just rip the logic > out alltogether, due to the whole VM checking being unsound as we learned in > the thread I referenced earlie, without even bothering to fix the bug here. Perhaps it is better I just fix it since ripping out the skip fetch optimization has to be backported and even though that will look very different on master than on backbranches, I wonder if people will look for a "clean" commit on master which removes the feature. - Melanie
pgsql-hackers by date: