Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id 5a172d1e-d69c-409a-b1fa-6521214c81c2@iki.fi
Whole thread Raw
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: BitmapHeapScan streaming read user and prelim refactoring
List pgsql-hackers
On 18/03/2024 17:19, Melanie Plageman wrote:
> I've attached v7 rebased over this commit.

Thanks!

> v7-0001-BitmapHeapScan-begin-scan-after-bitmap-creation.patch

If we delayed table_beginscan_bm() call further, after starting the TBM 
iterator, we could skip it altogether when the iterator is empty.

That's a further improvement, doesn't need to be part of this patch set. 
Just caught my eye while reading this.

> v7-0003-Push-BitmapHeapScan-skip-fetch-optimization-into-.patch

I suggest to avoid the double negative with SO_CAN_SKIP_FETCH, and call 
the flag e.g. SO_NEED_TUPLE.


As yet another preliminary patch before the streaming read API, it would 
be nice to move the prefetching code to heapam.c too.

What's the point of having separate table_scan_bitmap_next_block() and 
table_scan_bitmap_next_tuple() functions anymore? The AM owns the TBM 
iterator now. The executor node updates the lossy/exact page counts, but 
that's the only per-page thing it does now.

>         /*
>          * If this is the first scan of the underlying table, create the table
>          * scan descriptor and begin the scan.
>          */
>         if (!scan)
>         {
>             uint32        extra_flags = 0;
> 
>             /*
>              * We can potentially skip fetching heap pages if we do not need
>              * any columns of the table, either for checking non-indexable
>              * quals or for returning data.  This test is a bit simplistic, as
>              * it checks the stronger condition that there's no qual or return
>              * tlist at all. But in most cases it's probably not worth working
>              * harder than that.
>              */
>             if (node->ss.ps.plan->qual == NIL && node->ss.ps.plan->targetlist == NIL)
>                 extra_flags |= SO_CAN_SKIP_FETCH;
> 
>             scan = node->ss.ss_currentScanDesc = table_beginscan_bm(
>                                                                     node->ss.ss_currentRelation,
>                                                                     node->ss.ps.state->es_snapshot,
>                                                                     0,
>                                                                     NULL,
>                                                                     extra_flags);
>         }
> 
>         scan->tbmiterator = tbmiterator;
>         scan->shared_tbmiterator = shared_tbmiterator;

How about passing the iterator as an argument to table_beginscan_bm()? 
You'd then need some other function to change the iterator on rescan, 
though. Not sure what exactly to do here, but feels that this part of 
the API is not fully thought-out. Needs comments at least, to explain 
who sets tbmiterator / shared_tbmiterator and when. For comparison, for 
a TID scan there's a separate scan_set_tidrange() table AM function. 
Maybe follow that example and introduce scan_set_tbm_iterator().

It's bit awkward to have separate tbmiterator and shared_tbmiterator 
fields. Could regular and shared iterators be merged, or wrapped under a 
common interface?

-- 
Heikki Linnakangas
Neon (https://neon.tech)



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Next
From: Peter Eisentraut
Date:
Subject: Re: Built-in CTYPE provider