Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From	Melanie Plageman
Subject	Re: BitmapHeapScan streaming read user and prelim refactoring
Date	March 2, 2024 22:52:49
Msg-id	CAAKRu_an3fO4R_n-Zk0=GC-uiH6=xBzhDxHmf_sL+sBo+bxQOA@mail.gmail.com
In response to	Re: BitmapHeapScan streaming read user and prelim refactoring (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List	pgsql-hackers

On Sat, Mar 2, 2024 at 5:41 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
>
>
> On 3/2/24 23:28, Melanie Plageman wrote:
> > On Sat, Mar 2, 2024 at 10:05 AM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> Here's a PDF with charts for a dataset where the row selectivity is more
> >> correlated to selectivity of pages. I'm attaching the updated script,
> >> with the SQL generating the data set. But the short story is all rows on
> >> a single page have the same random value, so the selectivity of rows and
> >> pages should be the same.
> >>
> >> The first page has results for the original "uniform", the second page
> >> is the new "uniform-pages" data set. There are 4 charts, for
> >> master/patched and 0/4 parallel workers. Overall the behavior is the
> >> same, but for the "uniform-pages" it's much more gradual (with respect
> >> to row selectivity). I think that's expected.
> >
> > Cool! Thanks for doing this. I have convinced myself that Thomas'
> > forthcoming patch which will eliminate prefetching with eic = 0 will
> > fix the eic 0 blue line regressions. The eic = 1 with four parallel
> > workers is more confusing. And it seems more noticeably bad with your
> > randomized-pages dataset.
> >
> > Regarding your earlier question:
> >
> >> Just to be sure we're on the same page regarding what eic=1 means,
> >> consider a simple sequence of pages: A, B, C, D, E, ...
> >>
> >> With the current "master" code, eic=1 means we'll issue a prefetch for B
> >> and then read+process A. And then issue prefetch for C and read+process
> >> B, and so on. It's always one page ahead.
> >
> > Yes, that is what I mean for eic = 1
> >
> >> As for how this is related to eic=1 - I think my point was that these
> >> are "adversary" data sets, most likely to show regressions. This applies
> >> especially to the "uniform" data set, because as the row selectivity
> >> grows, it's more and more likely the next matching page is right after the current one,
> >> and so a read-ahead would likely do the trick.
> >
> > No, I think you're right that eic=1 should prefetch. As you say, with
> > high selectivity, a bitmap plan is likely not the best one anyway, so
> > not prefetching in order to preserve the performance of those cases
> > seems silly.
> >
>
> I was just trying to respond to this from an earlier message:
>
> > Yes, I would like to see results from a data set where selectivity is
> > more correlated to pages/heap fetches. But, I'm not sure I see how
> > that is related to prefetching when eic = 1.
>
> And in that same message you also said "Not doing prefetching with eic 1
> actually seems like the right behavior". Hence my argument we should not
> stop prefetching for eic=1.
>
> But maybe I'm confused - it seems we agree eic=1 should prefetch, and that
> the uniform data set may not be a good argument against that.

Yep, we agree. I was being confusing and wrong :) I just wanted to
make sure the thread had a clear consensus that, yes, it is the right
thing to do to prefetch blocks for bitmap heap scans when
effective_io_concurrency = 1.
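
For readers following along, the eic=1 ordering quoted above (prefetch B,
then read+process A, then prefetch C, and so on) can be sketched as a toy
simulation. This is only an illustration of the ordering, not PostgreSQL
code: the function name, the "distance" parameter and the event tuples are
all made up for this example, and it assumes the first page is read
directly without a prefetch.

```python
def bitmap_scan_events(pages, distance=1):
    """Toy model of the ordering described in the thread (hypothetical helper).

    Returns a list of ("prefetch" | "read", page) tuples. `distance` stands in
    for effective_io_concurrency; it is an assumption of this sketch, not the
    actual executor logic.
    """
    events = []
    next_prefetch = 1  # assumption: the first page is read without a prefetch
    for i, page in enumerate(pages):
        # Never prefetch a page we are about to read anyway.
        next_prefetch = max(next_prefetch, i + 1)
        # Keep the prefetch position at most `distance` pages ahead of the read.
        while next_prefetch < len(pages) and next_prefetch <= i + distance:
            events.append(("prefetch", pages[next_prefetch]))
            next_prefetch += 1
        events.append(("read", page))
    return events


if __name__ == "__main__":
    for action, page in bitmap_scan_events(["A", "B", "C", "D", "E"], distance=1):
        print(action, page)
```

With distance=1 the prefetch always stays exactly one page ahead of the
read; with distance=0 the sketch issues no prefetches at all.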

- Melanie
