Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date | |
Msg-id | CAAKRu_an3fO4R_n-Zk0=GC-uiH6=xBzhDxHmf_sL+sBo+bxQOA@mail.gmail.com |
In response to | Re: BitmapHeapScan streaming read user and prelim refactoring (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
List | pgsql-hackers |
On Sat, Mar 2, 2024 at 5:41 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> On 3/2/24 23:28, Melanie Plageman wrote:
> > On Sat, Mar 2, 2024 at 10:05 AM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> Here's a PDF with charts for a dataset where the row selectivity is more
> >> correlated to selectivity of pages. I'm attaching the updated script,
> >> with the SQL generating the data set. But the short story is all rows on
> >> a single page have the same random value, so the selectivity of rows and
> >> pages should be the same.
> >>
> >> The first page has results for the original "uniform", the second page
> >> is the new "uniform-pages" data set. There are 4 charts, for
> >> master/patched and 0/4 parallel workers. Overall the behavior is the
> >> same, but for the "uniform-pages" it's much more gradual (with respect
> >> to row selectivity). I think that's expected.
> >
> > Cool! Thanks for doing this. I have convinced myself that Thomas'
> > forthcoming patch which will eliminate prefetching with eic = 0 will
> > fix the eic 0 blue line regressions. The eic = 1 with four parallel
> > workers is more confusing. And it seems more noticeably bad with your
> > randomized-pages dataset.
> >
> > Regarding your earlier question:
> >
> >> Just to be sure we're on the same page regarding what eic=1 means,
> >> consider a simple sequence of pages: A, B, C, D, E, ...
> >>
> >> With the current "master" code, eic=1 means we'll issue a prefetch for B
> >> and then read+process A. And then issue prefetch for C and read+process
> >> B, and so on. It's always one page ahead.
> >
> > Yes, that is what I mean for eic = 1
> >
> >> As for how this is related to eic=1 - I think my point was that these
> >> are "adversary" data sets, most likely to show regressions. This applies
> >> especially to the "uniform" data set, because as the row selectivity
> >> grows, it's more and more likely it's right after to the current one,
> >> and so a read-ahead would likely do the trick.
> >
> > No, I think you're right that eic=1 should prefetch. As you say, with
> > high selectivity, a bitmap plan is likely not the best one anyway, so
> > not prefetching in order to preserve the performance of those cases
> > seems silly.
> >
> > I was just trying to respond to this from an earlier message:
> >
> > > Yes, I would like to see results from a data set where selectivity is
> > > more correlated to pages/heap fetches. But, I'm not sure I see how
> > > that is related to prefetching when eic = 1.
>
> And in that same message you also said "Not doing prefetching with eic 1
> actually seems like the right behavior". Hence my argument we should not
> stop prefetching for eic=1.
>
> But maybe I'm confused - it seems we agree eic=1 should prefetch, and that
> uniform data set may not be a good argument against that.

Yep, we agree. I was being confusing and wrong :) I just wanted to make
sure the thread had a clear consensus that, yes, it is the right thing
to do to prefetch blocks for bitmap heap scans when
effective_io_concurrency = 1.

- Melanie
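For readers following the eic=1 discussion quoted above (prefetch B, then read+process A; prefetch C, then read+process B), the ordering can be sketched as a toy simulation. This is not PostgreSQL code: the function name `schedule`, the `distance` parameter, and the event tuples are hypothetical, and the model assumes a fixed prefetch distance with no ramp-up and no actual I/O.

```python
def schedule(pages, distance=1):
    """Return the order of prefetch/read events for a fixed prefetch distance.

    Toy model only (assumption, not the executor's logic): before reading
    page i, issue prefetches for pages up to index i + distance that have
    not been prefetched yet. With distance=0 nothing is prefetched.
    """
    events = []
    next_prefetch = 1  # the first page is read without a prior prefetch
    for i, page in enumerate(pages):
        # Never prefetch a page we are about to read or have already read.
        next_prefetch = max(next_prefetch, i + 1)
        # Stay `distance` pages ahead of the page being read.
        while next_prefetch <= i + distance and next_prefetch < len(pages):
            events.append(("prefetch", pages[next_prefetch]))
            next_prefetch += 1
        events.append(("read", page))
    return events


if __name__ == "__main__":
    for event in schedule(["A", "B", "C", "D", "E"], distance=1):
        print(event)
```

With `distance=1` the events alternate between a prefetch of the next page and a read of the current one, which matches the "always one page ahead" description; with `distance=0` the sketch issues reads only.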