Home > mailing lists

Re: Parallel bitmap heap scan - Mailing list pgsql-hackers

From	Dilip Kumar
Subject	Re: Parallel bitmap heap scan
Date	November 26, 2016 12:40:25
Msg-id	CAFiTN-uOv70=aJPZk6NL7wbbozYi48g-eygeoM2yefPhc2BkYA@mail.gmail.com Whole thread Raw
In response to	Re: Parallel bitmap heap scan (Dilip Kumar <dilipbalaut@gmail.com>)
Responses	Re: Parallel bitmap heap scan
List	pgsql-hackers

Tree view

On Wed, Nov 23, 2016 at 12:31 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

I tried to address these comments in my new version, All comments are
fixed except below

>> + *
>> + *    #2. Bitmap processing (Iterate and process the pages).
>> + *        . In this phase each worker will iterate over page and
>> chunk array and
>> + *        select heap pages one by one. If prefetch is enable then there will
>> + *        be two iterator.
>> + *        . Since multiple worker are iterating over same page and chunk array
>> + *        we need to have a shared iterator, so we grab a spin lock and iterate
>> + *        within a lock.
>>
>> The formatting of this comment is completely haphazard.
>
> I will fix this..
I have changed the formatting, and also moved the algorithm
description inside function body.
I am not sure does this meet your expectation or we should change it further ?

>
>> +    /* reset parallel bitmap scan info, if present */
>> +    if (node->parallel_bitmap)
>> +    {
>> +        ParallelBitmapInfo *pbminfo = node->parallel_bitmap;
>> +
>> +        pbminfo->state = PBM_INITIAL;
>> +        pbminfo->tbmiterator.schunkbit = 0;
>> +        pbminfo->tbmiterator.spageptr = 0;
>> +        pbminfo->tbmiterator.schunkptr = 0;
>> +        pbminfo->prefetch_iterator.schunkbit = 0;
>> +        pbminfo->prefetch_iterator.spageptr = 0;
>> +        pbminfo->prefetch_iterator.schunkptr = 0;
>> +        pbminfo->prefetch_pages = 0;
>> +        pbminfo->prefetch_target = -1;
>> +    }
>>
>> This is obviously not going to work in the face of concurrent
>> activity.  When we did Parallel Seq Scan, we didn't make any changes
>> to the rescan code at all.  I think we are assuming that there's no
>> way to cause a rescan of the driving table of a parallel query; if
>> that's wrong, we'll need some fix, but this isn't it.  I would just
>> leave this out.
>
> In heap_rescan function we have similar change..
>
> if (scan->rs_parallel != NULL)
> {
>   parallel_scan = scan->rs_parallel;
>   SpinLockAcquire(¶llel_scan->phs_mutex);
>   parallel_scan->phs_cblock = parallel_scan->phs_startblock;
>   SpinLockRelease(¶llel_scan->phs_mutex);
> }
>
> And this is not for driving table, it's required for the case when
> gather is as inner node for JOIN.
> consider below example. I know it's a bad plan but we can produce this plan :)
>
> Outer Node                Inner Node
> SeqScan(t1)   NLJ   (Gather -> Parallel SeqScan (t2)).
>
> Reason for doing same is that, during ExecReScanGather we don't
> recreate new DSM instead of that we just reinitialise the DSM
> (ExecParallelReinitialize).

This is not fixed, reason is already explained.

>> +static bool
>> +pbms_is_leader(ParallelBitmapInfo *pbminfo)
>>
>> I think you should see if you can use Thomas Munro's barrier stuff for
>> this instead.
>
> Okay, I will think over it.

IMHO, barrier is used when multiple worker are doing some work
together in phase1, and before moving to next phase all have to
complete phase1, so we put barrier, so that before starting next phase
all cross the barrier.

But here case is different, only one worker need to finish the phase1,
and as soon as it's over all can start phase2. But we don't have
requirement that all worker should cross certain barrier. In this case
even though some worker did not start, other worker can do their work.

>>
>> In general, the amount of change in nodeBitmapHeapScan.c seems larger
>> than I would have expected.  My copy of that file has 655 lines; this
>> patch adds 544 additional lines.  I think/hope that some of that can
>> be simplified away.
>
> I will work on this.

I have removed some function which was actually not required, and code
can be merged in main function. Almost reduced by 100 lines.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Amit Kapila
Date: 26 November 2016, 12:11:25
Subject: Re: make default TABLESPACE belong to target table.

From: Dilip Kumar
Date: 26 November 2016, 12:49:05
Subject: Re: Parallel bitmap heap scan

Re: Parallel bitmap heap scan - Mailing list pgsql-hackers

Previous

Next