Re: [DESIGN] ParallelAppend - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: [DESIGN] ParallelAppend |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F8011300E4@BPXM15GP.gisp.nec.co.jp |
In response to | Re: [DESIGN] ParallelAppend (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: [DESIGN] ParallelAppend |
List | pgsql-hackers |
> On Sat, Aug 1, 2015 at 6:39 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >
> > > On Tue, Jul 28, 2015 at 6:08 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > >
> > > I am not sure, but what problem do you see in putting a Funnel node
> > > for one of the relation scans and not for the others.
> > >
> > At this moment, I'm not certain whether a background worker can/ought
> > to launch other background workers.
> > If a sub-Funnel node is executed by 10 processes and each of them also
> > launches 10 processes, will 100 processes run in total?
> >
> Yes, that could be more work than current, but what I had in mind
> is not that way; rather, I was thinking that the master backend will
> only kick off workers for Funnel nodes in the plan.
>
I agree, it is a fair enough approach, so let me explain the pull-up of
the Funnel node.

> > > > If we pull up the Funnel here, I think the plan shall be as follows:
> > > >   Funnel
> > > >    --> SeqScan on rel1
> > > >    --> PartialSeqScan on rel2
> > > >    --> IndexScan on rel3
> > > >
> > > So if we go this route, then Funnel should have the capability to
> > > execute the non-parallel part of the plan as well; in this case it
> > > should be able to execute the non-parallel IndexScan on rel3, and
> > > then it might need to distinguish between the parallel and
> > > non-parallel parts of the plan. I think this could make the Funnel
> > > node complex.
> > >
> > It is different from what I plan now. In the above example, the
> > Funnel node has two non-parallel-aware nodes (rel1 and rel3) and one
> > parallel-aware node, so three PlannedStmts, one for each, shall be
> > put on the TOC segment. Even though the background workers pick up a
> > PlannedStmt from the three, only one worker can pick up the
> > PlannedStmt for rel1 or rel3; however, rel2 can be executed by
> > multiple workers simultaneously.
>
> Okay, now I got your point, but I think the cost of execution of a
> non-parallel node by an additional worker is not small, considering
> the communication cost and the setup of an additional worker for each
> sub-plan (assume the case where, out of 100 child nodes, only a few
> (2 or 3) nodes actually need multiple workers).
>
It is a competition between the traditional Append that takes Funnel
children and the new appendable Funnel that takes parallel and
non-parallel children. Probably, the key factors are
cpu_tuple_comm_cost, parallel_setup_cost and the selectivity of the
sub-plans. Both cases have advantages and disadvantages depending on
the query, so we can never determine which is better without path
consideration.

> > > I think for a particular PlannedStmt, the number of workers must
> > > have been decided before the start of execution, so if that many
> > > workers are already working on that particular PlannedStmt, then
> > > the next/new worker should work on the next PlannedStmt.
> > >
> > My concern about a pre-determined number of workers is that it
> > depends on the run-time circumstances of concurrent sessions. Even
> > if the planner wants to assign 10 workers to a particular sub-plan,
> > only 4 workers may be available at run time because of consumption
> > by other sessions. So, I expect only the maximum number of workers
> > is a meaningful configuration.
> >
> In that case, there is a possibility that many of the workers are
> just working on one or two of the nodes and the execution of the
> other nodes might get starved. I understand it is a tricky problem to
> allocate the number of workers for the different nodes; however, we
> should try to develop an algorithm where there is some degree of
> fairness in the allocation of workers across the nodes.
>
I'd like to agree; however, I also want to keep the first version as
simple as we can. We can develop alternative logic to assign a
suitable number of workers later.
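
To make that first simple version a bit more concrete, here is a rough
sketch of the pick-up logic described above. Nothing below is from an
actual patch; the struct name, field names and helper function are only
illustrative. Each background worker looks at the per-child descriptors
placed on the TOC segment next to their serialized PlannedStmts and
claims one: a non-parallel child is taken by at most one worker, while
a parallel-aware child can be joined by any number of workers.

    /* Illustrative only - these names do not exist in any posted patch. */
    #include "postgres.h"
    #include "port/atomics.h"

    typedef struct AppendChildDesc
    {
        bool              parallel_aware;  /* can many workers scan it?   */
        pg_atomic_uint32  nworkers;        /* workers currently attached  */
        Size              pstmt_offset;    /* serialized PlannedStmt here */
    } AppendChildDesc;

    /*
     * Claim the next child sub-plan this worker shall execute, or return
     * -1 if every remaining child is non-parallel and already taken.
     * A non-parallel child is claimed by at most one worker; a
     * parallel-aware child may be joined by any number of workers.
     */
    static int
    claim_next_child(AppendChildDesc *children, int nchildren)
    {
        int     i;

        for (i = 0; i < nchildren; i++)
        {
            uint32  prev = pg_atomic_fetch_add_u32(&children[i].nworkers, 1);

            if (children[i].parallel_aware || prev == 0)
                return i;       /* this worker now runs child i */

            /* non-parallel child already owned by another worker; undo */
            pg_atomic_fetch_sub_u32(&children[i].nworkers, 1);
        }
        return -1;
    }

This first-fit loop would tend to pile workers onto the first
parallel-aware child, which is exactly the fairness concern above; a
smarter policy (say, preferring the parallel-aware child with the
fewest attached workers) could replace it later without changing the
shared descriptors.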

> > > 2. Execution of work by workers and Funnel node and then pass
> > > the results back to the upper node. I think this needs some more
> > > work in addition to the ParallelSeqScan patch.
> > >
> > I expect we can utilize the existing infrastructure here. It just
> > picks up the records that come from the underlying workers, then
> > raises them to the upper node.
> >
>
> Sure, but still you need some work at least in the area of making
> workers understand different node types; I am guessing you need to
> modify readfuncs.c to support the new plan node, if any, for this
> work.
>
Yes, it was not a creative work. :-)
https://github.com/kaigai/sepgsql/blob/fappend/src/backend/nodes/readfuncs.c#L1479

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>