Re: Asynchronous execution on FDW - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: Asynchronous execution on FDW |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F80111BCEC@BPXM15GP.gisp.nec.co.jp |
In response to | Re: Asynchronous execution on FDW (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses | Re: Asynchronous execution on FDW |
List | pgsql-hackers |
> > If we have ParallelAppend node that kicks a background worker process for
> > each underlying child node in parallel, does ForeignScan need to do something
> > special?
>
> Although I don't see the point of the background worker in your
> story, at least for ParallelMergeAppend, the scan would frequently
> be discontinued by an upper Limit, so one more state, say setup
> - which means a worker is allocated but not started - would be
> useful, and the driver node might need to manage the number of
> async executions. Or the driven nodes might do so inversely.
>
I was expecting workloads like a single-shot scan on a large partitioned
fact table in a DWH system. Yes, if the workload is expected to rescan
frequently, the estimated cost will be higher than that of the existing
Append (by the cost of launching the bgworker), so the planner will
discard this path.

Regarding the interaction between Limit and ParallelMergeAppend, isn't
that probably the best-case scenario? If Limit picks the smallest 1000 rows
from a partitioned table consisting of 20 child tables, ParallelMergeAppend
can launch 20 parallel jobs, each of which picks the smallest 1000 rows from
its own child relation. This is probably the same job that pass_down_bound()
in nodeLimit.c does.

> As for ForeignScan, it is merely an API for FDW and does nothing
> substantial so it would have nothing special to do. As for
> postgres_fdw, current patch restricts one execution per one
> foreign server at once by itself. We would have to provide
> another execution management if we want to have two or more
> simultaneous scans per one foreign server at once.
>
Yes, your 4th patch defines a new callback in FdwRoutine and the 5th patch
implements the postgres_fdw-specific portion. It should work well in a
distributed / sharded database environment; however, its benefit is limited
to ForeignScan. If a management node kicks off the underlying SeqScan,
ForeignScan, or other nodes in parallel, local heap scans can also run
asynchronously.

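The Limit and ParallelMergeAppend interaction discussed above can be illustrated with a small sketch. This is not PostgreSQL executor code; it is a minimal Python model, assuming each child relation is just an in-memory list of sortable keys. The names `child_top_n` and `parallel_merge_limit` are hypothetical, and the thread pool merely stands in for background workers.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def child_top_n(rows, limit):
    """Per-child job: return the `limit` smallest rows of one child, sorted.

    This models a bounded sort: the child never needs to produce more
    than `limit` rows, because the upper Limit would discard the rest.
    """
    return heapq.nsmallest(limit, rows)


def parallel_merge_limit(children, limit):
    """Run one bounded job per child, merge the sorted results, apply Limit."""
    # One worker per child relation (at least one, so an empty input works).
    with ThreadPoolExecutor(max_workers=max(1, len(children))) as pool:
        partials = list(pool.map(lambda rows: child_top_n(rows, limit), children))
    # Each partial result is already sorted, so a k-way merge is enough.
    merged = heapq.merge(*partials)
    # The outer Limit stops pulling rows after `limit` of them.
    return list(islice(merged, limit))
```

For example, with 20 children and `limit=1000`, each job returns at most 1000 rows, so the merge step sees at most 20,000 rows no matter how large the child relations are.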
> Sorry for the focusless discussion but does this answer some of
> your question?
>
Hmm... Its advantage is still unclear to me. However, it is not fair to
hijack this thread with my idea. I'll submit my design proposal for
ParallelAppend for the next commitfest. Please comment on it.

> > Expected waste of CPU or I/O is common problem to be solved, however, it does
> > not need to add a special case handling to ForeignScan, I think.
> > How about your opinion?
>
> I agree with you that ForeignScan as the wrapper for FDWs doesn't
> need anything special for the case. I suppose for now that
> avoiding the penalty from abandoning too many speculatively
> executed scans (or other work on a bg worker, like sorts) would be
> the business of the upper node of FDWs, or somewhere else.
>
> However, I haven't dismissed the possibility that some common
> work related to resource management could be integrated into the
> executor (or even into the planner), but I see none for now.
>
I also agree that it is "eventually" needed, but it may not be supported
in the first version.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>