Re: Parallel Seq Scan - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Parallel Seq Scan |
Date | |
Msg-id | CA+TgmobPnm9UXTLEsrR1b65-U0a2pOPR8aakP7G5VpeJ8u8YPA@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Parallel Seq Scan
Re: Parallel Seq Scan |
List | pgsql-hackers |
On Tue, Mar 3, 2015 at 7:47 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > I have modified the patch to introduce a Funnel node (and left child > as PartialSeqScan node). Apart from that, some other noticeable > changes based on feedback include: > a) Master backend forms and send the planned stmt to each worker, > earlier patch use to send individual elements and form the planned > stmt in each worker. > b) Passed tuples directly via tuple queue instead of going via > FE-BE protocol. > c) Removed restriction of expressions in target list. > d) Introduced a parallelmodeneeded flag in plannerglobal structure > and set it for Funnel plan. > > There is still some work left like integrating with > access-parallel-safety patch (use parallelmodeok flag to decide > whether parallel path can be generated, Enter/Exit parallel mode is still > done during execution of funnel node). > > I think these are minor points which can be fixed once we decide > on the other major parts of patch. Find modified patch attached with > this mail. This is definitely progress. I do think you need to integrate it with the access-parallel-safety patch. Other comments: - There's not much code left in shmmqam.c. I think that the remaining logic should be integrated directly into nodeFunnel.c, with the two bools in worker_result_state becoming part of the FunnelState. It doesn't make sense to have a separate structure for two booleans and 20 lines of code. If you were going to keep this file around, I'd complain about its name and its location in the source tree, too, but as it is I think we can just get rid of it altogether. - Something is deeply wrong with the separation of concerns between nodeFunnel.c and nodePartialSeqscan.c. nodeFunnel.c should work correctly with *any arbitrary plan tree* as its left child, and that is clearly not the case right now. shm_getnext() can't just do heap_getnext(). Instead, it's got to call ExecProcNode() on its left child and let the left child decide what to do about that. The logic in InitFunnelRelation() belongs in the parallel seq scan node, not the funnel. ExecReScanFunnel() cannot be calling heap_parallel_rescan(); it needs to *not know* that there is a parallel scan under it. The comment in FunnelRecheck is a copy-and-paste from elsewhere that is not applicable to a generic funnel mode. - The comment in execAmi.c refers to says "Backward scan is not suppotted for parallel sequiantel scan". "Sequential" is mis-spelled here, but I think you should just nuke the whole comment. The funnel node is not, in the long run, just for parallel sequential scan, so putting that comment above it is not right. If you want to keep the comment, it's got to be more general than that somehow, like "parallel nodes do not support backward scans", but I'd just drop it. - Can we rename create_worker_scan_plannedstmt to create_parallel_worker_plannedstmt? - I *strongly* suggest that, for the first version of this, we remove all of the tts_fromheap stuff. Let's make no special provision for returning a tuple stored in a tuple queue; instead, just copy it and store it in the slot as a pfree-able tuple. That may be slightly less efficient, but I think it's totally worth it to avoid the complexity of tinkering with the slot mechanism. - InstrAggNode claims that we only need the master's information for statistics other than buffer usage and tuple counts, but is that really true? The parallel backends can be working on the parallel part of the plan while the master is doing something else, so the amount of time the *master* spent in a particular node may not be that relevant. We might need to think carefully about what it makes sense to display in the EXPLAIN output in parallel cases. - The header comment on nodeFunnel.h capitalizes the filename incorrectly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: