Home > mailing lists
Re: Parallel Seq Scan - Mailing list pgsql-hackers

From	Kouhei Kaigai
Subject	Re: Parallel Seq Scan
Date	October 1, 2015 06:36:22
Msg-id	9A28C8860F777E439AA12E8AEA7694F80114D55B@BPXM15GP.gisp.nec.co.jp Whole thread Raw
In response to	Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Parallel Seq Scan
List	pgsql-hackers
Tree view
Hi Robert,

Gather node was oversight by readfunc.c, even though it shall not be
serialized actually.
Also, it used incompatible WRITE_xxx_FIELD() macro on outfuncs.c.

The attached patch fixes both of incomsistence.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
> Sent: Wednesday, September 30, 2015 2:19 AM
> To: Amit Kapila
> Cc: Kaigai Kouhei(海外 浩平); Haribabu Kommi; Gavin Flower; Jeff Davis; Andres
> Freund; Amit Langote; Amit Langote; Fabrízio Mello; Thom Brown; Stephen Frost;
> pgsql-hackers
> Subject: Re: [HACKERS] Parallel Seq Scan
> 
> On Tue, Sep 29, 2015 at 12:39 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Attached patch is a rebased patch based on latest commit (d1b7c1ff)
> > for Gather node.
> >
> > - I have to reorganize the defines in execParallel.h and .c. To keep
> > ParallelExecutorInfo, in GatherState node, we need to include execParallel.h
> > in execnodes.h which was creating compilation issues as execParallel.h
> > also includes execnodes.h, so for now I have defined ParallelExecutorInfo
> > in execnodes.h and instrumentation related structures in instrument.h.
> > - Renamed parallel_seqscan_degree to degree_of_parallelism
> > - Rename Funnel to Gather
> > - Removed PARAM_EXEC parameter handling code, I think we can do this
> > separately.
> >
> > I have to work more on partial seq scan patch for rebasing it and handling
> > review comments for the same, so for now I am sending the first part of
> > patch (which included Gather node functionality and some general support
> > for parallel-query execution).
> 
> Thanks for the fast rebase.
> 
> This patch needs a bunch of cleanup:
> 
> - The formatting for the GatherState node's comment block is unlike
> that of surrounding comment blocks.  It lacks the ------- dividers,
> and the indentation is not the same.  Also, it refers to
> ParallelExecutorInfo by the type name, but the other members by
> structure member name.  The convention is to refer to them by
> structure member name, so please do that.
> 
> - The naming of fs_workersReady is inconsistent with the other
> structure members.  The other members use all lower-case names,
> separating words with dashes, but this one uses a capital letter.  The
> other members also don't prefix the names with anything, but this uses
> a "fs_" prefix which I assume is left over from when this was called
> FunnelState.  Finally, this doesn't actually tell you when workers are
> ready, just whether they were launched.  I suggest we rename this to
> "any_worker_launched".
> 
> - Instead of moving the declaration of ParallelExecutorInfo, please
> just refer to it as "struct ParallelExecutorInfo" in execnodes.h.
> That way, you're not sucking these includes into all kinds of places
> they don't really need to be.
> 
> - Let's not create a new PARALLEL_QUERY category of GUC.  Instead,
> let's the GUC for the number of workers with under resource usage ->
> asynchronous behavior.
> 
> - I don't believe that shm_toc *toc has any business being part of a
> generic PlanState node.  At most, it should be part of an individual
> type of PlanState, like a GatherState or PartialSeqScanState.  But
> really, I don't see why we need it there at all.  It should, I think,
> only be needed during startup to dig out the information we need.  So
> we should just dig that stuff out and keep pointers to whatever we
> actually need - in this case the ParallelExecutorInfo, I think - in
> the particular type of PlanState node that's at issue - here
> GatherState.  After that we don't need a pointer to the toc any more.
> 
> - I'd like to do some renaming of the new GUCs.  I suggest we rename
> cpu_tuple_comm_cost to parallel_tuple_cost and degree_of_parallelism
> to max_parallel_degree.
> 
> - I think that a Gather node should inherit from Plan, not Scan.  A
> Gather node really shouldn't have a scanrelid.  Now, admittedly, if
> the only thing under the Gather is a Partial Seq Scan, it wouldn't be
> totally bonkers to think of the Gather as scanning the same relation
> that the Partial Seq Scan is scanning.  But in any more complex case,
> like where it's scanning a join, you're out of luck.  You'll have to
> set scanrelid == 0, I suppose, but then, for example, ExecScanReScan
> is not going to work.  In fact, as far as I can see, the only way
> nodeGather.c is actually using any of the generic scan stuff is by
> calling ExecInitScanTupleSlot, which is all of one line of code.
> ExecEndGather fetches node->ss.ss_currentRelation but then does
> nothing with it.  So I think this is just a holdover from early
> version of this patch where what's now Gather and PartialSeqScan were
> a single node, and I think we should rip it out.
> 
> - On a related note, the assertions in cost_gather() are both bogus
> and should be removed. Similarly with create_gather_plan().  As
> previously mentioned, the Gather node should not care what sort of
> thing is under it; I am not interested in restricting it to baserels
> and then undoing that later.
> 
> - For the same reason, cost_gather() should refer to it's third
> argument as "rel" not "baserel".
> 
> - Also, I think this stuff about physical tlists in
> create_gather_plan() is bogus.  use_physical_tlist is ignorant of the
> possibility that the RelOptInfo passed to it might be anything other
> than a baserel, and I think it won't be happy if it gets a joinrel.
> Moreover, I think our plan here is that, at least for now, the
> Gather's tlist will always match the tlist of its child.  If that's
> so, there's no point to this: it will end up with the same tlist
> either way.  If any projection is needed, it should be done by the
> Gather node's child, not the Gather node itself.
> 
> - Let's rename DestroyParallelSetupAndAccumStats to
> ExecShutdownGather.  Instead of encasing the entire function in if
> statement, let's start with if (node->pei == NULL || node->pei->pcxt
> == NULL) return.
> 
> - ExecParallelBufferUsageAccum should be declared to take an argument
> of type PlanState, not Node.  Then you don't have to cast what you are
> passing to it, and it doesn't have to cast before calling itself. And,
> let's also rename it to ExecShutdownNode and move it to
> execProcnode.c.  Having a "shutdown phase" that stops a node from
> asynchronously consuming additional resources could be useful for
> non-parallel node types - especially ForeignScan and CustomScan.  And
> we could eventually extend this to be called in other situations, like
> when a Limit is filled give everything beneath it a chance to ease up.
> We don't have to do those bits of work right now but it seems well
> worth making this look like a generic facility.
> 
> - Calling DestroyParallelSetupAndAccumStats from ExplainNode when we
> actually reach the Gather node is much too late.  We should really be
> shutting down parallel workers at the end of the ExecutorRun phase, or
> certainly no later than ExecutorFinish.  In fact, you have
> standard_ExecutorRun calling ExecParallelBufferUsageAccum() but only
> if queryDesc->totaltime is set.  What I think you should do instead is
> call ExecShutdownNode a few lines earlier, before shutting down the
> tuple receiver, and do so unconditionally.  That way, the workers are
> always shut down in the ExecutorRun phase, which should eliminate the
> need for this bit in explain.c.
> 
> - The changes to postmaster.c and postgres.c consist of only
> additional #includes.  Those can, presumably, be reverted.
> 
> Other than that, hah hah, it looks pretty cool.
> 
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
Attachment

pgsql-gather-on-readfunc.v1.patch
pgsql-hackers by date:
From: Josh Berkus
Date: 01 October 2015, 04:02:31
Subject: Re: Idea for improving buildfarm robustness
From: Kyotaro HORIGUCHI
Date: 01 October 2015, 06:38:22
Subject: Re: Foreign join pushdown vs EvalPlanQual
Re: Parallel Seq Scan - Mailing list pgsql-hackers

Attachment

Previous

Next