Home > mailing lists

Re: fixing consider_parallel for upper planner rels - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: fixing consider_parallel for upper planner rels
Date	June 30, 2016 22:33:41
Msg-id	CA+Tgmoavc_WGZR3srg8STAk2oNGP4EGmC9503a0PX7_AK9+cAw@mail.gmail.com Whole thread Raw
In response to	Re: fixing consider_parallel for upper planner rels (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: fixing consider_parallel for upper planner rels Re: fixing consider_parallel for upper planner rels
List	pgsql-hackers

Tree view

On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That's pretty much right, but there are two conceptually separate
>> things.  The first is whether or not we actually attempt to spawn
>> workers, and the second is whether or not we enter parallel mode.
>> Entering parallel mode enables various restrictions that make spawning
>> workers safe, so we cannot spawn workers unless we enter parallel
>> mode.  We can, however, enter parallel mode without spawning any
>> workers, and force_parallel_mode=on does so.  The point of that is
>> that even if the plan doesn't end up being run inside of a worker, it
>> will be run with most of the same restrictions on what it can do that
>> would be in place if a truly parallel plan were chosen.
>
> And the point of that is what, exactly?  If the only change is that
> "some restrictions get enforced", I am not clear on why we need such
> a test mode in cases where the planner is afraid to put a top Gather on
> the plan.  In particular, given the coding as you now have it, it seems
> like the only case where there's any difference is where we set
> glob->parallelModeOK but nonetheless end up with a not-parallel-safe
> topmost path (that doesn't have a Gather within it).  It's not clear
> to me why having the executor switch into parallel mode makes sense at
> all with such a plan.

Suppose you create a PL/pgsql function that does an UPDATE and mark it
PARALLEL RESTRICTED.  You wonder whether you've marked it correctly.
You can set force_parallel_mode=on and SELECT myfunc().  The
subsequent ERROR tells you that you've mismarked it.

>>> * I'm still not real happy with the has_parallel_hazard test added to
>>> apply_projection_to_path, and here is why not: if the target path is
>>> not projection-capable, we'll never get to that test.  We just pass
>>> the problem off to create_projection_path, which looks no further
>>> than rel->consider_parallel.  It seems like we have to add a
>>> has_parallel_hazard test there as well, which is a tad annoying,
>>> not least because all the direct callers of create_projection_path
>>> seem to have made the test already.
>
>> Thanks for noticing that problem; I've fixed it by inserting a test
>> into create_projection_path.  As far as the annoyance factor is
>> concerned, you obviously know that there was a flag there to reduce
>> the cost of that, but you insisted on removing it.  Maybe you should
>> consider putting it back.
>
> No, that flag was on apply_projection_to_path, and I didn't care for it
> because most of the callers didn't appear to want to go to the trouble of
> setting it correctly.  Adding such an argument to create_projection_path
> as well doesn't seem to make that better.

I'm open to suggestions.  As I've noted a few times already, though
maybe less clearly than I should have done, I think it's quite likely
to be a good idea to try to avoid the overhead of running
has_parallel_hazard repeatedly on the same tlists, or for that matter,
running it on tlists at all.  I don't have any evidence that's
expensive enough to cost, just a hunch.  Exactly how to do that best,
I'm not sure.

>> Well, the point of consider_parallel is that there are some things
>> that are specific to the individual path, and there are some that are
>> properties of the RelOptInfo.  It seems highly desirable to check
>> things that fall into the latter category exactly once, rather than
>> once per path.  You seem to be fighting hard against that idea, and
>> I'm pretty baffled as to why.
>
> What I'm not happy about is that as you've got things constituted,
> the GetForeignUpperPaths hook is broken so far as injecting parallel paths
> is concerned, because the consider_parallel flags for the upperrels
> aren't set yet when it gets called.  If we keep consider_parallel with
> its present usage, we're going to have to refactor things to fix that.

I see.  It seems to me, and I may be failing to understand something,
that the placement of create_upper_paths_hook is substantially better
than the placement of GetForeignUpperPaths.  If the latter were moved
to where the former now is, then consider_parallel would be set
sufficiently early and everything would be fine.  We could
alternatively fish out the code to set consider_parallel for the upper
rels and do all of that work before calling that hook.  That's a bit
hairy, because we'd basically go set all of the consider_parallel
flags, then call that hook, then circle back around for the core path
generation, but I don't see any intrinsic obstacle to that line of
attack.  I'm not very sure that one call for all upper rels is going
to be convenient, though.

>> Updated patch attached.   (Official status update: I'll commit this,
>> possibly over the long weekend or in any event no later than Tuesday,
>> unless there are further review comments before then, in which case I
>> will post a further official status update no later than Tuesday.)
>
> Don't have time to re-read this right now, but maybe tomorrow or
> Saturday.

OK, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-hackers by date:

From: Robert Haas
Date: 30 June 2016, 22:17:27
Subject: Re: fixing consider_parallel for upper planner rels

From: Robert Haas
Date: 30 June 2016, 22:46:50
Subject: Re: parallel workers and client encoding

Re: fixing consider_parallel for upper planner rels - Mailing list pgsql-hackers

Previous

Next