Re: fixing consider_parallel for upper planner rels - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: fixing consider_parallel for upper planner rels |
Date | |
Msg-id | CA+Tgmoavc_WGZR3srg8STAk2oNGP4EGmC9503a0PX7_AK9+cAw@mail.gmail.com Whole thread Raw |
In response to | Re: fixing consider_parallel for upper planner rels (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: fixing consider_parallel for upper planner rels
Re: fixing consider_parallel for upper planner rels |
List | pgsql-hackers |
On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That's pretty much right, but there are two conceptually separate >> things. The first is whether or not we actually attempt to spawn >> workers, and the second is whether or not we enter parallel mode. >> Entering parallel mode enables various restrictions that make spawning >> workers safe, so we cannot spawn workers unless we enter parallel >> mode. We can, however, enter parallel mode without spawning any >> workers, and force_parallel_mode=on does so. The point of that is >> that even if the plan doesn't end up being run inside of a worker, it >> will be run with most of the same restrictions on what it can do that >> would be in place if a truly parallel plan were chosen. > > And the point of that is what, exactly? If the only change is that > "some restrictions get enforced", I am not clear on why we need such > a test mode in cases where the planner is afraid to put a top Gather on > the plan. In particular, given the coding as you now have it, it seems > like the only case where there's any difference is where we set > glob->parallelModeOK but nonetheless end up with a not-parallel-safe > topmost path (that doesn't have a Gather within it). It's not clear > to me why having the executor switch into parallel mode makes sense at > all with such a plan. Suppose you create a PL/pgsql function that does an UPDATE and mark it PARALLEL RESTRICTED. You wonder whether you've marked it correctly. You can set force_parallel_mode=on and SELECT myfunc(). The subsequent ERROR tells you that you've mismarked it. >>> * I'm still not real happy with the has_parallel_hazard test added to >>> apply_projection_to_path, and here is why not: if the target path is >>> not projection-capable, we'll never get to that test. We just pass >>> the problem off to create_projection_path, which looks no further >>> than rel->consider_parallel. It seems like we have to add a >>> has_parallel_hazard test there as well, which is a tad annoying, >>> not least because all the direct callers of create_projection_path >>> seem to have made the test already. > >> Thanks for noticing that problem; I've fixed it by inserting a test >> into create_projection_path. As far as the annoyance factor is >> concerned, you obviously know that there was a flag there to reduce >> the cost of that, but you insisted on removing it. Maybe you should >> consider putting it back. > > No, that flag was on apply_projection_to_path, and I didn't care for it > because most of the callers didn't appear to want to go to the trouble of > setting it correctly. Adding such an argument to create_projection_path > as well doesn't seem to make that better. I'm open to suggestions. As I've noted a few times already, though maybe less clearly than I should have done, I think it's quite likely to be a good idea to try to avoid the overhead of running has_parallel_hazard repeatedly on the same tlists, or for that matter, running it on tlists at all. I don't have any evidence that's expensive enough to cost, just a hunch. Exactly how to do that best, I'm not sure. >> Well, the point of consider_parallel is that there are some things >> that are specific to the individual path, and there are some that are >> properties of the RelOptInfo. It seems highly desirable to check >> things that fall into the latter category exactly once, rather than >> once per path. You seem to be fighting hard against that idea, and >> I'm pretty baffled as to why. > > What I'm not happy about is that as you've got things constituted, > the GetForeignUpperPaths hook is broken so far as injecting parallel paths > is concerned, because the consider_parallel flags for the upperrels > aren't set yet when it gets called. If we keep consider_parallel with > its present usage, we're going to have to refactor things to fix that. I see. It seems to me, and I may be failing to understand something, that the placement of create_upper_paths_hook is substantially better than the placement of GetForeignUpperPaths. If the latter were moved to where the former now is, then consider_parallel would be set sufficiently early and everything would be fine. We could alternatively fish out the code to set consider_parallel for the upper rels and do all of that work before calling that hook. That's a bit hairy, because we'd basically go set all of the consider_parallel flags, then call that hook, then circle back around for the core path generation, but I don't see any intrinsic obstacle to that line of attack. I'm not very sure that one call for all upper rels is going to be convenient, though. >> Updated patch attached. (Official status update: I'll commit this, >> possibly over the long weekend or in any event no later than Tuesday, >> unless there are further review comments before then, in which case I >> will post a further official status update no later than Tuesday.) > > Don't have time to re-read this right now, but maybe tomorrow or > Saturday. OK, thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: