Home > mailing lists

Re: upper planner path-ification - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: upper planner path-ification
Date	May 17, 2015 16:11:52
Msg-id	30896.1431879106@sss.pgh.pa.us Whole thread Raw
In response to	Re: upper planner path-ification (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: upper planner path-ification Re: upper planner path-ification
List	pgsql-hackers

Tree view

Robert Haas <robertmhaas@gmail.com> writes:
> So, getting back to this part, what's the value of returning a list of
> Paths rather than a list of Plans?

(1) less work, since we don't have to fill in details not needed for   costing purposes;
(2) paths carry info that the planner wants but the executor doesn't,   notably sort-order annotations.

> target lists are normally computed when paths are converted to plans,
> but for the higher-level plan nodes adding by grouping_planner, the
> path list is typically just passed in.  So would the new path types be
> expected to carry target lists of their own, or would they need to
> figure out the target list on the fly at plan generation time?

Yeah, that is something I've been struggling with while thinking about
this.  I don't really want to add tlists as such to Paths, but it's
not very clear how else to annotate a Path as to what it computes,
and that seems like an annotation we have to have in some form in order
to convert these planning steps into a Path universe.

There are other cases where it would be useful to have some notion of
this kind.  An example is that right now, if you have an expression index
on an expensive function and a query that wants the value of that function,
the planner is able to extract the value from the index --- but there is
nothing that gives any cost benefit to doing so, so it's just as likely
to choose some other index and eat the cost of recalculating the function.
It seems like the only way to fix that in a principled fashion is to have
some concept that the index-scan Path can produce the function value,
and then when we come to some appropriate costing step, penalize the other
paths for having to compute the value that's available for free from this
one.

Rather than adding tlists per se to Paths, I've been vaguely toying with
a notion of identifying all the "interesting" subexpressions in a query
(expensive functions, aggregates, etc), giving them indexes 1..n, and then
marking Paths with bitmapsets showing which interesting subexpressions
they can produce values for.  This would make things like "does this Path
compute all the needed aggregates" much cheaper to deal with than a raw
tlist representation would do.  But maybe that's still not the best way.

Another point is that a Path that computes aggregates is fundamentally
different from a Path that doesn't, because it doesn't even produce the
same number of rows.  So I'm not at all sure how to visualize the idea
of a Path that computes only some aggregates, or whether it's even a
sensible thing to worry about supporting.

> One thing that seems like it might complicate things here is that a
> lot of planner functions take PlannerInfo *root as an argument.  But
> if we generate only paths in grouping_planner() and path-ify them
> later, the subquery's root will not be available when we're trying to
> do the Path -> Plan transformation.

Ah, you're wrong there, because we hang onto the subquery's root already
(I forget why exactly, but see PlannerGlobal.subroots for SubPlans, and
RelOptInfo.subroot for subquery-in-FROM).  So it would not be a
fundamental problem to postpone create_plan() for a subquery.

> I think grouping_planner() is badly in need of some refactoring just
> to make it shorter.  It's over 1000 lines of code, which IMHO is a
> fairly ridiculous length for a single function.

Amen to that.  But as I said to Andrew, I think this will be a side-effect
of path-ification in this area, and is probably not something to set out
to do first.
        regards, tom lane

pgsql-hackers by date:

From: Tom Lane
Date: 17 May 2015, 15:51:15
Subject: Re: upper planner path-ification

From: "Greg Sabino Mullane"
Date: 17 May 2015, 16:15:30
Subject: Re: Problems with question marks in operators (JDBC, ECPG, ...)

Re: upper planner path-ification - Mailing list pgsql-hackers

Previous

Next