Re: Clamping reulst row number of joins. - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Clamping reulst row number of joins. |
Date | |
Msg-id | 7015.1425678120@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Clamping reulst row number of joins. (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Clamping reulst row number of joins.
|
List | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> writes: > Stephen Frost <sfrost@snowman.net> writes: >> * Tom Lane (tgl@sss.pgh.pa.us) wrote: >>> BTW, is that JOIN (VALUES(...)) thing common in applications, or did you >>> just use it to make a compact example? If it were something worth >>> optimizing, it seems like we could teach the planner to "pull up VALUES" >>> in the same way that it flattens sub-selects. I'm not sure if this is >>> worth the trouble or not, though. >> I've certainly seen and used values() constructs in joins for a variety >> of reasons and I do think it'd be worthwhile for the planner to know how >> to pull up a VALUES construct. >> Would that be a lot of effort, either code-wise or runtime-wise? My gut >> feeling is that it wouldn't be, but you're clearly in a much better >> position to determine that. > My guess is that it'd be pretty simple to do if we want to do it. > I've not looked at the code yet though. I spent a bit of time looking at this, and realized that the blocker is the same as the reason why we don't pull up sub-selects with empty rangetables ("SELECT expression(s)"). Namely, that the upper query's jointree can't handle a null subtree. (This is not only a matter of the jointree data structure, but the fact that the whole planner identifies relations by bitmapsets of RTE indexes, and subtrees with empty RTE sets couldn't be told apart.) We could probably fix both cases for order-of-a-hundred lines of new code in prepjointree. The plan I'm thinking about is to allow such vacuous subquery jointrees to be pulled up, but only if they are in a place in the upper query's jointree where it's okay to delete the subtree. This would basically be two cases: (1) the immediate parent is a FromExpr that would have at least one remaining child, or (2) the immediate parent is an INNER JOIN whose other child isn't also being deleted (so that we can convert the JoinExpr to a nonempty FromExpr, or just use the other child as-is if the JoinExpr has no quals). More generally, it occurs to me that maybe Oracle wasn't being totally silly when they invented DUAL. If we had a jointree representation for "dummy relation with exactly one row" then we could substitute that in all vacuous-jointree cases. However, it's not clear that there's any functional advantage to replacing a VALUES Scan with a "DUAL Scan", which is basically what would happen if we flattened a VALUES and then had to put one of these things in instead. And having such things propagate all through the planner, executor, EXPLAIN, etc is way more code churn than I want to contemplate right now. The plan proposed in the preceding para is a bit more ugly logically, but it would limit the code effects to basically pull_up_subqueries() and its child routines. regards, tom lane
pgsql-hackers by date: