Passing more context info to selectivity-estimation code - Mailing list pgsql-hackers
From: Tom Lane
Subject: Passing more context info to selectivity-estimation code
Msg-id: 20156.1179623477@sss.pgh.pa.us
Responses: Re: Passing more context info to selectivity-estimation code
List: pgsql-hackers
I've been thinking about what it will take to solve the problem noted here:
http://archives.postgresql.org/pgsql-performance/2007-05/msg00325.php
which briefly is that 8.2 is really bad at estimating the number of rows returned by locutions like

	SELECT ... FROM tab1 LEFT JOIN tab2 ON (tab1.x = tab2.y)
	WHERE tab2.y IS NULL;

the point being to extract just those rows of tab1 that have no join partner in tab2.  We frequently recommend this idiom as a better-performing substitute for NOT IN.  However, the selectivity of the IS NULL clause is estimated without thought for the effects of the outer join; thus for example if there are no nulls in the actual tab2.y column, the estimate will be that there are no rows in the join result.  (Which gets clamped to the minimum estimate of one row, but that's not much help.)  If this join is then joined to something else, 8.2 is likely to think it should put the "one-row-out" join on the outside of a nestloop join, yielding horrible performance if there are actually many rows out.

Although pre-8.2 releases also failed to consider the effects of outer joins' null insertion, they misestimated this case as returning at least as many result rows as there were in the lefthand input relation (because they didn't distinguish WHERE and JOIN/ON conditions in this context).  That's perhaps less likely to give rise to a fatally bad plan.

AFAICS the only principled fix for this is to give selectivity estimator functions more knowledge about the context their argument clause appears in.  For instance if we restructure the above as

	SELECT ... FROM tab1 LEFT JOIN tab2
	ON (tab1.x = tab2.y AND tab2.y IS NULL);

the IS NULL estimator should certainly *not* count the effects of the outer join, even though it's looking at exactly the same clause.  So there has to be some context passed in to show which outer joins we are "above" and which not.  There already is some context passed to the estimators, in the form of "varRelid" and join-type parameters.
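To make the two cases concrete, here is a small sketch using Python's stdlib sqlite3 module (the toy tables and data are mine; this only illustrates what each query returns, not anything about PostgreSQL's estimator):

```python
import sqlite3

# Hypothetical toy data: tab1.x = 1..4, tab2.y = 1..2, so tab1 rows
# 3 and 4 have no join partner in tab2.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE tab1(x INTEGER);
    CREATE TABLE tab2(y INTEGER);
    INSERT INTO tab1(x) VALUES (1), (2), (3), (4);
    INSERT INTO tab2(y) VALUES (1), (2);
""")

# The recommended anti-join idiom: keep only the tab1 rows whose
# tab2 side was filled in with an injected NULL by the left join.
anti_join = cur.execute("""
    SELECT tab1.x
    FROM tab1 LEFT JOIN tab2 ON (tab1.x = tab2.y)
    WHERE tab2.y IS NULL
    ORDER BY tab1.x
""").fetchall()
print(anti_join)  # [(3,), (4,)]

# The same clause moved into JOIN/ON: now it only constrains which
# tab2 rows can match, so nothing matches and every tab1 row comes
# out NULL-extended.
on_clause = cur.execute("""
    SELECT tab1.x
    FROM tab1 LEFT JOIN tab2 ON (tab1.x = tab2.y AND tab2.y IS NULL)
    ORDER BY tab1.x
""").fetchall()
print(on_clause)  # [(1,), (2,), (3,), (4,)]
```

The identical IS NULL test thus means two different things depending on whether it sits above or below the outer join, which is exactly the context the estimator currently lacks.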
In the long term I am inclined to replace these rather ad-hoc parameters with something along the lines of "ClauseContext *context" to carry info about the context in which the estimate needs to be made.  This notation would let us add or redefine fields in the ClauseContext struct without having to touch every estimator function again.  But it's a bit late to be considering that for 8.3, and it's certainly a nonstarter to think of retrofitting it into 8.2.

The only way I can see to fix the problem in 8.2 is to store clause context information within the PlannerInfo data structures.  This is pretty grotty, since PlannerInfo is supposed to be global state information for a planner run; but I can't see any need for selectivity estimation code to be re-entrant with respect to a particular planner invocation, so it should work without any problems.

The specific hack I'm thinking of is to extend the OuterJoinInfo structures with some fields that would indicate whether the currently considered clause is "above" or "below" each outer join, and further show the estimated percentage of injected nulls for each one we're "above".  A traversal of this list would then provide enough knowledge for nulltestsel() or other routines to derive the right answer.  Callers of clause_selectivity or clauselist_selectivity would need to ensure that the correct state was set up beforehand.

For the moment I'm inclined to teach only nulltestsel() how to use the info.  Probably in the long run we'd want all the selectivity estimators to incorporate this refinement, but I'm not sure how good the estimates of null-injection will be; seems prudent not to rely on them everywhere until we get more field experience with it.  This is a pretty large and ugly patch to be thinking of back-patching :-(.
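As a back-of-the-envelope sketch of the adjustment nulltestsel() could make once it knows it is "above" an outer join and has an estimate of the injected-null percentage (the helper name and formula below are mine, not PostgreSQL code): unmatched lefthand rows all contribute an injected NULL, while the matched remainder retains the column's own null fraction.

```python
def is_null_selectivity(base_nullfrac: float, unmatched_frac: float) -> float:
    """Hypothetical fraction of join-result rows for which tab2.y IS NULL,
    as seen above a left join: the unmatched fraction of lefthand rows is
    NULL-extended, and the matched remainder keeps the column's own null
    fraction.  Illustrative only -- not PostgreSQL's nulltestsel() logic."""
    return unmatched_frac + (1.0 - unmatched_frac) * base_nullfrac

# The 8.2 failure mode: tab2.y contains no NULLs at all, so a
# context-blind estimate is 0 rows -- even when half of tab1 has no
# join partner and the true answer is 50% of the result.
naive = is_null_selectivity(0.0, 0.0)  # estimator ignorant of the outer join
aware = is_null_selectivity(0.0, 0.5)  # estimator told 50% of rows are null-extended
print(naive, aware)  # 0.0 0.5
```

The point of carrying the injected-null percentage in OuterJoinInfo is precisely to supply the second argument here, which the clause itself cannot reveal.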
A quick-hack approach would be to revert this patch:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
which would cause 8.2 to uniformly overestimate rather than underestimate the size of the result.  That pretty well sucks too, as seen here:
http://archives.postgresql.org/pgsql-general/2006-11/msg00472.php
but maybe it's less bad than an underestimate.

Comments, better ideas?

			regards, tom lane