Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not? - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not? |
Date | |
Msg-id | CA+TgmoaC9zGwxhoi6o0gAGXXBrkZxf+GRokNtaN9AmjjhD8j1g@mail.gmail.com |
In response to | Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not? (Andy Fan <zhihui.fan1213@gmail.com>) |
Responses | Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?; Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not? |
List | pgsql-hackers |
On Tue, Feb 1, 2022 at 10:08 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> To address the row estimation issue, The most straightforward way to fix this is to
> ignore the derived clauses when figuring out the RelOptInfo->rows on base relation.
> To note which clause is derived from this patch, I added a new field "EquivalenceClass *
> derived" in RestrictInfo. and then added a included_derived option in clauselist_selectivity_ext,
> during the set_xxx_rel_size function, we can pass the included_derived=false. This strategy
> should be used in get_parameterized_baserel_size. In all the other cases, include_derived=true
> is used. which are finished in commit 2. (Commit 1 is Daivd's patch, I just rebased it)

That doesn't sound correct to me. Suppose that we have A.x = B.x and also A.x < 42. We can choose to enforce A.x < 42, or we can choose to enforce B.x < 42, or we can do both. In general, any of those could be right: if either one of those two is highly selective while the other is not very selective at all, it's going to be fastest to enforce only the more selective qual. But if both are selective then it may be best to enforce both, so let's suppose we do that.

If we don't adopt the proposal above and just do nothing, then our row count estimates for both A and B will include the effect of checking x < 42, and so they will be correct, but the row count estimate for join(A, B) will include the effect of checking x < 42 twice, and so it will be too low, which can mess up the plan at higher levels. But discounting the effect of B.x < 42 when estimating the size of B is also incorrect. Now the row count estimate for join(A, B) will include the effect of x < 42 only once, which is good. However, the row count estimate for B will be too high, because it will not include the effect of B.x < 42. And that means that the cost estimate for join(A, B) will be wrong: it will be too high, because it's going to think that it has more rows coming from the B side of the join than is actually the case. And that can also mess up the plan at higher levels.

I think we could get around this problem by having multiple RelOptInfos (or something similar that is lighter-weight) for each relation. Today, we'd create a RelOptInfo for A, one for B, and one for join(A, B), and the paths for the join are created by joining a path for A to a path for B. Now imagine that we instead have 5 RelOptInfos, for {A}, {A|x<42}, {B}, {B|x<42}, and join(A, B). The legal paths for that last one can be created by joining {A} to {B|x<42}, {A|x<42} to {B}, or {A|x<42} to {B|x<42}. Each of those 5 RelOptInfos can have its own cardinality estimate, and it seems pretty straightforward to see how to get both the scan cardinality and the join cardinality correct.

Now, I think this is decidedly non-trivial to implement, and I also hear the voice of Tom Lane saying that's going to be expensive in both time and memory, and he's not wrong. On the other hand, I completely agree with David's comments on the other thread to the effect that holding our breath is not getting us anywhere. People don't keep asking for this feature because it's a stupid thing that nobody really wants, and when Tom alleges that it will rarely pay off, I think he's pretty far off the mark. The only time we need to consider doing any extra work is when we have something like the example discussed here, namely A.x = B.x and A.x < 42.
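To make that scenario concrete, here is a minimal SQL sketch; the table and column names (a, b, x) are hypothetical and only mirror the A/B notation above for illustration:

```sql
-- Hypothetical schema, just to make the scenario concrete.
CREATE TABLE a (x int);
CREATE TABLE b (x int);

-- The user writes the restriction against only one side of the join:
SELECT *
  FROM a JOIN b ON a.x = b.x
 WHERE a.x < 42;

-- Because a.x = b.x puts both columns in one equivalence class, the
-- implied qual b.x < 42 could be enforced at the scan of b instead of,
-- or in addition to, a.x < 42 at the scan of a.
```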
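And to see why neither way of accounting for the derived qual is free, here is a worked example; every number below (row counts, selectivities) is invented purely for illustration:

```sql
-- Invented statistics, purely for illustration:
--   a and b each have 1,000,000 rows
--   x < 42 matches 10% of the rows in each table (selectivity 0.1)
--   x is roughly unique, so the a.x = b.x join selectivity is ~1/1,000,000
--
-- Enforce b.x < 42 too, and keep it in the estimates (status quo):
--   rows(a)          = 1,000,000 * 0.1 = 100,000        -- fine
--   rows(b)          = 1,000,000 * 0.1 = 100,000        -- fine
--   rows(join(a,b))  = 100,000 * 100,000 / 1,000,000 = 10,000
--                      -- too low: x < 42 has been counted twice
--
-- Ignore the derived qual when sizing b (the proposal above):
--   rows(b)          = 1,000,000                         -- too high
--   rows(join(a,b))  = 100,000 * 1,000,000 / 1,000,000 = 100,000
--                      -- join estimate is right, but the cost of scanning
--                      -- and joining b is based on 10x too many rows
```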
If there is a variable that is part of an equivalence class and also is used in a scan qual, what are the chances that the implied inequality is useful? There's no way to estimate that mathematically - it's all about what you think human beings are typically going to do - but I'd say it's probably better than 50%. I know that when I was regularly doing application programming on top of PostgreSQL I was VERY aware of this limitation of the optimizer and habitually thought about which table to write the inequality against. That kept me out of trouble most of the time, but it sure seems like we're punting the optimizer's job to the end user.

And even then, I still sometimes couldn't stay out of trouble, because sometimes I knew that the implied inequality really ought to be enforced against both sides of the join to get a decent plan. In that case, the only way to get the optimizer to do what I wanted was to duplicate the qual. But that runs headlong into the exact problem that we're talking about here: now the join selectivity is going to be messed up, and then some other part of the plan gets messed up. I still remember the frustration associated with that scenario more than 10 years later. You can't even fix it by uglifying your query with a planner hint, because we don't support those either.

Which brings me to another point: it's incoherent to simultaneously argue that we shouldn't have planner hints but rather focus on improving the planner, and at the same time refuse to improve the planner because it would make planning too expensive. I actually think we should do both, because I neither believe that it's impossible to fix this particular problem nor that it is possible to create a planner so good that it always makes the right decisions without any explicit input from a human being. But the only way you can think this problem is unfixable and at the same time think we don't need hints is if you think this problem is fake. It's not.

--
Robert Haas
EDB: http://www.enterprisedb.com