Re: Window Function "Run Conditions" - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: Window Function "Run Conditions" |
Date | |
Msg-id | CAApHDvqtwE4bZxAdk+LQgVkKM+Em-sHybON_OX4YbFs+sQUKWg@mail.gmail.com Whole thread Raw |
In response to | Re: Window Function "Run Conditions" (Andy Fan <zhihui.fan1213@gmail.com>) |
Responses |
Re: Window Function "Run Conditions"
Re: Window Function "Run Conditions" |
List | pgsql-hackers |
On Thu, 26 Aug 2021 at 14:54, Andy Fan <zhihui.fan1213@gmail.com> wrote: > > On Thu, Aug 19, 2021 at 2:35 PM David Rowley <dgrowleyml@gmail.com> wrote: > > > > On Thu, 19 Aug 2021 at 00:20, Andy Fan <zhihui.fan1213@gmail.com> wrote: > > > In the current master, the result is: > > > > > > empno | salary | c | dr > > > -------+--------+---+---- > > > 8 | 6000 | 4 | 1 > > > > > In the patched version, the result is: > > > > > > empno | salary | c | dr > > > -------+--------+---+---- > > > 8 | 6000 | 1 | 1 > > > > Thanks for taking it for a spin. > > > > That's a bit unfortunate. I don't immediately see how to fix it other > > than to restrict the optimisation to only apply when there's a single > > WindowClause. It might be possible to relax it further and only apply > > if it's the final window clause to be evaluated, but in those cases, > > the savings are likely to be much less anyway as some previous > > WindowAgg will have exhausted all rows from its subplan. > > I am trying to hack the select_active_windows function to make > sure the WindowClause with Run Condition clause to be the last one > to evaluate (we also need to consider more than 1 window func has > run condition), at that time, the run condition clause is ready already. > > However there are two troubles in this direction: a). This may conflict > with "the windows that need the same sorting are adjacent in the list." > b). "when two or more windows are order-equivalent then all peer rows > must be presented in the same order in all of them. .. (See General Rule 4 of > <window clause> in SQL2008 - SQL2016.)" > > In summary, I am not sure if it is correct to change the execution Order > of WindowAgg freely. Thanks for looking at that. My current thoughts are that it just feels a little too risky to adjust the comparison function that sorts the window clauses to pay attention to the run-condition. We would need to ensure that there's just a single window function with a run condition as it wouldn't be valid for there to be multiple. It would be easy enough to ensure we only push quals into just 1 window clause, but that and meddling with the evaluation order has trade-offs. To do that properly, we'd likely want to consider the costs when deciding which window clause would benefit from having quals pushed the most. Plus, if we start messing with the evaluation order then we'd likely really want some sort of costing to check if pushing a qual and adjusting the evaluation order is actually cheaper than not pushing the qual and keeping the clauses in the order that requires the minimum number of sorts. The planner is not really geared up for costing things like that properly, that's why we just assume the order with the least sorts is best. In reality that's often not going to be true as an index may exist and we might want to evaluate a clause first if we could get rid of a sort and index scan instead. Sorting the window clauses based on their SortGroupClause is just the best we can do for now at that stage in planning. I think it's safer to just disable the optimisation when there are multiple window clauses. Multiple matching clauses are merged already, so it's perfectly valid to have multiple window functions, it's just they must share the same window clause. I don't think that's terrible as with the major use case that I have in mind for this, the window function is only added to limit the number of rows. In most cases I can imagine, there'd be no reason to have an additional window function with different frame options. I've attached an updated patch.
Attachment
pgsql-hackers by date: