Re: review: More frame options in window functions - Mailing list pgsql-hackers
From | Hitoshi Harada |
---|---|
Subject | Re: review: More frame options in window functions |
Date | |
Msg-id | e08cc0401001161326x14bdf398l6fa16186fb298e2a@mail.gmail.com Whole thread Raw |
In response to | Re: review: More frame options in window functions ("Erik Rijkers" <er@xs4all.nl>) |
Responses |
Re: review: More frame options in window functions
|
List | pgsql-hackers |
2010/1/17 Erik Rijkers <er@xs4all.nl>: > On Sat, January 16, 2010 09:29, Hitoshi Harada wrote: >> 2010/1/16 Erik Rijkers <er@xs4all.nl>: >>> >>>> Thanks for the review. I've found another crash today and attached is >>>> fixed version. The case is: >>>> >>>> SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1 >>>> PRECEDING) FROM tenk1 WHERE unique1 < 10; >>>> >>> >>> Hi, >>> >>> The patch (more_frame_options.20100115.patch.gz) applies cleanly, but the regression test gives: >> >> It doesn't happen here. Could you send more information like configure >> options and EXPLAIN output of that query? >> > > Sorry, I should have done that the first time. Here is more info: > > Linux 2.6.18-164.el5 x86_64 / CentOS release 5.4 (Final) Thanks, I confirmed it on my 64bit environment. My approach to solve this was completely wrong. The problem here is that when PARTITION BY clause equals to ORDER BY clause window_pathkeys is canonicalized in make_pathkeys_for_window() and so get_column_info_for_window() recognizes number of ORDER BY columns as 0, in other word, window->ordNumCols = 0 && window->ordColIdx[0] is invalid. This behavior is ok in 8.4 because in that case all rows are peer to each other and the partition = the frame in RANGE mode. This assumption is now broken since FRAMEOPTION_START_OFFSET and FRAMEOPTION_END_OFFSET are introduced, which means that the current row can be out of the frame. So with these options we must always evaluate if the current row is inside the frame or not by *sort key column*. However, we don't have it in the executor as the planner has removed it during canonicalization of pathkeys. So I previously added such code: *** 2819,2825 **** get_column_info_for_window(PlannerInfo *root, WindowClause *wc, List *tlist, int numPart = list_length(wc->partitionClause); int numOrder= list_length(wc->orderClause); ! if (numSortCols == numPart + numOrder) { /* easy case */ *partNumCols = numPart; --- 2836,2847 ---- int numPart = list_length(wc->partitionClause); int numOrder = list_length(wc->orderClause); ! /* ! * in RANGE offset mode, numOrder should be represented as-is. ! */ ! if (numSortCols == numPart + numOrder || ! (wc->frameOptions & FRAMEOPTION_RANGE && ! wc->frameOptions & (FRAMEOPTION_START_VALUE | FRAMEOPTION_END_VALUE))) { /* easy case */ *partNumCols= numPart; but it failed now, since the "sortColIdx" passed in has been canonicalized already. And I tried to change not to canonicalize the pathkeys in make_pathkeys_window() in such cases and succeeded then passed all regression tests. diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 6ba051d..fee1302 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -1417,11 +1417,17 @@ grouping_planner(PlannerInfo *root, double tuple_fraction) int ordNumCols; AttrNumber *ordColIdx; Oid *ordOperators; + bool canonicalize; + + canonicalize = !(wc->frameOptions & FRAMEOPTION_RANGE && + wc->frameOptions & + (FRAMEOPTION_START_VALUE | + FRAMEOPTION_END_VALUE)); window_pathkeys = make_pathkeys_for_window(root, wc, tlist, - true); + canonicalize); /* * This is a bit tricky: we build a sort node even if we don't I am not very sure if this is the correct answer. Thoughts? Regards, -- Hitoshi Harada
pgsql-hackers by date: