Re: [HACKERS] [PATCH] Incremental sort - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [HACKERS] [PATCH] Incremental sort |
Date | |
Msg-id | 861f7398-59fd-2aeb-964d-38b150687f25@2ndquadrant.com Whole thread Raw |
In response to | Re: [HACKERS] [PATCH] Incremental sort (Alexander Korotkov <a.korotkov@postgrespro.ru>) |
Responses |
Re: [HACKERS] [PATCH] Incremental sort
|
List | pgsql-hackers |
On 04/06/2018 01:43 AM, Alexander Korotkov wrote: > Hi! > > On Tue, Apr 3, 2018 at 2:10 PM, Tomas Vondra > <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote: > > I think solving this may be fairly straight-forward. Essentially, until > now we only had one way to do the sort, so it was OK to make the sort > implicit by checking if the path is sorted > > if (input not sorted) > { > ... add a Sort node ... > } > > But now we have multiple possible ways to do the sort, with different > startup/total costs. So the places that create the sorts need to > actually generate the Sort paths for each sort alternative, and store > the information in the Sort node (instead of relying on pathkeys). > > Ultimately, this should simplify the createplan.c places making all the > make_sort calls unnecessary (i.e. the input should be already sorted > when needed). Otherwise it'd mean the decision needs to be done locally, > but I don't think that should be needed. > > But it's surely a fairly invasive change to the patch ... > > > Right, there are situation when incremental sort has lower startup cost, > but higher total cost. In order to find lower cost, we ideally should > generate > paths for both full sort and incremental sort. However, that would increase > number of total pathes, and could slowdown planning time. Another issue > that we don't always generate pathes for sort. And yes, it would be rather > invasive. So, that doesn't look feasible to get into 11. > I agree that's probably not feasible for PG11, considering the freeze is about 48h from now. Not necessarily because of amount of code needed to do that (it might be fairly small, actually) but because of the risk of regressions in other types of plans and lack of time for review/testing. I do not think this would cause a significant increase in path number. We already do have the (partially sorted) paths in pathlist, otherwise v21 wouldn't be able to build the incremental sort path anyway. And the plans that did the decision in createplan.c could fairly easily do the decision when constructing the path, I believe. Looking v21, this affects three different places: 1) create_merge_append_plan For merge_append the issue is that generate_mergeappend_paths() calls get_cheapest_path_for_pathkeys(), which however only looks at cheapest startup/total path, and then simply falls-back to cheapest_total_path if there are no suitably sorted paths. IMHO if you modify this to also consider partially-sorted paths, it should work. You'll have to account for the extra cost of incremental cost, and it needs to be fairly cheap (perhaps even by first quickly computing some initial cost estimate - see e.g. initial_cost_nestloop/final_cost_nestloop). 2) create_mergejoin_plan For mergejoin, the issue is that sort_inner_and_outer() only looks at cheapest_total_path for both sides, even before computing the merge pathkeys. So some of the code would have to move into the foreach loop, and the paths would be picked by get_cheapest_path_for_pathkeys(). This time only using total cost, per the comment in sort_inner_and_outer(). 3) create_gather_merge_plan This seems fairly simple - the issue is that gather_grouping_paths() only looks at cheapest_partial_path. Should be a matter of simply calling the improved get_cheapest_path_for_pathkeys(). Of course, this is all fairly hand-wavy and it may turn out to be fairly expensive in some cases. But we can use another trick - we don't need to search through all partially sorted paths, because for each pathkey prefix there can be just one "best" path for startup cost and one for total cost. So we could maintain a much shorter list of partially sorted paths, I think. > > Intead, I decided to cut usage of incremental sort. Now, incremental sort > is generated only in create_sort_path(). Cheaper path selected between > incremental sort and full sort with taking limit_tuples into account. > That limits usage of incremental sort, however risk of regression by this > patch is also minimal. In fact, incremental sort will be used only when > sort is explicitly specified and simultaneously LIMIT is specified or > dataset to be sorted is large and incremental sort saves disk IO. > I personally am OK with reducing the scope of the patch like this. It's still beneficial for the common ORDER BY + LIMIT case, which is good. I don't think it may negatively affect other cases (at least I can't think of any). It's pretty obvious it may be extremely useful for the other cases too (particularly for mergejoin on large tables, where it can allow in-memory sort with quick startup). But even if you managed to make the necessary code changes, it's unlikely any experienced committer will look into such significant change this close to the cutoff. Either they are going to be away for a weekend, or they are already looking at other patches :-( > Attached patch also incorporates following commits made by Alexander > Kuzmenkov: > * Rename fields of IncrementalSortState to snake_case for the sake of > consistency. > * Rename group test function to isCurrentGroup. > * Comments from Tomas Vondra about nodeIncrementalSort.c > * Add a test for incremental sort. > * Add a separate function to calculate costs of incremental sort. > Those changes seem fine, but are still a couple of issues remaining: 1) pathkeys_useful_for_ordering() still uses enable_incrementalsort, which I think is a bad idea. I've complained about it in my review on 31/3, and I don't see any explanation why this is a good idea. 2) Likewise, I've suggested that the claim about abbreviated keys in nodeIncrementalsort.c is dubious. No response, and the XXX comment was instead merged into the patch: * XXX The claim about abbreviated keys seems rather dubious, IMHO. 3) There is a comment at cost_merge_append, despite there being no relevant changes in that function. Misplaced comment? 4) It's not clear to me why INITIAL_MEMTUPSIZE is defined the way it is. There needs to be a comment - the intent seems to be making it large enough to exceed ALLOCSET_SEPARATE_THRESHOLD, but it's not quite clear why that's a good idea. 5) I do get this warning when building the code: costsize.c: In function ‘cost_incremental_sort’: costsize.c:1812:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] List *presortedExprs = NIL; ^~~~ 6) The comment at cost_incremental_sort talks about cost_sort_internal, but it's cost_sort_tuplesort I guess. 7) The new code in create_sort_path is somewhat ugly, I guess. It's correct, but it really needs to be made easier to comprehend. I might have time to look into that tomorrow, but I can't promise that. Attached is a diff highlighting some of those places, and couple of minor code formatting fixes. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: