Re: add_path optimization - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: add_path optimization |
Date | |
Msg-id | 603c8f070902011335xa94a333iaff960d000af9f7f@mail.gmail.com Whole thread Raw |
In response to | Re: add_path optimization (Grzegorz Jaskiewicz <gj@pointblue.com.pl>) |
Responses |
Re: add_path optimization
|
List | pgsql-hackers |
On Sun, Feb 1, 2009 at 3:25 PM, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote: > I don't like the fact that you hardcoded that here. I know that you are > trying to pass on few calls in one go here, but still... ugly. Well, I think you'll find that using a dynamically sized data structure destroys the possibility of squeezing any additional performance out of this part of the code. The nice thing about fixed-size data structures is that they cost essentially nothing to stack-allocate; you just move the stack pointer and away you go. We should in fact be looking for MORE places where we can avoid the use of constructs like List, since the second-highest CPU hog in my tests was AllocSetAlloc(), beaten out only by add_path(). With this patch applied, AllocSetAlloc() moves up to first. (It would really be rather better to include all the paths generated in each pass of the loop in the call to add_similar_path(), but that looked rather more complicated because we can't predict how many of them there will be, and so adding a fixed-size data structure is not so easy. Plus, the code would all have to be rewritten not to assume that "continue" was the right way to move on to the next iteration of the loop. What would potentially be better still is to try to figure out which nested loop will be the winner without allocating all of the NestPath nodes in the first place, but that didn't seem possible without much more invasive changes, and it's not clear that you would actually still have a winner by the time you got done beating on it.) I am also somewhat mystified as to why using an array of size 5 to hold up to 5 data structure allocated in nearly-consecutive lines of C code would qualify as ugly (completely apart from the fact that it's a clear performance win). > static int > compare_fuzzy_path_costs(Path *path1, Path *path2, int *startup_cost) > { > .... > *startup_cost = (s == 0) ? t : s; > > Why not *startup_cost = s, and let the caller decide which value it wants to > use ? > or just return both, from single call (which would ? > ... > > return t; > } You're essentially suggesting removing logic from compare_fuzzy_path_costs() and duplicating it at the two call sites of that function. If there's a point to that, I don't see it. You might also take a look at the widely used function compare_path_costs(). > To be fair, I don't see compare_fuzzy_path_costs change to save too much of > time in planner. Hmm, well I didn't either, but there's this handy tool called gprof that you might want to try out. I wouldn't be wasting my time patching this part of the code if it didn't make a difference, and in fact if you do 10% of the amount of benchmarking that I did in the process of creating this patch, you will find that it in fact does make a difference. > I would myself probably convert that function into two defines/inline funcs, > but that's just me. It's already static to that .c file, so the compiler likely will inline it. In fact, I suspect you will find that removing the static keyword from the implementation of that function in CVS HEAD is itself sufficient to produce a small but measurable slowdown in planning of large join trees, exactly because it will defeat inlining. ...Robert
pgsql-hackers by date: