Re: -HEAD planner issue wrt hash_joins on dbt3 ? - Mailing list pgsql-hackers
From | Stefan Kaltenbrunner |
---|---|
Subject | Re: -HEAD planner issue wrt hash_joins on dbt3 ? |
Date | |
Msg-id | 450D2907.3070307@kaltenbrunner.cc |
In response to | Re: -HEAD planner issue wrt hash_joins on dbt3 ? (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>) |
Responses |
Re: -HEAD planner issue wrt hash_joins on dbt3 ?
|
List | pgsql-hackers |
Stefan Kaltenbrunner wrote:
> [already sent a variant of that yesterday but it doesn't look like it
> made it to the list]
>
> Tom Lane wrote:
>> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>>> Tom Lane wrote:
>>>> Apparently we've made the planner a bit too optimistic about the savings
>>>> that can be expected from repeated indexscans occurring on the inside of
>>>> a join.
>>> effective_cache_size was set to 10GB(my fault for copying over the conf
>>> from a 16GB box) during the run - lowering it just a few megabytes(!) or
>>> to a more realistic 6GB results in the following MUCH better plan:
>>> http://www.kaltenbrunner.cc/files/dbt3_explain_analyze2.txt
>> Interesting. It used to be that effective_cache_size wasn't all that
>> critical... what I think this report is showing is that with the 8.2
>> changes to try to account for caching effects in repeated indexscans,
>> we've turned that into a pretty significant parameter.
>
> took me a while due to hardware issues on my testbox - but there are new
> results(with 6GB for effective_cache_size) up at:
>
> http://www.kaltenbrunner.cc/files/5/
>
> there are still a few issues with the validity of the run like the rf
> tests not actually being done right - but lowering effective_cache_size
> gave a dramatic speedup on Q5,Q7 and Q8.
>
> that is the explain for the 4h+ Q9:
>
> http://www.kaltenbrunner.cc/files/analyze_q9.txt
>
> increasing the statistic_target up to 1000 does not seem to change
> the plan btw.
>
> disabling nested loop leads to the following (4 times faster) plan:
>
> http://www.kaltenbrunner.cc/files/analyze_q9_no_nest.txt
>
> since the hash-joins in there look rather slow (inappropriate hashtable
> set up due to the wrong estimates?) I disabled hash_joins too:
>
> http://www.kaltenbrunner.cc/files/analyze_q9_no_nest_no_hashjoin.txt
>
> and amazingly this plan is by far the fastest one in runtime (15min vs
> 4,5h ...) except that the planner thinks it is 20 times more expensive ...

some additional numbers (first one is with default settings, second is with
enable_nestloop = 'off', third one is with enable_nestloop = 'off' and
enable_hashjoin = 'off'):

http://www.kaltenbrunner.cc/files/analyze_q7.txt

here we have a 3x speedup with disabling nested loops and a 2x speedup
(over the original plan) with nested loops and hashjoins disabled.

http://www.kaltenbrunner.cc/files/analyze_q20.txt

here we have a 180x(!) speedup with both planner options disabled ...

it is worth mentioning that for both queries the estimated costs in
relation to each other look quite reasonable as soon as
enable_nestloop = 'off' (ie 5042928 vs 10715247 with 344sec vs 514sec for Q7
and 101441851 vs 101445468 with 10sec vs 11sec for Q20)

Stefan
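
To make the cost-versus-runtime comparison in the last paragraph concrete, here is a minimal sketch that computes both ratios from the figures quoted in the message. It assumes each pair is listed in the same order (enable_nestloop = 'off' first, then enable_nestloop = 'off' plus enable_hashjoin = 'off') and that the second pair refers to Q20; the names `runs` and `ratios` are made up for illustration.

```python
# Figures copied from the message: (estimated plan cost, runtime in seconds).
# Assumption: first tuple is enable_nestloop = 'off', second tuple is
# enable_nestloop = 'off' plus enable_hashjoin = 'off'; second pair is Q20.
runs = {
    "Q7": ((5042928, 344), (10715247, 514)),
    "Q20": ((101441851, 10), (101445468, 11)),
}


def ratios(first_plan, second_plan):
    """Return (cost ratio, runtime ratio) of the second plan relative to the first."""
    cost_a, secs_a = first_plan
    cost_b, secs_b = second_plan
    return cost_b / cost_a, secs_b / secs_a


for query, (first_plan, second_plan) in runs.items():
    cost_ratio, time_ratio = ratios(first_plan, second_plan)
    print(f"{query}: estimated cost x{cost_ratio:.2f}, runtime x{time_ratio:.2f}")
```

In both cases the plan the planner estimates as cheaper is also the faster one, which is all that "reasonable" claims here; the ratios themselves need not match.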