Re: Major performance problem after upgrade from 8.3 to 8.4 - Mailing list pgsql-performance
From | Gerhard Wiesinger |
---|---|
Subject | Re: Major performance problem after upgrade from 8.3 to 8.4 |
Date | |
Msg-id | alpine.LFD.2.01.1009130838530.12288@bbs.intern Whole thread Raw |
In response to | Re: Major performance problem after upgrade from 8.3 to 8.4 (Gerhard Wiesinger <lists@wiesinger.com>) |
Responses |
Re: Major performance problem after upgrade from 8.3 to 8.4
|
List | pgsql-performance |
Hello, Any news or ideas regarding this issue? Thnx. Ciao, Gerhard -- http://www.wiesinger.com/ On Sat, 4 Sep 2010, Gerhard Wiesinger wrote: > On Fri, 3 Sep 2010, Tom Lane wrote: > >> Gerhard Wiesinger <lists@wiesinger.com> writes: >>> 8.3 query plans: http://www.wiesinger.com/tmp/pg_perf_83_new.txt >>> 8.4 quey plans: http://www.wiesinger.com/tmp/pg_perf_84.txt >> >> Hmm. The 8.3 plan is indeed assuming that the number of rows will stay >> constant as we bubble up through the join levels, but AFAICS this is >> simply wrong: >> >> -> Nested Loop Left Join (cost=0.00..38028.89 rows=67 width=8) >> -> Nested Loop Left Join (cost=0.00..25399.46 rows=67 width=8) >> -> Nested Loop Left Join (cost=0.00..12770.04 rows=67 >> width=8) >> -> Index Scan using i_log_unique on log l >> (cost=0.00..140.61 rows=67 width=8) >> Index Cond: (datetime >= (now() - '00:01:00'::interval)) >> -> Index Scan using unique_key_and_id on log_details d7 >> (cost=0.00..187.39 rows=89 width=8) >> Index Cond: ((l.id = d7.fk_id) AND (d7.fk_keyid = $6)) >> -> Index Scan using unique_key_and_id on log_details d6 >> (cost=0.00..187.39 rows=89 width=8) >> Index Cond: ((l.id = d6.fk_id) AND (d6.fk_keyid = $5)) >> -> Index Scan using unique_key_and_id on log_details d5 >> (cost=0.00..187.39 rows=89 width=8) >> Index Cond: ((l.id = d5.fk_id) AND (d5.fk_keyid = $4)) >> >> If the log_details indexscans are expected to produce 89 rows per >> execution, then surely the join size should go up 89x at each level, >> because the join steps themselves don't eliminate anything. >> >> In 8.4 the arithmetic is at least self-consistent: >> >> -> Nested Loop Left Join (cost=0.00..505256.95 rows=57630 >> width=8) >> -> Nested Loop Left Join (cost=0.00..294671.96 rows=6059 >> width=8) >> -> Nested Loop Left Join (cost=0.00..272532.55 rows=637 >> width=8) >> -> Index Scan using log_pkey on log l >> (cost=0.00..270203.92 rows=67 width=8) >> Filter: (datetime >= (now() - '00:01:00'::interval)) >> -> Index Scan using unique_key_and_id on log_details d7 >> (cost=0.00..34.63 rows=10 width=8) >> Index Cond: ((l.id = d7.fk_id) AND (d7.fk_keyid = $6)) >> -> Index Scan using unique_key_and_id on log_details d6 >> (cost=0.00..34.63 rows=10 width=8) >> Index Cond: ((l.id = d6.fk_id) AND (d6.fk_keyid = $5)) >> -> Index Scan using unique_key_and_id on log_details d5 >> (cost=0.00..34.63 rows=10 width=8) >> Index Cond: ((l.id = d5.fk_id) AND (d5.fk_keyid = $4)) >> >> The rowcount estimates are apparently a shade less than 10, but they get >> rounded off in the display. >> >> I believe the reason for this change is that 8.4's join estimation code >> was rewritten so that it wasn't completely bogus for outer joins. 8.3 >> might have been getting the right answer, but it was for the wrong >> reasons. >> >> So the real question to be answered here is why doesn't it think that >> each of the unique_key_and_id indexscans produce just a single row, as >> you indicated was the case. The 8.4 estimate is already a factor of >> almost 10 closer to reality than 8.3's, but you need another factor of >> 10. You might find that increasing the statistics target for the >> log_details table helps. > > > Ok, Tom, tried different things (more details are below): > 1.) Setting statistic target to 1000 and 10000 (without success), still merge > join > 2.) Tried to added a Index on description to help the planner for uniqueness > (without success) > 3.) Forced the planner to use nested loop joins (SUCCESS): > SET enable_hashjoin=false;SET enable_mergejoin=false; > (BTW: How do use such settings in Java and PHP and Perl, is there a command > available?) > > Open questions: > Why does the planner not choose nested loop joins, that should be the optimal > one for that situation? > Does the planner value: a.) UNIQUENESS b.) UNIQUENESS and NOT NULLs? > Any ideas for improvement of the planner? > > Details: > -- CREATE UNIQUE INDEX unique_key_and_id ON log_details USING btree (fk_id, > fk_keyid) > -- 1000 and 10000 didn't help for better query plan for Nested Loop Left > Join, still Merge Left Join > -- Sample with: > -- ALTER TABLE log_details ALTER COLUMN fk_id SET STATISTICS 10000; > -- ALTER TABLE log_details ALTER COLUMN fk_keyid SET STATISTICS 10000; > -- ANALYZE VERBOSE log_details; > -- Still Merge Join: > -- -> Merge Left Join (cost=9102353.88..83786934.25 rows=2726186787 > width=16) > -- Merge Cond: (l.id = d2000902.fk_id) > -- -> Merge Left Join (cost=8926835.18..40288402.09 rows=972687282 > width=24) > -- Merge Cond: (l.id = d2000904.fk_id) > -- Default values again > ALTER TABLE log_details ALTER COLUMN fk_id SET STATISTICS 100; > ALTER TABLE log_details ALTER COLUMN fk_keyid SET STATISTICS 100; > ANALYZE VERBOSE log_details; > > -- Tried to add WITHOUT SUCCESS (that planner could know that description is > NOT NULL and UNIQE) > DROP INDEX IF EXISTS i_key_description_desc; > CREATE UNIQUE INDEX i_key_description_desc ON key_description (description); > -- Therefore planner should know: keyid is NOT NULL and UNIQUE and only one > result: (SELECT keyid FROM key_description WHERE description = > 'Raumsolltemperatur') > -- Therefore from constraint planner should know that fk_id is NOT NULL and > UNIQUE: CONSTRAINT unique_key_and_id UNIQUE(fk_id, fk_keyid): > -- LEFT JOIN log_details d1 ON l.id = d1.fk_id AND > -- d1.fk_keyid = (SELECT keyid FROM key_description WHERE description = > 'Raumsolltemperatur') > -- Does the planner value alls those UNIQUEnesses and NOT NULLs? > > -- Again back to 8.3 query plan which is fast (319ms): > SET enable_hashjoin=false; > SET enable_mergejoin=false; > -- -> Nested Loop Left Join (cost=0.00..22820970510.45 rows=2727492136 > width=16) > -- -> Nested Loop Left Join (cost=0.00..12810087616.29 > rows=973121653 width=24) > -- -> Nested Loop Left Join (cost=0.00..9238379092.22 > rows=347192844 width=24) > > Thnx. > > Ciao, > Gerhard > > -- > http://www.wiesinger.com/ > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance >
pgsql-performance by date: