Re: How can I make PosgreSQL use an Index ? - Mailing list pgsql-admin
From | Tom Lane |
---|---|
Subject | Re: How can I make PosgreSQL use an Index ? |
Date | |
Msg-id | 19517.1064934536@sss.pgh.pa.us Whole thread Raw |
In response to | Re: How can I make PosgreSQL use an Index ? (Rhaoni Chiu Pereira <rhaoni@sistemica.info>) |
List | pgsql-admin |
Rhaoni Chiu Pereira <rhaoni@sistemica.info> writes: > [ very large join plan ] Sorry for not responding sooner --- somehow this got overlooked in my inbox. It looks to me like the big problem is that you have all these nested-loop joins: > -> Nested Loop (cost=100025960.94..100027780.65 rows=1 > width=818) (actual time=3476.87..99606.77 rows=19923 loops=1) > Join Filter: (("outer".emp = "inner".emp) AND ("outer".fil > = "inner".fil) AND ("outer".codigo_supervisor = "inner".codigo_supervisor) AND > ("outer".codigo_grupo_rep = "inner".codigo_grupo_rep)) > -> Nested Loop (cost=100025960.94..100027775.22 rows=1 > width=765) (actual time=3476.74..97802.69 rows=19923 loops=1) > Join Filter: (("inner".ano_mes)::text = to_char > ("outer".data_emissao, 'YYYYMM'::text)) > -> Nested Loop (cost=25960.94..27762.92 rows=1 > width=755) (actual time=3475.14..32090.12 rows=19923 loops=1) > Join Filter: (("inner".emp = "outer".emp) AND > ("outer".fil = "inner".fil) AND ("outer".codigo = "inner".cla_categoria) AND > ("outer".codigo = "inner".cla_marca) AND ("outer".codigo = "inner".situacao)) > -> Nested Loop (cost=25960.94..27705.22 > rows=10 width=665) (actual time=3474.12..17734.21 rows=199230 loops=1) > Join Filter: (("outer".emp > = "inner".emp) AND ("inner".fil = "outer".fil)) > -> Nested Loop > (cost=25960.94..27699.30 rows=1 width=638) (actual time=3474.02..6030.09 > rows=19923 loops=1) > Join Filter: (("inner".emp > = "outer".emp) AND ("inner".empfil = "outer".fil)) The reason the planner is choosing nestloop here is that it thinks there are only a few rows involved (notice the estimated row counts are all "1" or "10"). Nestloop is a fine join plan for small numbers of rows, but it pretty well sucks for tens of thousands of rows which is what you've actually got. I am not sure why the row-count estimates are so far off, but it could be because the planner is unaware of cross-column correlations in your data. Are the multiple join conditions actually necessary, or are some of them redundant? A quick and dirty thing you could try to see if the plan can be improved is to set "enable_nestloop" off (do NOT set "enable_seqscan" off). In the long run you want the planner to do better without such a brute-force hack, though. A simple answer is to boost the statistics target on the join columns and re-analyze, but that may not help much if the real issue is cross-column correlations. A more invasive solution is to reconsider your data design. It looks like you have a lot of multi-column join keys --- can you find ways to combine those into single columns? (As an example, I'd never build a table containing separate date and time columns rather than a single timestamp column.) I realize that this'd probably be a pain in the neck, but if you can do it, it would simplify your queries as well as help the planner produce better plans. regards, tom lane
pgsql-admin by date: