Re: Encouraging multi-table join order - Mailing list pgsql-performance
From | Dan Harris |
---|---|
Subject | Re: Encouraging multi-table join order |
Date | |
Msg-id | 443AEF9B.3000909@drivefaster.net Whole thread Raw |
In response to | Re: Encouraging multi-table join order (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Encouraging multi-table join order
|
List | pgsql-performance |
Tom Lane wrote: > That's very strange --- the estimated cost of the seqscan is high enough > that the planner should have chosen a nestloop with inner indexscan on > the big table. I'm not sure about the join-order point, but the hash > plan for the first join seems wrong in any case. > > Um, you do have an index on eventactivity.incidentid, right? What's the > datatype(s) of the incidentid columns? What happens to the plan if you > turn off enable_hashjoin and enable_mergejoin? > > regards, tom lane > Yes, eventactivity.incidentid is indexed. The datatype is varchar(40). Although, by checking this, I noticed that k_h.incidentid was varchar(100). Perhaps the difference in length between the keys caused the planner to not use the fastest method? I have no defense as to why those aren't the same.. I will make them so and check. Here's the EXPLAIN analyze with enable_hashjoin = off and enable_mergejoin = off : Limit (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.016..74356.521 rows=888 loops=1) -> Unique (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.011..74354.073 rows=888 loops=1) -> Sort (cost=4226535.73..4226537.48 rows=698 width=82) (actual time=74339.003..74344.031 rows=3599 loops=1) Sort Key: eventmain.entrydate, eventmain.incidentid, eventgeo.eventlocation, eventactivity.recordtext -> Nested Loop (cost=0.00..4226502.76 rows=698 width=82) (actual time=921.325..74314.959 rows=3599 loops=1) -> Nested Loop (cost=0.00..4935.61 rows=731 width=72) (actual time=166.354..14638.308 rows=1162 loops=1) -> Nested Loop (cost=0.00..2482.47 rows=741 width=50) (actual time=150.396..7348.013 rows=1162 loops=1) -> Index Scan using k_h_id_idx on k_h (cost=0.00..217.55 rows=741 width=14) (actual time=129.540..1022.243 rows=1162 loops=1) Index Cond: (id = 33396) Filter: ((entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) -> Index Scan using eventgeo_incidentid_idx on eventgeo (cost=0.00..3.04 rows=1 width=36) (actual time=5.260..5.429 rows=1 loops=1162) Index Cond: ((eventgeo.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..3.30 rows=1 width=22) (actual time=5.976..6.259 rows=1 loops=1162) Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventactivity1 on eventactivity (cost=0.00..5774.81 rows=20 width=52) (actual time=29.768..51.334 rows=3 loops=1162) Index Cond: (("outer".incidentid)::text = (eventactivity.incidentid)::text) Filter: ((((' '::text || (recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone))
pgsql-performance by date: