Home > mailing lists

Re: Encouraging multi-table join order - Mailing list pgsql-performance

From	Dan Harris
Subject	Re: Encouraging multi-table join order
Date	April 10, 2006 21:49:16
Msg-id	443AEF9B.3000909@drivefaster.net Whole thread Raw
In response to	Re: Encouraging multi-table join order (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Encouraging multi-table join order
List	pgsql-performance

Tree view

Tom Lane wrote:
> That's very strange --- the estimated cost of the seqscan is high enough
> that the planner should have chosen a nestloop with inner indexscan on
> the big table.  I'm not sure about the join-order point, but the hash
> plan for the first join seems wrong in any case.
>
> Um, you do have an index on eventactivity.incidentid, right?  What's the
> datatype(s) of the incidentid columns?  What happens to the plan if you
> turn off enable_hashjoin and enable_mergejoin?
>
>             regards, tom lane
>
Yes, eventactivity.incidentid is indexed.  The datatype is varchar(40).
Although, by checking this, I noticed that k_h.incidentid was
varchar(100).  Perhaps the difference in length between the keys caused
the planner to not use the fastest method?  I have no defense as to why
those aren't the same.. I will make them so and check.

Here's the EXPLAIN analyze with enable_hashjoin = off and
enable_mergejoin = off :

Limit  (cost=4226535.73..4226544.46 rows=698 width=82) (actual
time=74339.016..74356.521 rows=888 loops=1)
   ->  Unique  (cost=4226535.73..4226544.46 rows=698 width=82) (actual
time=74339.011..74354.073 rows=888 loops=1)
         ->  Sort  (cost=4226535.73..4226537.48 rows=698 width=82)
(actual time=74339.003..74344.031 rows=3599 loops=1)
               Sort Key: eventmain.entrydate, eventmain.incidentid,
eventgeo.eventlocation, eventactivity.recordtext
               ->  Nested Loop  (cost=0.00..4226502.76 rows=698
width=82) (actual time=921.325..74314.959 rows=3599 loops=1)
                     ->  Nested Loop  (cost=0.00..4935.61 rows=731
width=72) (actual time=166.354..14638.308 rows=1162 loops=1)
                           ->  Nested Loop  (cost=0.00..2482.47 rows=741
width=50) (actual time=150.396..7348.013 rows=1162 loops=1)
                                 ->  Index Scan using k_h_id_idx on k_h
(cost=0.00..217.55 rows=741 width=14) (actual time=129.540..1022.243
rows=1162 loops=1)
                                       Index Cond: (id = 33396)
                                       Filter: ((entrydate >=
'2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate <
'2006-04-08 00:00:00'::timestamp without time zone))
                                 ->  Index Scan using
eventgeo_incidentid_idx on eventgeo  (cost=0.00..3.04 rows=1 width=36)
(actual time=5.260..5.429 rows=1 loops=1162)
                                       Index Cond:
((eventgeo.incidentid)::text = ("outer".incidentid)::text)
                           ->  Index Scan using eventmain_incidentid_idx
on eventmain  (cost=0.00..3.30 rows=1 width=22) (actual
time=5.976..6.259 rows=1 loops=1162)
                                 Index Cond:
((eventmain.incidentid)::text = ("outer".incidentid)::text)
                     ->  Index Scan using eventactivity1 on
eventactivity  (cost=0.00..5774.81 rows=20 width=52) (actual
time=29.768..51.334 rows=3 loops=1162)
                           Index Cond: (("outer".incidentid)::text =
(eventactivity.incidentid)::text)
                           Filter: ((((' '::text || (recordtext)::text)
|| ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01
00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08
00:00:00'::timestamp without time zone))

pgsql-performance by date:

From: Tom Lane
Date: 10 April 2006, 20:59:56
Subject: Re: Encouraging multi-table join order

From: Tom Lane
Date: 10 April 2006, 23:05:15
Subject: Re: Encouraging multi-table join order

Re: Encouraging multi-table join order - Mailing list pgsql-performance

Previous

Next