Hash or merge join instead of inner loop - Mailing list pgsql-performance
From | Jim C. Nasby |
---|---|
Subject | Hash or merge join instead of inner loop |
Date | |
Msg-id | 20030609204009.GJ40542@flake.decibel.org Whole thread Raw |
Responses |
Re: Hash or merge join instead of inner loop
|
List | pgsql-performance |
I have a query that's cauing pgsql choose either a hash or merge join depending on how I mess with the stats variables, but it won't choose an nested loop, even though it's the fastest. The estimate for the nested loop index scans always seems to be way high on the high end. Note that it's 0-3 in one case and 0-2 in the other, but the actual time is very low in both cases. Why is this? I haven't been able to make much of a difference by changing the optimizer variables. This is on a solaris machine, if that matters. Tinput_data, locality, and postal code have 1300, 28000 and 43000 rows, respectively, and locality and postal code are very narrow tables (full definition below). usps=# explain analyze SELECT key, pc.locality_id, l.state_code::varchar FROM Tinput_data i, postal_code pc, locality lWHERE i.zip = pc.postal_code AND l.locality_id = pc.locality_id; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=940.20..1417.94 rows=1380 width=36) (actual time=1727.30..2363.91 rows=1380 loops=1) Merge Cond: ("outer".locality_id = "inner".locality_id) -> Index Scan using locality_pkey on locality l (cost=0.00..455.99 rows=27789 width=10) (actual time=0.62..495.39 rows=27632loops=1) -> Sort (cost=940.20..940.55 rows=1380 width=26) (actual time=1725.53..1726.71 rows=1380 loops=1) Sort Key: pc.locality_id -> Merge Join (cost=42.00..933.00 rows=1380 width=26) (actual time=56.27..1684.67 rows=1380 loops=1) Merge Cond: ("outer".postal_code = "inner".zip) -> Index Scan using postal_code_postal_code_key on postal_code pc (cost=0.00..869.31 rows=42704 width=13)(actual time=10.05..1396.11 rows=42418 loops=1) -> Sort (cost=42.00..42.34 rows=1380 width=13) (actual time=39.63..40.97 rows=1380 loops=1) Sort Key: i.zip -> Seq Scan on tinput_data i (cost=0.00..34.80 rows=1380 width=13) (actual time=0.02..12.13 rows=1380loops=1) Total runtime: 2367.50 msec (12 rows) usps=# set enable_mergejoin=0; SET usps=# set enable_hashjoin=0; SET usps=# explain analyze SELECT key, pc.locality_id, l.state_code::varchar FROM Tinput_data i, postal_code pc, locality lWHERE i.zip = pc.postal_code AND l.locality_id = pc.locality_id; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..6991.66 rows=1380 width=36) (actual time=0.22..231.00 rows=1380 loops=1) -> Nested Loop (cost=0.00..4203.23 rows=1380 width=26) (actual time=0.14..132.70 rows=1380 loops=1) -> Seq Scan on tinput_data i (cost=0.00..34.80 rows=1380 width=13) (actual time=0.02..17.41 rows=1380 loops=1) -> Index Scan using postal_code_postal_code_key on postal_code pc (cost=0.00..3.01 rows=1 width=13) (actual time=0.06..0.06rows=1 loops=1380) Index Cond: ("outer".zip = pc.postal_code) -> Index Scan using locality_pkey on locality l (cost=0.00..2.01 rows=1 width=10) (actual time=0.05..0.05 rows=1 loops=1380) Index Cond: (l.locality_id = "outer".locality_id) Total runtime: 233.60 msec (8 rows) Table "pg_temp_1.tinput_data" Column | Type | Modifiers ------------------+-----------------------+----------- key | integer | not null firm | character varying(40) | address | integer | address_v | character varying(10) | odd_even | character(1) | street_name | character varying(40) | street_metaphone | character varying(4) | apartment | integer | apartment_v | character varying(10) | apartment_label | character varying(5) | city | character varying(40) | city_metaphone | character varying(4) | state | character varying(40) | zip | character varying(5) | Indexes: tinput_data_pkey primary key btree ("key") usps=# \d postal_code Table "public.postal_code" Column | Type | Modifiers ----------------+-----------------------+------------------------------------------------------------------------- postal_code_id | integer | not null default nextval('public.postal_code_postal_code_id_seq'::text) postal_code | character varying(10) | not null locality_id | integer | not null Indexes: postal_code_pkey primary key btree (postal_code_id), postal_code_postal_code_key unique btree (postal_code) usps=# \d locality Table "public.locality" Column | Type | Modifiers -------------+-----------------------+------------------------------------------------------------------- locality_id | integer | not null default nextval('public.locality_locality_id_seq'::text) locality | character varying(10) | not null state_code | character(2) | not null Indexes: locality_pkey primary key btree (locality_id) Foreign Key constraints: $1 FOREIGN KEY (state_code) REFERENCES state(state_code) ON UPDATE NO ACTION ON DELETE NO ACTION -- Jim C. Nasby (aka Decibel!) jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
pgsql-performance by date: