Poor performance o - Mailing list pgsql-performance
From | Craig A. James |
---|---|
Subject | Poor performance o |
Date | |
Msg-id | 442080D1.5090803@modgraph-usa.com Whole thread Raw |
In response to | Re: qsort again (was Re: Strange Create Index behaviour) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Poor performance o
|
List | pgsql-performance |
I'm reposting this -- I sent this out a month ago but never got a response, and hope someone can shed some light on this. Thanks, Craig -------------------------- This is a straightforward query that should be fairly quick, but takes about 30 minutes. It's a query across three tables,call them A, B, and C. The tables are joined on indexed columns. Here's a quick summary: Table A -----> Table B -----> Table C A_ID B_ID C_ID A_ID NAME C_ID Tables A and B have 6 million rows each. Table C is small: 67 names, no repeats. All columns involved in the join are indexed. The database has been full-vacuumed and analyzed. Summary: 1. Query B only: 2.7 seconds, 302175 rows returned 2. Join B and C: 4.3 seconds, exact same answer 3. Join A and B: 7.2 minutes, exact same answer 4. Join A, B, C: 32.7 minutes, exact same answer Looking at these: Query #1 is doing the real work: finding the rows of interest. Queries #1 and #2 ought to be virtually identical, since Table C has just one row with C_ID = 9, but the time almost doubles. Query #3 should take a bit longer than Query #1 because it has to join 300K rows, but the indexes should make this take just a few seconds, certainly well under a minute. Query #4 should be identical to Query #3, again because there's only one row in Table C. 32 minutes is pretty horrible for such a straightforward query. It looks to me like the problem is the use of nested loops when a hash join should be used, but I'm no expert at query planning. This is psql 8.0.3. Table definitions are at the end. Hardware is a Dell, 2-CPU Xeon, 4 GB memory, database is on a singleSATA 7200RPM disk. These table and column names are altered to protect the guilty, otherwise these are straight from Postgres. QUERY #1: --------- explain analyze select B.A_ID from B where B.B_ID = 9; Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=0.158..1387.251 rows=302175 loops=1) Index Cond: (B_ID = 9) Total runtime: 2344.053 ms QUERY #2: --------- explain analyze select B.A_ID from B join C on (B.C_ID = C.C_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=0.349..3392.532 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.232..0.336 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_C_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=0.102..1290.002 rows=302175 loops=1) Index Cond: (B.C_ID = "outer".C_ID) Total runtime: 4373.916 ms QUERY #3: --------- explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) where B.B_ID = 9; Nested Loop (cost=0.00..711336.41 rows=131236 width=4) (actual time=37.118..429419.347 rows=302175 loops=1) -> Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=27.344..8858.489 rows=302175loops=1) Index Cond: (B_ID = 9) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=1.372..1.376 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 430467.686 ms QUERY #4: --------- explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) join C on (B.B_ID = C.B_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..1012793.38 rows=177741 width=4) (actual time=70.184..1960112.247 rows=302175 loops=1) -> Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=52.114..17753.638 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.109..0.176 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_B_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=51.985..15566.896 rows=302175loops=1) Index Cond: (B.B_ID = "outer".B_ID) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=6.407..6.412 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 1961200.079 ms TABLE DEFINITIONS: ------------------ xxx => \d a Table "xxx.a" Column | Type | Modifiers ------------------+------------------------+----------- a_id | integer | not null ... more columns Indexes: "pk_a_id" PRIMARY KEY, btree (a_id) ... more indexes on other columns xxx => \d b Table "xxx.b" Column | Type | Modifiers -------------------------+------------------------+----------- b_id | integer | not null a_id | integer | not null c_id | integer | not null ... more columns Indexes: "b_pkey" PRIMARY KEY, btree (b_id) "i_b_a_id" btree (a_id) "i_b_c_id" btree (c_id) xxx=> \d c Table "xxx.c" Column | Type | Modifiers --------------+------------------------+----------- c_id | integer | not null name | character varying(200) | ... more columns Indexes: "c_pkey" PRIMARY KEY, btree (c_id)
pgsql-performance by date: