Re: Very slow queries - please help - Mailing list pgsql-performance
From | Bealach-na Bo |
---|---|
Subject | Re: Very slow queries - please help |
Date | |
Msg-id | BAY101-F35C5475BF7CCEE56EAB208AD540@phx.gbl Whole thread Raw |
Responses |
Re: Very slow queries - please help
|
List | pgsql-performance |
OK. The consensus seems to be that I need more indexes and I also need to look into the NOT IN statement as a possible bottleneck. I've introduced the indexes which has led to a DRAMATIC change in response time. Now I have to experiment with INNER JOIN -> OUTER JOIN variations, SET ENABLE_SEQSCAN=OFF. Forgive me for not mentioning each person individually and by name. You have all contributed to confirming what I had suspected (and hoped): that *I* have a lot to learn! I'm attaching table descriptions, the first few lines of top output while the queries were running, index lists, sample queries and EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the indexes. As I said, DRAMATIC :) I notice that the CPU usage does not vary very much, it's nearly 100% anyway, but the memory usage drops markedly, which is another very nice result of the index introduction. Any more comments and tips would be very welcome. Thank you all for your input. Bealach. blouser@10.0.0.2.dbdev=> \d job_log Table "blouser.job_log" Column | Type | Modifiers ----------------+-----------------------------+-------------------------------------------------- job_log_id | integer | not null default nextval('job_log_id_seq'::text) first_registry | timestamp without time zone | blogger_name | character(50) | node_id | integer | job_type | character(50) | job_name | character(256) | job_start | timestamp without time zone | job_timeout | interval | job_stop | timestamp without time zone | nfiles_in_job | integer | status | integer | error_code | smallint | Indexes: "job_log_id_pkey" PRIMARY KEY, btree (job_log_id) Check constraints: "job_log_status_check" CHECK (status = 0 OR status = 1 OR status = 8 OR status = 9) Foreign-key constraints: "legal_node" FOREIGN KEY (node_id) REFERENCES node(node_id) blouser@10.0.0.2.dbdev=> \d node Table "blouser.node" Column | Type | Modifiers ---------+---------------+----------------------------------------------- node_id | integer | not null default nextval('node_id_seq'::text) name | character(50) | type | character(1) | usage | character(4) | Indexes: "node_id_pkey" PRIMARY KEY, btree (node_id) "node_name_key" UNIQUE, btree (name) Check constraints: "node_type_check" CHECK ("type" = 'B'::bpchar OR "type" = 'K'::bpchar OR "type" = 'C'::bpchar OR "type" = 'T'::bpchar OR "type" = 'R'::bpchar) "node_usage_check" CHECK ("usage" = 'TEST'::bpchar OR "usage" = 'LIVE'::bpchar) #========================before new indexes were created Tasks: 114 total, 2 running, 112 sleeping, 0 stopped, 0 zombie Cpu(s): 25.7% us, 24.5% sy, 0.0% ni, 49.4% id, 0.3% wa, 0.0% hi, 0.0% si Mem: 1554788k total, 1513576k used, 41212k free, 31968k buffers Swap: 1020024k total, 27916k used, 992108k free, 708728k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 25883 postgres 25 0 20528 12m 11m R 99.7 0.8 4:54.91 postmaster blouser@10.0.0.2.dbdev=> \di List of relations Schema | Name | Type | Owner | Table ---------+-----------------+-------+---------+--------- blouser | job_log_id_pkey | index | blouser | job_log blouser | node_id_pkey | index | blouser | node blouser | node_name_key | index | blouser | node (3 rows) EXPLAIN ANALYSE SELECT n.name,n.type, n.usage, j.status, j.job_start,j.job_stop, j.nfiles_in_job,j.job_name FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '711%' AND n.type = 'K' AND n.usage = 'LIVE' AND j.job_name = 'COPY FILES' AND j.job_start >= '2005-11-14 00:00:00' AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL) ORDER BY n.name; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..75753.31 rows=1 width=461) (actual time=270486.692..291662.350 rows=3 loops=1) Join Filter: ("inner".node_id = "outer".node_id) -> Index Scan using node_name_key on node n (cost=0.00..307.75 rows=1 width=181) (actual time=0.135..11.034 rows=208 loops=1) Filter: ((name ~~ '711%'::text) AND ("type" = 'K'::bpchar) AND ("usage" = 'LIVE'::bpchar)) -> Seq Scan on job_log j (cost=0.00..75445.54 rows=1 width=288) (actual time=273.374..1402.089 rows=22 loops=208) Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start >= '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop <= '2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL))) Total runtime: 291662.482 ms (7 rows) EXPLAIN ANALYSE SELECT n.name, n.type, n.usage FROM node n WHERE n.name LIKE '56x%' AND n.type = 'K' AND n.usage = 'LIVE' AND n.node_id NOT IN (SELECT n.node_id FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '711%' AND n.type = 'K' AND n.usage = 'LIVE' AND j.job_name = 'COPY FILES' AND j.job_start >= '2005-11-14 00:00:00' AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL)) ORDER BY n.name; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Index Scan using node_name_key on node n (cost=75451.55..75764.94 rows=1 width=177) (actual time=1394.617..1398.609 rows=205 loops=1) Filter: ((name ~~ '56x%'::text) AND ("type" = 'K'::bpchar) AND ("usage" = 'LIVE'::bpchar) AND (NOT (hashed subplan))) SubPlan -> Nested Loop (cost=0.00..75451.54 rows=1 width=4) (actual time=1206.622..1394.462 rows=3 loops=1) -> Seq Scan on job_log j (cost=0.00..75445.54 rows=1 width=4) (actual time=271.361..1393.363 rows=22 loops=1) Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start >= '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop <= '2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL))) -> Index Scan using node_id_pkey on node n (cost=0.00..5.99 rows=1 width=4) (actual time=0.042..0.042 rows=0 loops=22) Index Cond: ("outer".node_id = n.node_id) Filter: ((name ~~ '711%'::text) AND ("type" = 'K'::bpchar) AND ("usage" = 'LIVE'::bpchar)) Total runtime: 1398.868 ms (10 rows) #===================================after the new indexes were created Tasks: 114 total, 2 running, 112 sleeping, 0 stopped, 0 zombie Cpu(s): 22.9% us, 27.2% sy, 0.0% ni, 49.7% id, 0.0% wa, 0.2% hi, 0.0% si Mem: 1554788k total, 1414632k used, 140156k free, 14784k buffers Swap: 1020024k total, 28008k used, 992016k free, 623652k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 26409 postgres 25 0 21580 8684 7116 R 99.9 0.6 0:25.38 postmaster Schema | Name | Type | Owner | Table ---------+--------------------+-------+---------+--------- blouser | idx_job_log_filter | index | blouser | job_log blouser | idx_node_filter | index | blouser | node blouser | job_log_id_pkey | index | blouser | job_log blouser | node_id_pkey | index | blouser | node blouser | node_name_key | index | blouser | node (5 rows) EXPLAIN ANALYSE SELECT n.name,n.type, n.usage, j.status, j.job_start,j.job_stop, j.nfiles_in_job,j.job_name FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '711%' AND n.type = 'K' AND n.usage = 'LIVE' AND j.job_name = 'COPY FILES' AND j.job_start >= '2005-11-14 00:00:00' AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL) ORDER BY n.name; ---------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=18049.23..18049.23 rows=1 width=461) (actual time=223.540..223.543 rows=3 loops=1) Sort Key: n.name -> Nested Loop (cost=0.00..18049.22 rows=1 width=461) (actual time=201.575..223.470 rows=3 loops=1) -> Index Scan using idx_job_log_filter on job_log j (cost=0.00..18043.21 rows=1 width=288) (actual time=52.567..222.855 rows=22 loops=1) Index Cond: ((job_name = 'COPY FILES'::bpchar) AND (job_start >= '2005-11-14 00:00:00'::timestamp without time zone)) Filter: ((job_stop <= '2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)) -> Index Scan using node_id_pkey on node n (cost=0.00..5.99 rows=1 width=181) (actual time=0.022..0.022 rows=0 loops=22) Index Cond: ("outer".node_id = n.node_id) Filter: ((name ~~ '711%'::text) AND ("type" = 'K'::bpchar) AND ("usage" = 'LIVE'::bpchar)) Total runtime: 223.677 ms (10 rows) EXPLAIN ANALYSE SELECT n.name, n.type, n.usage FROM node n WHERE n.name LIKE '56x%' AND n.type = 'K' AND n.usage = 'LIVE' AND n.node_id NOT IN (SELECT n.node_id FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '711%' AND n.type = 'K' AND n.usage = 'LIVE' AND j.job_name = 'COPY FILES' AND j.job_start >= '2005-11-14 00:00:00' AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL)) ORDER BY n.name; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=18141.89..18141.89 rows=1 width=177) (actual time=223.495..223.627 rows=205 loops=1) Sort Key: name -> Seq Scan on node n (cost=18049.22..18141.88 rows=1 width=177) (actual time=220.293..222.526 rows=205 loops=1) Filter: ((name ~~ '56x%'::text) AND ("type" = 'K'::bpchar) AND ("usage" = 'LIVE'::bpchar) AND (NOT (hashed subplan))) SubPlan -> Nested Loop (cost=0.00..18049.22 rows=1 width=4) (actual time=198.343..220.195 rows=3 loops=1) -> Index Scan using idx_job_log_filter on job_log j (cost=0.00..18043.21 rows=1 width=4) (actual time=50.748..219.741 rows=22 loops=1) Index Cond: ((job_name = 'COPY FILES'::bpchar) AND (job_start >= '2005-11-14 00:00:00'::timestamp without time zone)) Filter: ((job_stop <= '2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)) -> Index Scan using node_id_pkey on node n (cost=0.00..5.99 rows=1 width=4) (actual time=0.015..0.016 rows=0 loops=22) Index Cond: ("outer".node_id = n.node_id) Filter: ((name ~~ '711%'::text) AND ("type" = 'K'::bpchar) AND ("usage" = 'LIVE'::bpchar)) Total runtime: 223.860 ms (13 rows)
pgsql-performance by date: