Re: Join query on 1M row table slow - Mailing list pgsql-general
From | CSN |
---|---|
Subject | Re: Join query on 1M row table slow |
Date | |
Msg-id | 20040210223727.29023.qmail@web40605.mail.yahoo.com Whole thread Raw |
In response to | Re: Join query on 1M row table slow ("scott.marlowe" <scott.marlowe@ihs.com>) |
Responses |
Re: Join query on 1M row table slow
|
List | pgsql-general |
I disabled enable_hashagg and enable_nestloop. Appears to have made both queries worse :( QUERY 1: => explain analyze select p.* from product_categories pc inner join products p on pc.product_id = p.id where pc.category_id = 1016 order by p.title limit 25 offset 10000; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=65999.78..65999.78 rows=1 width=290) (actual time=7736.029..7736.029 rows=0 loops=1) -> Sort (cost=65997.31..65999.78 rows=986 width=290) (actual time=7723.794..7730.352 rows=2358 loops=1) Sort Key: p.title -> Merge Join (cost=65306.35..65948.28 rows=986 width=290) (actual time=7028.790..7614.223 rows=2358 loops=1) Merge Cond: ("outer".product_id = "inner".id) -> Sort (cost=3656.31..3658.78 rows=986 width=4) (actual time=102.115..105.357 rows=2358 loops=1) Sort Key: pc.product_id -> Index Scan using idx_pc_category_id on product_categories pc (cost=0.00..3607.28 rows=986 width=4) (actual time=0.349..94.173 rows=2358 loops=1) Index Cond: (category_id = 1016) -> Sort (cost=61650.04..61963.62 rows=125430 width=290) (actual time=6926.394..7272.130 rows=124521 loops=1) Sort Key: p.id -> Seq Scan on products p (cost=0.00..6638.30 rows=125430 width=290) (actual time=0.102..2855.358 rows=124753 loops=1) Total runtime: 8003.067 ms (13 rows) QUERY 2: => explain analyze select p.* from products p where p.id in ( select product_id from product_categories pc where pc.category_id = 1016) order by p.title limit 25 offset 0; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=10931.85..10931.91 rows=25 width=290) (actual time=3667.396..3667.526 rows=25 loops=1) -> Sort (cost=10931.85..10932.13 rows=111 width=290) (actual time=3667.384..3667.453 rows=25 loops=1) Sort Key: p.title -> Hash Join (cost=3661.52..10928.08 rows=111 width=290) (actual time=111.198..1615.324 rows=2358 loops=1) Hash Cond: ("outer".id = "inner".product_id) -> Seq Scan on products p (cost=0.00..6638.30 rows=125430 width=290) (actual time=0.113..1039.900 rows=124753 loops=1) -> Hash (cost=3661.24..3661.24 rows=111 width=4) (actual time=110.932..110.932 rows=0 loops=1) -> Unique (cost=3656.31..3661.24 rows=111 width=4) (actual time=97.255..106.798 rows=2358 loops=1) -> Sort (cost=3656.31..3658.78 rows=986 width=4) (actual time=97.247..99.998 rows=2358 loops=1) Sort Key: pc.product_id -> Index Scan using idx_pc_category_id on product_categories pc (cost=0.00..3607.28 rows=986 width=4) (actual time=0.327..88.436 rows=2358 loops=1) Index Cond: (category_id = 1016) Total runtime: 3669.479 ms (13 rows) > upping the analyze target on those two tables may > help a bit too. How exactly do I do that? SELECT * from thanks limit 1000 ;) CSN --- "scott.marlowe" <scott.marlowe@ihs.com> wrote: > Well, it looks like the number of rows estimate for > the nested loop in the > first query and the hash agg in the second are off > by a factor 3 for the > first query, and a factor of 20 for the second. Try > running number 1 with > set enable_nestloop = off > and see if the first one gets faster. > > You might also wanna try turning off hash aggregate > on the second one and > see how that works. > > upping the analyze target on those two tables may > help a bit too. > > On Tue, 10 Feb 2004, CSN wrote: > > > > > Doh! Yeah, now I remember ;) > > > > QUERY 1: > > > > => explain analyze select p.* from > product_categories > > pc inner join products p on pc.product_id = p.id > where > > pc.category_id = 1016 order by p.title limit 25 > offset > > 0; > > > > > QUERY PLAN > > > > > > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Limit (cost=9595.99..9596.05 rows=25 width=290) > > (actual time=514.808..514.942 rows=25 loops=1) > > -> Sort (cost=9595.99..9598.45 rows=986 > > width=290) (actual time=514.794..514.868 rows=25 > > loops=1) > > Sort Key: p.title > > -> Nested Loop (cost=0.00..9546.96 > rows=986 > > width=290) (actual time=0.672..421.732 rows=2358 > > loops=1) > > -> Index Scan using > idx_pc_category_id > > on product_categories pc (cost=0.00..3607.28 > rows=986 > > width=4) (actual time=0.343..125.762 rows=2358 > > loops=1) > > Index Cond: (category_id = > 1016) > > -> Index Scan using > pkey_products_id > > on products p (cost=0.00..6.01 rows=1 width=290) > > (actual time=0.075..0.083 rows=1 loops=2358) > > Index Cond: > ("outer".product_id = > > p.id) > > Total runtime: 516.174 ms > > (9 rows) > > > > > > QUERY 2: > > > > => explain analyze select p.* from products p > where > > p.id in ( select product_id from > product_categories pc > > where pc.category_id = 1016) order by p.title > limit 25 > > offset 0; > > > > > QUERY PLAN > > > > > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Limit (cost=4282.18..4282.24 rows=25 width=290) > > (actual time=447.852..447.979 rows=25 loops=1) > > -> Sort (cost=4282.18..4282.46 rows=111 > > width=290) (actual time=447.836..447.904 rows=25 > > loops=1) > > Sort Key: p.title > > -> Nested Loop (cost=3609.75..4278.41 > > rows=111 width=290) (actual time=104.256..358.182 > > rows=2358 loops=1) > > -> HashAggregate > > (cost=3609.75..3609.75 rows=111 width=4) (actual > > time=103.922..114.836 rows=2358 loops=1) > > -> Index Scan using > > idx_pc_category_id on product_categories pc > > (cost=0.00..3607.28 rows=986 width=4) (actual > > time=0.401..92.253 rows=2358 loops=1) > > Index Cond: > (category_id = > > 1016) > > -> Index Scan using > pkey_products_id > > on products p (cost=0.00..6.01 rows=1 width=290) > > (actual time=0.069..0.076 rows=1 loops=2358) > > Index Cond: (p.id = > > "outer".product_id) > > Total runtime: 449.370 ms > > (10 rows) > > > > > > -CSN > > > > > > --- "scott.marlowe" <scott.marlowe@ihs.com> wrote: > > > On Tue, 10 Feb 2004, CSN wrote: > > > > > > > > 2. Vacuum analyze the tables concerned and > post > > > the > > > > > output of EXPLAIN ANALYSE > > > > > with your query. > > > > > > > > => explain analyze; > > > > > > > > results in: > > > > > > > > ERROR: syntax error at or near ";" at > character > > > 16 > > > > > > No silly. you do: > > > > > > explain analyze select ... (rest of the > query...) > > > > > > and it runs the query and tells you how long > each > > > bit took and what it > > > THOUGHT it would get back in terms of number of > rows > > > and what it actually > > > got back. > > > > > > Let us know... > > > > > > > > > __________________________________ > > Do you Yahoo!? > > Yahoo! Finance: Get your refund fast by filing > online. > > http://taxes.yahoo.com/filing.html > > > __________________________________ Do you Yahoo!? Yahoo! Finance: Get your refund fast by filing online. http://taxes.yahoo.com/filing.html
pgsql-general by date: