Redundant sub query triggers slow nested loop left join - Mailing list pgsql-performance
From | henk de wit |
---|---|
Subject | Redundant sub query triggers slow nested loop left join |
Date | |
Msg-id | BAY106-F1923F1C1D05D94B1BAF671F5550@phx.gbl Whole thread Raw |
Responses |
Re: Redundant sub query triggers slow nested loop left join
|
List | pgsql-performance |
In investigating a slow query, I distiled the code below from a larger query: SELECT * FROM /* SUBQUERY banners */ ( SELECT * FROM /* SUBQUERY banners_links */ ( SELECT * FROM banners_links WHERE merchant_id = 5631 ) as banners_links WHERE merchant_id = 5631 AND banners_links.status = 0 ) AS banners LEFT OUTER JOIN /* SUBQUERY types */ ( SELECT banner_types.id AS type_id, banner_types.type AS type, banners_banner_types.banner_id AS id FROM banner_types,banners_banner_types WHERE banners_banner_types.banner_id IN /* SUBQUERY */ ( SELECT id FROM banners_links WHERE merchant_id = 5631 ) AND banners_banner_types.type_id = banner_types.id ) AS types USING (id) Obviously, the subquery "banners_links" is redundant. The query however is a generated one, and this redundancy probably wasn't noted before. Logically you would say it shouldn't hurt, but in fact it does. The above query executes painfully slow. The left outer join is killing the performance, as witnessed by the plan: "Nested Loop Left Join (cost=964.12..1480.67 rows=1 width=714) (actual time=20.801..8233.410 rows=553 loops=1)" " Join Filter: (public.banners_links.id = banners_banner_types.banner_id)" " -> Bitmap Heap Scan on banners_links (cost=4.35..42.12 rows=1 width=671) (actual time=0.127..0.690 rows=359 loops=1)" " Recheck Cond: ((merchant_id = 5631) AND (merchant_id = 5631))" " Filter: ((status)::text = '0'::text)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..4.35 rows=10 width=0) (actual time=0.092..0.092 rows=424 loops=1)" " Index Cond: ((merchant_id = 5631) AND (merchant_id = 5631))" " -> Hash Join (cost=959.77..1432.13 rows=514 width=51) (actual time=0.896..22.611 rows=658 loops=359)" " Hash Cond: (banners_banner_types.type_id = banner_types.id)" " -> Hash IN Join (cost=957.32..1422.52 rows=540 width=16) (actual time=0.894..21.878 rows=658 loops=359)" " Hash Cond: (banners_banner_types.banner_id = public.banners_links.id)" " -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.003..10.149 rows=22240 loops=359)" " -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.779..0.779 rows=424 loops=1)" " -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.108..0.513 rows=424 loops=1)" " Recheck Cond: (merchant_id = 5631)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.078..0.078 rows=424 loops=1)" " Index Cond: (merchant_id = 5631)" " -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.033..0.033 rows=20 loops=1)" " -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.017 rows=20 loops=1)" "Total runtime: 8233.710 ms" I noticed that the recheck condition looks a bit weird: Recheck Cond: ((merchant_id = 5631) AND (merchant_id = 5631)) You would think that PG (8.2.3) would be smart enough to optimize this away. Also the estimate of the nested loop left join and the actual results are way off. I tried increasing the statistics of both public.banners_links.id and banners_banner_types.banner_id (to the highest value 1000), analyzed, vacuum analyzed and did a vacuum full, but without any improvements. Anyway, when I remove the redundant sub query the code becomes: SELECT * FROM /* SUBQUERY banners */ ( SELECT * FROM banners_links WHERE merchant_id = 5631 AND banners_links.status = 0 ) AS banners LEFT OUTER JOIN /* SUBQUERY types */ ( SELECT banner_types.id AS type_id, banner_types.type AS type, banners_banner_types.banner_id AS id FROM banner_types,banners_banner_types WHERE banners_banner_types.banner_id IN /* SUBQUERY */ ( SELECT id FROM banners_links WHERE merchant_id = 5631 ) AND banners_banner_types.type_id = banner_types.id ) AS types USING (id) With this query, the execution time drops from 8 seconds to a mere 297 ms! The plan now looks as follows: "Hash Left Join (cost=1449.99..2392.68 rows=2 width=714) (actual time=24.257..25.292 rows=553 loops=1)" " Hash Cond: (public.banners_links.id = banners_banner_types.banner_id)" " -> Bitmap Heap Scan on banners_links (cost=11.43..954.03 rows=2 width=671) (actual time=0.122..0.563 rows=359 loops=1)" " Recheck Cond: (merchant_id = 5631)" " Filter: ((status)::text = '0'::text)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.086..0.086 rows=424 loops=1)" " Index Cond: (merchant_id = 5631)" " -> Hash (cost=1432.13..1432.13 rows=514 width=51) (actual time=24.128..24.128 rows=658 loops=1)" " -> Hash Join (cost=959.77..1432.13 rows=514 width=51) (actual time=1.714..23.606 rows=658 loops=1)" " Hash Cond: (banners_banner_types.type_id = banner_types.id)" " -> Hash IN Join (cost=957.32..1422.52 rows=540 width=16) (actual time=1.677..22.811 rows=658 loops=1)" " Hash Cond: (banners_banner_types.banner_id = public.banners_links.id)" " -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.005..10.306 rows=22240 loops=1)" " -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.772..0.772 rows=424 loops=1)" " -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.105..0.510 rows=424 loops=1)" " Recheck Cond: (merchant_id = 5631)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.077..0.077 rows=424 loops=1)" " Index Cond: (merchant_id = 5631)" " -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.032..0.032 rows=20 loops=1)" " -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.018 rows=20 loops=1)" "Total runtime: 25.602 ms" We see that instead of a nested loop left join the planner now chooses a Hash Left Join. I'm not really an expert in this matter and would like some more insight into what's happening here. Naively I would say that a planner would have to be smart enough to see this by itself? Thanks in advance for all hints. _________________________________________________________________ Talk with your online friends with Messenger http://www.join.msn.com/messenger/overview
pgsql-performance by date: