Thread: Small query plan change, big performance difference
Greetings,
I have several similar queries that are all suffering from a dramatic slow down after upgrading a RDS instance from 9.3 to 10.3. The query time goes from 28 milliseconds to over 70 seconds I could use some help trying to figure out the problem. This is one of the queries:
SELECT
r.rid as id,
r.name,
u._firstlastname as owner
FROM resource_form r
JOIN aw_user u ON (u.rid=r.fk_user)
LEFT JOIN resource_form_user p on (p.fk_form=r.rid)
WHERE r.fk_user=1 or p.fk_user=1
ORDER BY r.name, r.rid
Using Explain analyze, I get this on 10.3 (https://explain.depesz.com/s/pAdC):
+------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------------+
| Sort (cost=201.35..201.42 rows=27 width=68) (actual time=77590.682..77590.683 rows=8 loops=1) |
| Sort Key: r.name, r.rid |
| Sort Method: quicksort Memory: 25kB |
| -> Nested Loop (cost=127.26..200.71 rows=27 width=68) (actual time=0.519..77590.651 rows=8 loops=1) |
| Join Filter: (r.fk_user = u.rid) |
| Rows Removed by Join Filter: 1052160 |
| -> Index Scan using aw_user_rid_key on aw_user u (cost=0.38..8.39 rows=1 width=840) (actual time=0.023..122.397 rows=131521 loops=1) |
| -> Hash Right Join (cost=126.89..191.84 rows=27 width=40) (actual time=0.004..0.577 rows=8 loops=131521) |
| Hash Cond: (p.fk_form = r.rid) |
| Filter: ((r.fk_user = 1) OR (p.fk_user = 1)) |
| Rows Removed by Filter: 1375 |
| -> Seq Scan on resource_form_user p (cost=0.00..29.90 rows=1990 width=8) (actual time=0.003..0.203 rows=951 loops=131521) |
| -> Hash (cost=93.06..93.06 rows=2706 width=40) (actual time=0.461..0.461 rows=550 loops=1) |
| Buckets: 4096 Batches: 1 Memory Usage: 68kB |
| -> Seq Scan on resource_form r (cost=0.00..93.06 rows=2706 width=40) (actual time=0.005..0.253 rows=550 loops=1) |
| Planning time: 0.322 ms |
| Execution time: 77590.734 ms |
+------------------------------------------------------------------------------------------------------------------------------------------------+
Here is the explain from 9.3 (https://explain.depesz.com/s/rGRf):
+-----------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+-----------------------------------------------------------------------------------------------------------------------------------------+
| Sort (cost=164.49..164.52 rows=10 width=43) (actual time=28.036..28.038 rows=11 loops=1) |
| Sort Key: r.name, r.rid |
| Sort Method: quicksort Memory: 25kB |
| -> Nested Loop (cost=69.23..164.33 rows=10 width=43) (actual time=21.330..27.318 rows=11 loops=1) |
| -> Hash Right Join (cost=68.81..99.92 rows=10 width=33) (actual time=21.283..27.161 rows=11 loops=1) |
| Hash Cond: (p.fk_form = r.rid) |
| Filter: ((r.fk_user = 1) OR (p.fk_user = 1)) |
| Rows Removed by Filter: 1313 |
| -> Seq Scan on resource_form_user p (cost=0.00..14.08 rows=908 width=8) (actual time=1.316..6.346 rows=908 loops=1) |
| -> Hash (cost=62.25..62.25 rows=525 width=33) (actual time=19.927..19.927 rows=527 loops=1) |
| Buckets: 1024 Batches: 1 Memory Usage: 35kB |
| -> Seq Scan on resource_form r (cost=0.00..62.25 rows=525 width=33) (actual time=1.129..19.540 rows=527 loops=1) |
| -> Index Scan using aw_user_rid_key on aw_user u (cost=0.42..6.43 rows=1 width=18) (actual time=0.009..0.009 rows=1 loops=11) |
| Index Cond: (rid = r.fk_user) |
| Total runtime: 28.171 ms |
+-----------------------------------------------------------------------------------------------------------------------------------------+
The plans are very similar, but the results are quite different. In the 10.3 version, I don’t understand why the Hash Right Join is looping through all 131521 user records. Any thoughts?
Thank you,
Michael
Have you run an analyze on all your tables after the upgrade to 10? The estimates are way off.
> Have you run an analyze on all your tables after the upgrade to 10? The estimates are way off. Thank you. I embarrassingly missed that step. That fixed the problem. In the future, if estimates are way off… I’ll runanalyze.