Thread: PostgreSQL 12.8 Same Query Same Execution Plan Different Time
Hello: I ran 2 same queries on PostgreSQL 12.8 machine running in AWS RDS, the first time I ran the query the query plan was: GroupAggregate (cost=455652.07..455664.99 rows=340 width=16) (actual time=124047.119..124048.777 rows=294 loops=1) Group Key: dvh.history_date -> Sort (cost=455652.07..455655.24 rows=1269 width=8) (actual time=124046.989..124047.857 rows=7780 loops=1) Sort Key: dvh.history_date Sort Method: quicksort Memory: 557kB -> Nested Loop (cost=13708.92..455586.66 rows=1269 width=8) (actual time=12.228..124039.597 rows=7780 loops=1) -> HashAggregate (cost=13708.35..13746.45 rows=3810 width=4) (actual time=12.210..21.629 rows=6119 loops=1) Group Key: pdv.id -> Nested Loop (cost=0.84..13698.83 rows=3810 width=4) (actual time=0.038..10.650 rows=6119 loops=1) -> Index Scan using idx_pc_device_acct_num on pc_devices (cost=0.42..565.32 rows=506 width=4)(actual time=0.014..0.193 rows=396 loops=1) Index Cond: ((account_number)::text = 'AB32823833'::text) -> Index Scan using pdv_uindex on pdv (cost=0.42..25.73 rows=23 width=8) (actual time=0.007..0.024rows=15 loops=396) Index Cond: (pc_device_id = pc_devices.id) -> Index Scan using idx_pc_dvh_dvh_id on pdv_history dvh (cost=0.57..115.96 rows=1 width=12) (actual time=0.342..20.265rows=1 loops=6119) Index Cond: (pdv_id = pdv.id) Filter: (status_changed AND (status_id = 1)) Rows Removed by Filter: 187 Planning Time: 1.050 ms Execution Time: 124048.850 ms After that, I reran the same query again. The plan is basically the same: GroupAggregate (cost=455652.07..455664.99 rows=340 width=16) (actual time=12180.624..12182.286 rows=294 loops=1) Group Key: dvh.history_date -> Sort (cost=455652.07..455655.24 rows=1269 width=8) (actual time=12180.493..12181.363 rows=7780 loops=1) Sort Key: dvh.history_date Sort Method: quicksort Memory: 557kB -> Nested Loop (cost=13708.92..455586.66 rows=1269 width=8) (actual time=1709.341..12177.249 rows=7780 loops=1) -> HashAggregate (cost=13708.35..13746.45 rows=3810 width=4) (actual time=1709.319..1713.171 rows=6119 loops=1) Group Key: pdv.id -> Nested Loop (cost=0.84..13698.83 rows=3810 width=4) (actual time=0.379..1706.606 rows=6119 loops=1) -> Index Scan using idx_pc_device_acct_num on pc_devices (cost=0.42..565.32 rows=506 width=4)(actual time=0.013..0.279 rows=396 loops=1) Index Cond: ((account_number)::text = 'AB32823833'::text) -> Index Scan using pdv_uindex on pdv (cost=0.42..25.73 rows=23 width=8) (actual time=0.289..4.306rows=15 loops=396) Index Cond: (pc_device_id = pc_devices.id) -> Index Scan using idx_pc_dvh_dvh_id on pdv_history dvh (cost=0.57..115.96 rows=1 width=12) (actual time=0.063..1.709rows=1 loops=6119) Index Cond: (pdv_id = pdv.id) Filter: (status_changed AND (status_id = 1)) Rows Removed by Filter: 187 Planning Time: 1.262 ms Execution Time: 12182.361 ms But the gap in the execution time between the two same queries is quite huge : 2 minutes vs 12 seconds. I noticed that different is actually in Nested Loop join. One is taking 2 minutes, other is taking 12 seconds. I find thispuzzling as I assume the nested loop should be done in memory. The disk is gp2 SDD so I'm even more baffled by this. What could be the factors that affect the speed of nested loop. I noticefor that both loops the rows is 7780 and loops is 1. I don't think those are big numbers It was only after the running the 2 queries that I realize I could do EXPLAIN (ANALYZE, BUFFERS), but I couldn't reproducethe slowness. Below are other information that might be relevant: The database has been vacuum analyzed before running the queries. Platform : AWS RDS PG version : 12.8 effective_cache_size : 7.9 GB (7935800kB) shared_buffers : 3.9 GB (3967896kB) work_mem : 64MB random_page_cost : 1.1 Instance type : db.m6g.xlarge (4 vCPUs / 32 GB RAM) Database is idle. I'm the only one running the query version : PostgreSQL 12.8 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), 64-bit Thanks, Ludwig
I noticed that different is actually in Nested Loop join. One is taking 2 minutes, other is taking 12 seconds. I find this puzzling as I assume the nested loop should be done in memory.
The disk is gp2 SDD so I'm even more baffled by this. What could be the factors that affect the speed of nested loop. I notice for that both loops the rows is 7780 and loops is 1. I don't think those are big numbers
It was only after the running the 2 queries that I realize I could do EXPLAIN (ANALYZE, BUFFERS), but I couldn't reproduce the slowness.
Michel SALAIS
De : David G. Johnston <david.g.johnston@gmail.com>
Envoyé : mercredi 19 janvier 2022 16:11
À : Ludwig Isaac Lim <ludz_lim@yahoo.com>
Cc : pgsql-performance@postgresql.org
Objet : Re: PostgreSQL 12.8 Same Query Same Execution Plan Different Time
On Wed, Jan 19, 2022 at 7:59 AM Ludwig Isaac Lim <ludz_lim@yahoo.com> wrote:
I noticed that different is actually in Nested Loop join. One is taking 2 minutes, other is taking 12 seconds. I find this puzzling as I assume the nested loop should be done in memory.
Everything is done in memory, but the data has to get there first (hence BUFFERS as you figured out below).
The disk is gp2 SDD so I'm even more baffled by this. What could be the factors that affect the speed of nested loop. I notice for that both loops the rows is 7780 and loops is 1. I don't think those are big numbers
The loops are ~= 400 and 6,000
It was only after the running the 2 queries that I realize I could do EXPLAIN (ANALYZE, BUFFERS), but I couldn't reproduce the slowness.
Did you (can you even in RDS) attempt to clear those buffers? If the first query ran slowly because none of the data was in memory (which you don't know for certain because you didn't run with BUFFERS option then) then subsequent runs would indeed be faster (the implementation of shared buffers having fulfilled one of its major purposes in life).
I'll agree buffers for that query does not seem to account for nearly two minutes...though as RDS is a shared resource I'd probably chalk at least some of it to contention on the underlying hardware (disk likely being more problematic than memory).
David J.
Hi,
Another point to check is eventually IOPS…
It depends on the contracted service, If the quantity of IOPS is guaranteed or not. When it is not guaranteed and a sufficiently heavy load (in I/O) was executed for a while, the value of IOPS falls down dramatically and then you are sure to have performance problems…
Michel SALAIS