Thread: How to improve sql query to achieve the better plan
I have the below query which is taking 1873 ms. How can I improve this?
explain analyze select
sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0) * (1 - coalesce(workitems.discount, 0)/ 100)) as total_budget_cents,
sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0) * (1 - coalesce(workitems.discount, 0)/ 100) + coalesce(additional_cost_cents, 0) - coalesce(cost_reduction_cents, 0)) as final_budget_cents,
projects.id as project_id
from
projects
left join workitems on
workitems.project_id = projects.id
where
workitems.deleted_at is null
group by
order by
project_id asc
And explain output is:
Sort (cost=62851.33..62856.07 rows=1897 width=35) (actual time=1872.867..1873.003 rows=1229 loops=1)
Sort Key: projects.id
Sort Method: quicksort Memory: 145kB
-> HashAggregate (cost=62719.59..62748.04 rows=1897 width=35) (actual time=1871.281..1872.104 rows=1229 loops=1)
Group Key: projects.id
-> Hash Right Join (cost=159.68..45386.32 rows=364911 width=35) (actual time=2.226..637.936 rows=365784 loops=1)
Hash Cond: (workitems.project_id = projects.id)
Filter: (workitems.deleted_at IS NULL)
Rows Removed by Filter: 257457
-> Seq Scan on workitems (cost=0.00..36655.53 rows=623353 width=43) (actual time=0.020..220.215 rows=623175 loops=1)
-> Hash (cost=135.97..135.97 rows=1897 width=16) (actual time=2.177..2.177 rows=1897 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 105kB
-> Seq Scan on projects (cost=0.00..135.97 rows=1897 width=16) (actual time=0.013..1.451 rows=1897 loops=1)
Planning time: 2.775 ms
Execution time: 1873.308 ms
Projects table has the index:
Indexes:
"projects_pkey" PRIMARY KEY, btree (id)
"index_projects_on_company_id" btree (company_id)
"index_projects_on_deleted_at" btree (deleted_at)
"index_projects_on_inspector_id" btree (inspector_id)
"index_projects_on_managed_offline_by_user_id" btree (managed_offline_by_user_id)
"index_projects_on_project_status_id" btree (project_status_id)
"index_projects_on_shipyard_id" btree (shipyard_id)
"index_projects_on_vessel_id" btree (vessel_id)
Workitems table has the index:
Indexes:
"workitems_pkey" PRIMARY KEY, btree (id)
"index_workitems_on_company_id" btree (company_id)
"index_workitems_on_deleted_at" btree (deleted_at)
"index_workitems_on_parent_workitem_id" btree (parent_workitem_id)
"index_workitems_on_project_id" btree (project_id)
"index_workitems_on_standard_workitem_id" btree (standard_workitem_id)
"index_workitems_on_workitem_category_id" btree (workitem_category_id)
Hi
ne 30. 9. 2018 v 18:23 odesílatel Arup Rakshit <ar@zeit.io> napsal:
I have the below query which is taking 1873 ms. How can I improve this?explain analyze selectsum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0) * (1 - coalesce(workitems.discount, 0)/ 100)) as total_budget_cents,sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0) * (1 - coalesce(workitems.discount, 0)/ 100) + coalesce(additional_cost_cents, 0) - coalesce(cost_reduction_cents, 0)) as final_budget_cents,projects.id as project_idfromprojectsleft join workitems onworkitems.project_id = projects.idwhereworkitems.deleted_at is nullgroup byorder byproject_id ascAnd explain output is:Sort (cost=62851.33..62856.07 rows=1897 width=35) (actual time=1872.867..1873.003 rows=1229 loops=1)Sort Key: projects.idSort Method: quicksort Memory: 145kB-> HashAggregate (cost=62719.59..62748.04 rows=1897 width=35) (actual time=1871.281..1872.104 rows=1229 loops=1)Group Key: projects.id-> Hash Right Join (cost=159.68..45386.32 rows=364911 width=35) (actual time=2.226..637.936 rows=365784 loops=1)Hash Cond: (workitems.project_id = projects.id)Filter: (workitems.deleted_at IS NULL)Rows Removed by Filter: 257457-> Seq Scan on workitems (cost=0.00..36655.53 rows=623353 width=43) (actual time=0.020..220.215 rows=623175 loops=1)-> Hash (cost=135.97..135.97 rows=1897 width=16) (actual time=2.177..2.177 rows=1897 loops=1)Buckets: 2048 Batches: 1 Memory Usage: 105kB-> Seq Scan on projects (cost=0.00..135.97 rows=1897 width=16) (actual time=0.013..1.451 rows=1897 loops=1)Planning time: 2.775 msExecution time: 1873.308 ms
maybe conditional index can help
CREATE INDEX ON workitems(project_id) WHERE deleted_at is null
Regards
Pavel
Projects table has the index:Indexes:"projects_pkey" PRIMARY KEY, btree (id)"index_projects_on_company_id" btree (company_id)"index_projects_on_deleted_at" btree (deleted_at)"index_projects_on_inspector_id" btree (inspector_id)"index_projects_on_managed_offline_by_user_id" btree (managed_offline_by_user_id)"index_projects_on_project_status_id" btree (project_status_id)"index_projects_on_shipyard_id" btree (shipyard_id)"index_projects_on_vessel_id" btree (vessel_id)Workitems table has the index:Indexes:"workitems_pkey" PRIMARY KEY, btree (id)"index_workitems_on_company_id" btree (company_id)"index_workitems_on_deleted_at" btree (deleted_at)"index_workitems_on_parent_workitem_id" btree (parent_workitem_id)"index_workitems_on_project_id" btree (project_id)"index_workitems_on_standard_workitem_id" btree (standard_workitem_id)"index_workitems_on_workitem_category_id" btree (workitem_category_id)
I just added it as you said, but I am getting same plan.
Sort (cost=62842.16..62846.91 rows=1897 width=35) (actual time=1845.831..1845.950 rows=1229 loops=1)
Sort Key: projects.id
Sort Method: quicksort Memory: 145kB
-> HashAggregate (cost=62710.42..62738.88 rows=1897 width=35) (actual time=1844.178..1845.060 rows=1229 loops=1)
Group Key: projects.id
-> Hash Right Join (cost=159.68..45382.09 rows=364807 width=35) (actual time=1.534..618.717 rows=365784 loops=1)
Hash Cond: (workitems.project_id = projects.id)
Filter: (workitems.deleted_at IS NULL)
Rows Removed by Filter: 257457
-> Seq Scan on workitems (cost=0.00..36653.75 rows=623175 width=43) (actual time=0.047..213.842 rows=623175 loops=1)
-> Hash (cost=135.97..135.97 rows=1897 width=16) (actual time=1.478..1.478 rows=1897 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 105kB
-> Seq Scan on projects (cost=0.00..135.97 rows=1897 width=16) (actual time=0.006..0.914 rows=1897 loops=1)
Planning time: 0.498 ms
Execution time: 1846.100 ms
——————
Indexes:
"workitems_pkey" PRIMARY KEY, btree (id)
"index_workitems_on_company_id" btree (company_id)
"index_workitems_on_deleted_at" btree (deleted_at)
"index_workitems_on_parent_workitem_id" btree (parent_workitem_id)
"index_workitems_on_project_id" btree (project_id)
"index_workitems_on_standard_workitem_id" btree (standard_workitem_id)
"index_workitems_on_workitem_category_id" btree (workitem_category_id)
"patrial_index_workitems_200_1" btree (project_id) WHERE deleted_at IS NULL
On 30-Sep-2018, at 10:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:CREATE INDEX ON workitems(project_id) WHERE deleted_at is null
ne 30. 9. 2018 v 18:49 odesílatel Arup Rakshit <ar@zeit.io> napsal:
I just added it as you said, but I am getting same plan.Sort (cost=62842.16..62846.91 rows=1897 width=35) (actual time=1845.831..1845.950 rows=1229 loops=1)Sort Key: projects.idSort Method: quicksort Memory: 145kB-> HashAggregate (cost=62710.42..62738.88 rows=1897 width=35) (actual time=1844.178..1845.060 rows=1229 loops=1)Group Key: projects.id-> Hash Right Join (cost=159.68..45382.09 rows=364807 width=35) (actual time=1.534..618.717 rows=365784 loops=1)Hash Cond: (workitems.project_id = projects.id)Filter: (workitems.deleted_at IS NULL)Rows Removed by Filter: 257457-> Seq Scan on workitems (cost=0.00..36653.75 rows=623175 width=43) (actual time=0.047..213.842 rows=623175 loops=1)-> Hash (cost=135.97..135.97 rows=1897 width=16) (actual time=1.478..1.478 rows=1897 loops=1)Buckets: 2048 Batches: 1 Memory Usage: 105kB-> Seq Scan on projects (cost=0.00..135.97 rows=1897 width=16) (actual time=0.006..0.914 rows=1897 loops=1)Planning time: 0.498 msExecution time: 1846.100 ms
Then there is not too much what can be done better - maybe you can try PostgreSQL 11 with paralel hash join -- it is process about 6M rows, the time about 2 sec is good
——————Indexes:"workitems_pkey" PRIMARY KEY, btree (id)"index_workitems_on_company_id" btree (company_id)"index_workitems_on_deleted_at" btree (deleted_at)"index_workitems_on_parent_workitem_id" btree (parent_workitem_id)"index_workitems_on_project_id" btree (project_id)"index_workitems_on_standard_workitem_id" btree (standard_workitem_id)"index_workitems_on_workitem_category_id" btree (workitem_category_id)"patrial_index_workitems_200_1" btree (project_id) WHERE deleted_at IS NULLOn 30-Sep-2018, at 10:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:CREATE INDEX ON workitems(project_id) WHERE deleted_at is null