Thread: PG 9.5 same SQL 2 different plans
Hi Gurus,
We upgraded to PG 9.5.3, and in the last few days we have been seeing a strange optimization issue with one of our SQL statements:
roughly every 15-20 executions of the same SQL, the optimizer chooses the wrong (most expensive) plan, which generates approximately
50 GB of temp files and runs for about 20 minutes. We cannot understand the reason (we run VACUUM ANALYZE daily).
Pg conf values:
random_page_cost=3.0
default_statistics_target=100
work_mem=400MB
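As a quick sanity check (a sketch, assuming psql access to the server), the values the planner is actually using can be read from pg_settings:

    -- Sketch: confirm the live planner settings on the server.
    SELECT name, setting, unit, source
    FROM pg_settings
    WHERE name IN ('random_page_cost', 'default_statistics_target', 'work_mem');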
Here is the SQL:
EXPLAIN ANALYZE
SELECT ObservationSkeleton.lastModified, ObservationSkeleton.maxLastModified, ObservationSkeleton.stateCode, ObservationSkeleton.obsID,
       PlaneSkeleton.lastModified, PlaneSkeleton.maxLastModified, PlaneSkeleton.stateCode, PlaneSkeleton.planeID,
       ArtifactSkeleton.lastModified, ArtifactSkeleton.maxLastModified, ArtifactSkeleton.stateCode, ArtifactSkeleton.artifactID,
       PartSkeleton.lastModified, PartSkeleton.maxLastModified, PartSkeleton.stateCode, PartSkeleton.partID,
       ChunkSkeleton.lastModified, ChunkSkeleton.maxLastModified, ChunkSkeleton.stateCode, ChunkSkeleton.chunkID
FROM cvodb.caom2.Observation AS ObservationSkeleton
    LEFT OUTER JOIN cvodb.caom2.Plane AS PlaneSkeleton
        LEFT OUTER JOIN cvodb.caom2.Artifact AS ArtifactSkeleton
            LEFT OUTER JOIN cvodb.caom2.Part AS PartSkeleton
                LEFT OUTER JOIN cvodb.caom2.Chunk AS ChunkSkeleton
                ON PartSkeleton.partID = ChunkSkeleton.partID
            ON ArtifactSkeleton.artifactID = PartSkeleton.artifactID
        ON PlaneSkeleton.planeID = ArtifactSkeleton.planeID
    ON ObservationSkeleton.obsID = PlaneSkeleton.obsID
WHERE ObservationSkeleton.obsID = '00000000-0000-0000-0603-e3ee45c5b6b7'
ORDER BY ObservationSkeleton.obsID, PlaneSkeleton.planeID, ArtifactSkeleton.artifactID, PartSkeleton.partID;
*** Here is the wrong plan (using > 40 GB of temp files, running seq scans on approx. 60 million rows, exec time ~20 min):
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=34130921.12..34130939.82 rows=7480 width=180)
   Sort Key: planeskeleton.planeid, artifactskeleton.artifactid, partskeleton.partid
   ->  Nested Loop Left Join  (cost=8160852.43..34130439.82 rows=7480 width=180)
         Join Filter: (observationskeleton.obsid = planeskeleton.obsid)
         ->  Index Scan using observation_pkey on observation observationskeleton  (cost=0.43..6.45 rows=1 width=36)
               Index Cond: (obsid = '00000000-0000-0000-07a6-0143085f9ddf'::uuid)
         ->  Hash Right Join  (cost=8160851.99..34130339.87 rows=7480 width=160)
               Hash Cond: (artifactskeleton.planeid = planeskeleton.planeid)
               ->  Hash Right Join  (cost=8158554.13..33707311.38 rows=112174888 width=124)
                     Hash Cond: (partskeleton.artifactid = artifactskeleton.artifactid)
                     ->  Hash Right Join  (cost=6305604.98..22889335.83 rows=112174888 width=88)
                           Hash Cond: (chunkskeleton.partid = partskeleton.partid)
                           ->  Seq Scan on chunk chunkskeleton  (cost=0.00..5047469.16 rows=60691316 width=52)
                           ->  Hash  (cost=3807960.88..3807960.88 rows=112174888 width=52)
                                 ->  Seq Scan on part partskeleton  (cost=0.00..3807960.88 rows=112174888 width=52)
                     ->  Hash  (cost=1194034.40..1194034.40 rows=29593340 width=52)
                           ->  Seq Scan on artifact artifactskeleton  (cost=0.00..1194034.40 rows=29593340 width=52)
               ->  Hash  (cost=2288.29..2288.29 rows=766 width=52)
                     ->  Bitmap Heap Scan on plane planeskeleton  (cost=21.50..2288.29 rows=766 width=52)
                           Recheck Cond: (obsid = '00000000-0000-0000-07a6-0143085f9ddf'::uuid)
                           ->  Bitmap Index Scan on i_obsid  (cost=0.00..21.30 rows=766 width=0)
                                 Index Cond: (obsid = '00000000-0000-0000-07a6-0143085f9ddf'::uuid)
(22 rows)
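The rows=766 estimate on the Bitmap Heap Scan above can be compared against the statistics PostgreSQL keeps for that column. A sketch (the schema, table, and column names are taken from the query and plans in this thread):

    -- Sketch: inspect the statistics behind the plane.obsid row estimate.
    SELECT attname, n_distinct, null_frac, most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE schemaname = 'caom2' AND tablename = 'plane' AND attname = 'obsid';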
**** And the correct plan (no temp files, no seq scans, exec time ~6 ms):
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=24779053.69..24779054.41 rows=289 width=180) (actual time=4.539..4.564 rows=264 loops=1)
   Sort Key: planeskeleton.planeid, artifactskeleton.artifactid, partskeleton.partid
   Sort Method: quicksort  Memory: 95kB
   ->  Nested Loop Left Join  (cost=2.69..24779041.88 rows=289 width=180) (actual time=0.375..3.967 rows=264 loops=1)
         Join Filter: (observationskeleton.obsid = planeskeleton.obsid)
         ->  Index Scan using observation_pkey on observation observationskeleton  (cost=0.43..6.45 rows=1 width=36) (actual time=0.077..0.077 rows=1 loops=1)
               Index Cond: (obsid = '00000000-0000-0000-0603-e3ee45c5b6b7'::uuid)
         ->  Nested Loop Left Join  (cost=2.25..24779031.81 rows=289 width=160) (actual time=0.285..3.702 rows=264 loops=1)
               ->  Index Scan using i_obsid on plane planeskeleton  (cost=0.56..88.02 rows=28 width=52) (actual time=0.078..0.109 rows=8 loops=1)
                     Index Cond: (obsid = '00000000-0000-0000-0603-e3ee45c5b6b7'::uuid)
               ->  Nested Loop Left Join  (cost=1.69..884948.84 rows=1344 width=124) (actual time=0.143..0.436 rows=33 loops=8)
                     ->  Nested Loop Left Join  (cost=1.13..878643.86 rows=1344 width=88) (actual time=0.099..0.154 rows=33 loops=8)
                           ->  Index Scan using i_planeid on artifact artifactskeleton  (cost=0.56..1077.84 rows=356 width=52) (actual time=0.051..0.053 rows=2 loops=8)
                                 Index Cond: (planeskeleton.planeid = planeid)
                           ->  Index Scan using i_artifactid on part partskeleton  (cost=0.57..2453.44 rows=1163 width=52) (actual time=0.035..0.052 rows=22 loops=12)
                                 Index Cond: (artifactskeleton.artifactid = artifactid)
                     ->  Index Scan using i_partid on chunk chunkskeleton  (cost=0.56..4.68 rows=1 width=52) (actual time=0.006..0.007 rows=1 loops=264)
                           Index Cond: (partskeleton.partid = partid)
Planning time: 8.030 ms
Execution time: 5.128 ms
Isabella
ghiureai <isabella.ghiurea@nrc-cnrc.gc.ca> writes:
> we upgrade to PG 9.5.3, in last days we are seeing a strange
> optimization issues with one of the SQL :
> running same SQL every 15-20 times optimizer will choose( wrong
> plan)/ most expensive which generates approx
> 50 GB temp files and runs for aprox 20 min , we can not understand the
> reason ( we run vacuum analyze daily),

It looks like it's flipping between two different plans depending on the
estimate of the number of "planeskeleton" rows matching the particular
"obsid" value you're requesting.  The cost estimates for those plans
aren't that far apart (34M units vs 25M), but reality is way different.

> Pg conf values:
> random_page_cost=3.0
> defalult_statistics_taget=100

I think you have two problems here.  The big one is that the planner is
way overestimating the actual costs of indexscans, which probably means
your database is entirely held in RAM and you ought to knock
random_page_cost down to 1.  (But see the usual caveats that fooling with
cost parameters on the basis of a single example query is dangerous.)

A lesser problem is that the rowcount estimates aren't very close, which
also contributes to overestimating the costs of indexscans.  It's
possible that would get better if you increased default_statistics_target,
though it's hard to be sure.

			regards, tom lane
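A minimal sketch of how the two suggestions above could be applied, assuming the mis-estimated column is caom2.plane.obsid (inferred from the plans earlier in the thread) and using 500 only as an illustrative per-column statistics target, not a value recommended in the thread:

    -- Sketch: lower random_page_cost cluster-wide (requires a config reload).
    ALTER SYSTEM SET random_page_cost = 1.0;
    SELECT pg_reload_conf();

    -- Sketch: raise the statistics target for the column whose estimate was
    -- off, then re-gather statistics for that table.
    ALTER TABLE caom2.plane ALTER COLUMN obsid SET STATISTICS 500;
    ANALYZE caom2.plane;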