Thread: Any ideas how can I speed up this query?
--------------+------------------------+----------------------------------------------------------
id | integer | not null default nextval('occurrences_id_seq'::regclass)
gbifid | integer | not null
sname | character varying(512) |
cname | character varying(512) |
species | character varying(512) |
location | geometry | not null
month | integer |
year | integer |
event_date | date |
dataset_key | character varying(512) |
taxon_key | character varying(512) |
taxon_rank | character varying(512) |
record_basis | character varying(512) |
category_id | integer |
country | character varying(512) |
lat | double precision |
lng | double precision |
Indexes:
"occurrences_pkey" PRIMARY KEY, btree (id)
"unique_occurrences_gbifid" UNIQUE, btree (gbifid)
"index_occurences_taxon_key" btree (taxon_key)
"index_occurrences_category_id" btree (category_id)
"index_occurrences_cname" btree (cname)
"index_occurrences_country" btree (country)
"index_occurrences_lat" btree (lat)
"index_occurrences_lng" btree (lng)
"index_occurrences_month" btree (month)
"index_occurrences_sname" btree (sname)
"occurrence_location_gix" gist (location)
SELECT COUNT(*) FROM "occurrences" WHERE ("lat" >= -27.91550355958 AND "lat" <= -27.015680440420002 AND "lng" >= 152.13307044728307 AND "lng" <= 153.03137355271693 AND "category_id" = 1 AND (ST_Intersects( ST_Buffer(ST_PointFromText('POINT(152.582222 -27.465592)')::geography, 50000)::geography, location::geography)));
The problem is it takes more than acceptable time to execute the query. Below is the explain analyze output for the same query.
EXPLAIN ANALYZE QUERY OUTPUT (http://explain.depesz.com/s/p2a)
Aggregate (cost=127736.06..127736.07 rows=1 width=0) (actual time=13491.678..13491.679 rows=1 loops=1)
Buffers: shared hit=3 read=56025
-> Bitmap Heap Scan on occurrences (cost=28249.46..127731.08 rows=1995 width=0) (actual time=528.053..13388.458 rows=167511 loops=1)
Recheck Cond: ((lat >= (-27.91550355958)::double precision) AND (lat <= (-27.01568044042)::double precision) AND (lng >= 152.133070447283::double precision) AND (lng <= 153.031373552717::double precision))
Rows Removed by Index Recheck: 748669
Filter: ((category_id = 1) AND ('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography && (location)::geography) AND (_st_distance('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography, (location)::geography, 0::double precision, false) < 1e-05::double precision))
Rows Removed by Filter: 6357
Heap Blocks: exact=29947 lossy=22601
Buffers: shared hit=3 read=56025
-> BitmapAnd (cost=28249.46..28249.46 rows=32476 width=0) (actual time=519.091..519.091 rows=0 loops=1)
Buffers: shared read=3477
-> Bitmap Index Scan on index_occurrences_lat (cost=0.00..11691.20 rows=365877 width=0) (actual time=218.999..218.999 rows=392415 loops=1)
Index Cond: ((lat >= (-27.91550355958)::double precision) AND (lat <= (-27.01568044042)::double precision))
Buffers: shared read=1444
-> Bitmap Index Scan on index_occurrences_lng (cost=0.00..16557.01 rows=517658 width=0) (actual time=285.211..285.211 rows=550523 loops=1)
Index Cond: ((lng >= 152.133070447283::double precision) AND (lng <= 153.031373552717::double precision))
Buffers: shared read=2033
Planning time: 2.812 ms
Execution time: 13493.617 ms
(19 rows)
It seems that the planner is underestimating the number of rows returned in Bitmap Heap Scan on occurrences. I have run vacuum analyze on this table couple of times, but it still produces the same result. Any idea how I can speed up this query? How I can assist planner in providing better row estimates for Bitmap Heap Scan section?
POSTGRESQL VERSION INFO
version
------------------------------------------------------------------------------------------------------
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
> > QUERY > > SELECT COUNT(*) FROM "occurrences" WHERE ("lat" >= -27.91550355958 AND "lat" <= -27.015680440420002 AND "lng" >= 152.13307044728307AND "lng" <= 153.03137355271693 AND "category_id" = 1 AND (ST_Intersects( ST_Buffer(ST_PointFromText('POINT(152.582222-27.465592)')::geography, 50000)::geography, location::geography))); > How I can assist planner in providing better row estimates for Bitmap Heap Scan section? By googling this phrase from your EXPLAIN: "Rows Removed by Index Recheck: 748669" - you can find this explanation: http://stackoverflow.com/questions/26418715/postgresql-rows-removed-by-index "The inner Bitmap Index Scan node is producing a bitmap, putting 1 to all the places where records that match your searchkey are found, and 0 otherwise. As your table is quite big, the size of the bitmap is getting bigger, then availablememory for these kind of operations, configured via work_mem, becomes small to keep the whole bitmap. When in lack of a memory, inner node will start producing 1 not for records, but rather for blocks that are known to containmatching records. This means, that outer node Bitmap Heap Scan has to read all records from such block and re-checkthem. Obiously, there'll be some non-matching ones, and their number is what you see as Rows Removed by Index Recheck." Therefore, try substantially increasing your work_mem (use set..... so that it's on a per-session basis, not global) sothat you don't have to read in all the rows to re-check them. This is why Googling phrases from your explain before list-posting is always a good idea :-) BTW - what are your statistics set to? If you have a huge table, it can be worth raising them from the default. http://www.postgresql.org/docs/9.4/static/planner-stats.html ALTER TABLE SET STATISTICS, try raising this to 1000. > POSTGRESQL VERSION INFO For postgis-related questions, remember to also include the postgis version. Hope this helps and good luck Graeme Bell.