Sequential scan instead of index scan - Mailing list pgsql-performance
From | Ioannis Anagnostopoulos |
---|---|
Subject | Sequential scan instead of index scan |
Date | |
Msg-id | 501FDDD5.2000106@anatec.com Whole thread Raw |
Responses |
Re: Sequential scan instead of index scan
Re: Sequential scan instead of index scan |
List | pgsql-performance |
Hi, my query is very simple:
select
msg_id,
msg_type,
ship_pos_messages.pos_georef1,
ship_pos_messages.pos_georef2,
ship_pos_messages.pos_georef3,
ship_pos_messages.pos_georef4,
obj_id,
ship_speed,
ship_heading,
ship_course,
pos_point
from
feed_all_y2012m08.ship_pos_messages
where
extract('day' from msg_date_rec) = 1
AND msg_id = any(ARRAY[7294724,14174174,22254408]);
The msg_id is the pkey on the ship_pos_messages table and in this example it is working fast as it uses the pkey (primary key index) to make the selection. The expplain anayze follows:
"Result (cost=0.00..86.16 rows=5 width=117) (actual time=128.734..163.319 rows=3 loops=1)"
" -> Append (cost=0.00..86.16 rows=5 width=117) (actual time=128.732..163.315 rows=3 loops=1)"
" -> Seq Scan on ship_pos_messages (cost=0.00..0.00 rows=1 width=100) (actual time=0.001..0.001 rows=0 loops=1)"
" Filter: ((msg_id = ANY ('{7294724,14174174,22254408}'::integer[])) AND (date_part('day'::text, msg_date_rec) = 1::double precision))"
" -> Seq Scan on ship_a_pos_messages ship_pos_messages (cost=0.00..0.00 rows=1 width=100) (actual time=0.000..0.000 rows=0 loops=1)"
" Filter: ((msg_id = ANY ('{7294724,14174174,22254408}'::integer[])) AND (date_part('day'::text, msg_date_rec) = 1::double precision))"
" -> Bitmap Heap Scan on ship_b_std_pos_messages ship_pos_messages (cost=13.41..25.42 rows=1 width=128) (actual time=49.127..49.127 rows=0 loops=1)"
" Recheck Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)"
" -> Bitmap Index Scan on ship_b_std_pos_messages_pkey (cost=0.00..13.41 rows=3 width=0) (actual time=49.125..49.125 rows=0 loops=1)"
" Index Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" -> Bitmap Heap Scan on ship_b_ext_pos_messages ship_pos_messages (cost=12.80..24.62 rows=1 width=128) (actual time=0.029..0.029 rows=0 loops=1)"
" Recheck Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)"
" -> Bitmap Index Scan on ship_b_ext_pos_messages_pkey (cost=0.00..12.80 rows=3 width=0) (actual time=0.027..0.027 rows=0 loops=1)"
" Index Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" -> Bitmap Heap Scan on ship_a_pos_messages_wk0 ship_pos_messages (cost=24.08..36.12 rows=1 width=128) (actual time=79.572..114.152 rows=3 loops=1)"
" Recheck Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)"
" -> Bitmap Index Scan on ship_a_pos_messages_wk0_pkey (cost=0.00..24.08 rows=3 width=0) (actual time=67.441..67.441 rows=3 loops=1)"
" Index Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
"Total runtime: 180.146 ms"
I think this is a pretty good plan and quite quick given the size of the table (88Million rows at present). However in real life the parameter where I search for msg_id is not an array of 3 ids but of 300.000 or more. It is then that the query forgets the plan and goes to sequential scan. Is there any way around? Or is this the best I can have?
Kind Regards
Yiannis
select
msg_id,
msg_type,
ship_pos_messages.pos_georef1,
ship_pos_messages.pos_georef2,
ship_pos_messages.pos_georef3,
ship_pos_messages.pos_georef4,
obj_id,
ship_speed,
ship_heading,
ship_course,
pos_point
from
feed_all_y2012m08.ship_pos_messages
where
extract('day' from msg_date_rec) = 1
AND msg_id = any(ARRAY[7294724,14174174,22254408]);
The msg_id is the pkey on the ship_pos_messages table and in this example it is working fast as it uses the pkey (primary key index) to make the selection. The expplain anayze follows:
"Result (cost=0.00..86.16 rows=5 width=117) (actual time=128.734..163.319 rows=3 loops=1)"
" -> Append (cost=0.00..86.16 rows=5 width=117) (actual time=128.732..163.315 rows=3 loops=1)"
" -> Seq Scan on ship_pos_messages (cost=0.00..0.00 rows=1 width=100) (actual time=0.001..0.001 rows=0 loops=1)"
" Filter: ((msg_id = ANY ('{7294724,14174174,22254408}'::integer[])) AND (date_part('day'::text, msg_date_rec) = 1::double precision))"
" -> Seq Scan on ship_a_pos_messages ship_pos_messages (cost=0.00..0.00 rows=1 width=100) (actual time=0.000..0.000 rows=0 loops=1)"
" Filter: ((msg_id = ANY ('{7294724,14174174,22254408}'::integer[])) AND (date_part('day'::text, msg_date_rec) = 1::double precision))"
" -> Bitmap Heap Scan on ship_b_std_pos_messages ship_pos_messages (cost=13.41..25.42 rows=1 width=128) (actual time=49.127..49.127 rows=0 loops=1)"
" Recheck Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)"
" -> Bitmap Index Scan on ship_b_std_pos_messages_pkey (cost=0.00..13.41 rows=3 width=0) (actual time=49.125..49.125 rows=0 loops=1)"
" Index Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" -> Bitmap Heap Scan on ship_b_ext_pos_messages ship_pos_messages (cost=12.80..24.62 rows=1 width=128) (actual time=0.029..0.029 rows=0 loops=1)"
" Recheck Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)"
" -> Bitmap Index Scan on ship_b_ext_pos_messages_pkey (cost=0.00..12.80 rows=3 width=0) (actual time=0.027..0.027 rows=0 loops=1)"
" Index Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" -> Bitmap Heap Scan on ship_a_pos_messages_wk0 ship_pos_messages (cost=24.08..36.12 rows=1 width=128) (actual time=79.572..114.152 rows=3 loops=1)"
" Recheck Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
" Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)"
" -> Bitmap Index Scan on ship_a_pos_messages_wk0_pkey (cost=0.00..24.08 rows=3 width=0) (actual time=67.441..67.441 rows=3 loops=1)"
" Index Cond: (msg_id = ANY ('{7294724,14174174,22254408}'::integer[]))"
"Total runtime: 180.146 ms"
I think this is a pretty good plan and quite quick given the size of the table (88Million rows at present). However in real life the parameter where I search for msg_id is not an array of 3 ids but of 300.000 or more. It is then that the query forgets the plan and goes to sequential scan. Is there any way around? Or is this the best I can have?
Kind Regards
Yiannis
pgsql-performance by date: