On Thu, Oct 21, 2010 at 8:51 PM, DM <dm.aeqa@gmail.com> wrote:
> Why is the difference in query plan, and the total runtime. Both tables have
> the same btree index
>
>
> test=# explain analyze select * from test_seqindex1 where sid='AA023';
> QUERY
> PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using test_seqindex1_pkey on test_seqindex1 (cost=0.00..8.27
> rows=1 width=28) (actual time=0.017..0.018 rows=1 loops=1)
> Index Cond: ((sid)::text = 'AA023'::text)
> Total runtime: 0.035 ms
> (3 rows)
>
> test=# explain analyze select * from test_seqindex2 where sid='AA023';
> QUERY
> PLAN
>
----------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on test_seqindex2 (cost=4.95..275.53 rows=73 width=30)
> (actual time=57.833..71.577 rows=2 loops=1)
> Recheck Cond: ((sid)::text = 'AA023'::text)
> -> Bitmap Index Scan on idx_test_seqindex2_sid (cost=0.00..4.93 rows=73
> width=0) (actual time=34.374..34.374 rows=2 loops=1)
> Index Cond: ((sid)::text = 'AA023'::text)
> Total runtime: 71.599 ms
> (5 rows)
>
>
> test=# \d test_seqindex1
> Table "public.test_seqindex1"
> Column | Type | Modifiers
> --------+-----------------------+-----------
> sid | character varying(13) | not null
> name | character varying(80) |
> Indexes:
> "test_seqindex1_pkey" PRIMARY KEY, btree (sid)
>
> test=# \d test_seqindex2
> Table "public.test_seqindex2"
> Column | Type | Modifiers
> --------+-----------------------+-----------
> eid | integer | not null
> sid | character varying(13) |
> ename | character varying(80) |
> Indexes:
> "test_seqindex2_pkey" PRIMARY KEY, btree (eid)
> "idx_test_seqindex2_sid" btree (sid)
Index scans are very efficient when they return very few rows. If they
return more (like in this case: estimate is 73 rows), the cost goes up
linearly (here: 73 times slower). An index does not contain the
elements according to their physical location, so it results in a lot
of random I/O. Here the planner thinks it's better to do a bitmap
index scan (first mark all pages that contain results and then read
all these pages sequentially) so to have sequential I/O instead, which
is usually faster. If your query really returns 73 rows, it would be a
good choice.
However, for your case, the estimates are off (73 matches estimated vs
2 matches actual). Therefore the planner needs better statistics on
the actual values in your table.
Look for "SET STATISTICS" here:
http://www.postgresql.org/docs/9.0/static/sql-altertable.html
Kind regards,
Mathieu