In the following query, We are seeing a sub-optimal plan being chosen. The following results are after running the query several times (after each change).
dev1=# select version(); version ----------------------------------------------------------------------------------------------------------- PostgreSQL 8.2.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)
dev1=# EXPLAIN ANALYZE SELECT SUM (revenue) as revenue FROM statsdaily WHERE ofid = 38 AND date >= '2009-09-01' AND date <= '2999-01-01'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=11796.19..11796.20 rows=1 width=8) (actual time=28.598..28.599 rows=1 loops=1) -> Index Scan using statsdaily_unique_idx on statsdaily (cost=0.00..11783.65 rows=5017 width=8) (actual time=0.043..25.374 rows=3125 loops=1) Index Cond: ((date >= '2009-09-01'::date) AND (date <= '2999-01-01'::date) AND (ofid = 38)) Total runtime: 28.650 ms
dev1=# set enable_indexscan to off;
dev1=# EXPLAIN ANALYZE SELECT SUM (revenue) as revenue FROM statsdaily WHERE ofid = '38' AND date >= '2009-09-01' AND date <= '2999-01-01'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=13153.47..13153.48 rows=1 width=8) (actual time=7.746..7.747 rows=1 loops=1) -> Bitmap Heap Scan on statsdaily (cost=3622.22..13140.92 rows=5017 width=8) (actual time=0.941..4.865 rows=3125 loops=1) Recheck Cond: ((ofid = 38) AND (date >= '2009-09-01'::date)) Filter: (date <= '2999-01-01'::date) -> Bitmap Index Scan on statsdaily_ofid_sept2009_idx (cost=0.00..3620.97 rows=5046 width=0) (actual time=0.551..0.551 rows=3125 loops=1) Index Cond: (ofid = 38) Total runtime: 7.775 ms
default_statistics_target = 100 (tried with 500, no change). Vacuum analyzed before initial query, and after each change to default_statistics_target.
The same query, with a different "ofid", will occasionally get the more optimal plan -- I assume that the distribution of data is the differentiator there.
Is there any other data I can provide to shed some light on this?