Re: Postgres not willing to use an index? - Mailing list pgsql-performance
From | Mario Splivalo |
---|---|
Subject | Re: Postgres not willing to use an index? |
Date | |
Msg-id | 498FFA1D.50003@megafon.hr Whole thread Raw |
In response to | Re: Postgres not willing to use an index? (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Postgres not willing to use an index?
|
List | pgsql-performance |
Robert Haas wrote: > On Fri, Feb 6, 2009 at 12:41 PM, Kevin Grittner > <Kevin.Grittner@wicourts.gov> wrote: >>>>> Robert Haas <robertmhaas@gmail.com> wrote: >>> What's weird about this example is that when he sets enable_seqscan to >>> off, the bitmap index scan plan is actually substantially faster, even >>> though it in fact does scan nearly the entire heap. I don't >>> understand how it can be faster to scan the index and the heap than to >>> just scan the heap. >> It's cached in the second test, maybe? > > I gather that the results were repeatable, but perhaps Mario could > double-check that? I think that it is always cached - the machine has 4GB of RAM, and i'm just using it for testing. Now, I think that the cache is used because there is no I/O wait when I run the queries (if you have any suggestion on how to check cache usage, since I have no idea): jura=# set enable_seqscan to true; SET jura=# explain analyze select * from transactions where transaction_time_commit between '2008-01-01' and '2008-01-31 23:59:59'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on transactions (cost=0.00..418365.68 rows=759775 width=91) (actual time=928.342..3788.232 rows=722176 loops=1) Filter: ((transaction_time_commit >= '2008-01-01 00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <= '2008-01-31 23:59:59+01'::timestamp with time zone)) Total runtime: 3936.744 ms (3 rows) jura=# set enable_seqscan to false; SET jura=# explain analyze select * from transactions where transaction_time_commit between '2008-01-01' and '2008-01-31 23:59:59'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on transactions (cost=428882.89..651630.52 rows=759775 width=91) (actual time=1358.040..1633.867 rows=722176 loops=1) Recheck Cond: ((transaction_time_commit >= '2008-01-01 00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <= '2008-01-31 23:59:59+01'::timestamp with time zone)) -> Bitmap Index Scan on transactions_idx__client_data (cost=0.00..428692.95 rows=759775 width=0) (actual time=1354.485..1354.485 rows=722176 loops=1) Index Cond: ((transaction_time_commit >= '2008-01-01 00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <= '2008-01-31 23:59:59+01'::timestamp with time zone)) Total runtime: 1778.938 ms (5 rows) Now, transactions_idx__client_data index has transaction_time_commit as the last column in index. When I 'recreate' the database, and run the queries again, first run which uses sequential scan is around 10 seconds, heavy I/O, any subsequent query run takes cca 3900 msecs, as shown above. When I say 'disable seqscan', planner uses Bitmap Index Scan, as shown above, just that the first query takes around 25 seconds to run, with heavy I/O. Any subsequent query runs take somewhat less than 2 seconds, as shown above. I'm not sure on what to do to minimize the impact of the OS-cache, apart from taking RAM modules out of the machine - if you have any suggestions I'll try to apply them. On production database I changed the index so that 'transaction_time_commit' is the first column, and now I don't get any sequential scans on transactions table when only WHERE condition is on transaction_time_commit. Mike
pgsql-performance by date: