Re: Planner doesn't take indexes into account - Mailing list pgsql-performance

From Shaun Thomas
Subject Re: Planner doesn't take indexes into account
Date
Msg-id 5385E53A.4000406@optionshouse.com
Whole thread Raw
In response to Re: Planner doesn't take indexes into account  (Grzegorz Olszewski <grzegorz.olszewski@outlook.com>)
Responses Re: Planner doesn't take indexes into account
List pgsql-performance
On 05/28/2014 04:59 AM, Grzegorz Olszewski wrote:

> There is about 500,000 rows and about 500 new rows each business day.
>
> About 96% of rows meet given conditions, that is, count shoud be about
> 480,000.

Heikki is right on this. Indexes are not a magic secret sauce that are
always used simply because they exist. Think of it like this...

If the table really matches about 480,000 rows, by forcing it to use the
index, it has to perform *at least* 480,000 random seeks. Even if you
have a high-performance SSD array that can do 100,000 random reads per
second, you will need about five seconds just to read the data.

A sequence scan can perform that same operation in a fraction of a
second because it's faster to read the entire table and filter out the
*non* matching rows.

Indexes are really only used, or useful, when the number of matches is
much lower than the row count of the table. I highly recommend reading
up on cardinality and selectivity before creating more indexes. This
page in the documentation does a really good job:

http://www.postgresql.org/docs/9.3/static/row-estimation-examples.html

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


pgsql-performance by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Planner doesn't take indexes into account
Next
From: John Melesky
Date:
Subject: Re: NFS, file system cache and shared_buffers