Re: Weird indices - Mailing list pgsql-general

From Tom Lane
Subject Re: Weird indices
Date
Msg-id 10485.982633722@sss.pgh.pa.us
Whole thread Raw
In response to Re: Weird indices  (Joseph Shraibman <jks@selectacast.net>)
List pgsql-general
Joseph Shraibman <jks@selectacast.net> writes:
> Stephan Szabo wrote:
>> Do you have a value that is not null that is very common?
>> It's estimating that there will be 10113 rows that match
>> nomsession='xxx' which makes a seq scan a much less bad plan.
>>
> Err, why?  There is an index, isn't there?  Shouldn't the index allow
> postgres to quickly find the %2 of rows that would match?

Define "quickly".

> sitefr=# explain select nomsession from session where nomsession='xxx';
> NOTICE:  QUERY PLAN:
>
> Seq Scan on session  (cost=0.00..16275.95 rows=10113 width=12)

We have here an estimate that 10113 rows will be matched (out of the
510069 in the table).  The table contains something on the order of
16000 pages (guesstimate from the seqscan cost estimate).  The planner
is assuming that the 10113 rows are randomly scattered in the table,
and therefore that the executor will have to fetch the majority of the
pages in the table.  Under these circumstances a seqscan is cheaper
than an indexscan, because it works with the Unix kernel's preference
for sequential reads (to say nothing of the disk drive's ;-)), instead
of fighting that optimization.  Random fetches are more than twice as
expensive as sequential fetches.

Of course, if the 10113-match estimate is wildly off (as it was in this
case), then the wrong plan may be chosen.  But it IS NOT CORRECT to
suppose that indexscans always beat seqscans.  The planner's job would
be a lot easier if that were true.

            regards, tom lane

pgsql-general by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: Weird indices
Next
From: Joseph Shraibman
Date:
Subject: Re: Weird indices