Re: Bad Planner Statistics for Uneven distribution. - Mailing list pgsql-performance

From Guillaume Smet
Subject Re: Bad Planner Statistics for Uneven distribution.
Date
Msg-id 1d4e0c10607211500t62be09f2v8284ceb5176b01bc@mail.gmail.com
Whole thread Raw
In response to Re: Bad Planner Statistics for Uneven distribution.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bad Planner Statistics for Uneven distribution.
List pgsql-performance
Tom,

On 7/21/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It's really not possible for a full-table indexscan to be faster than a
> seqscan, and not very credible for it even to be approximately as fast.
> I suspect your second query here is the beneficiary of the first query
> having fetched all the pages into cache.  In general, if you want to
> optimize for a mostly-cached database, you need to reduce
> random_page_cost below its default value ...

We discussed this case on IRC and the problem was not the first set of
queries but the second one:
select brand_id from brands where exists (select 1 from models_brands
where brand = brands.brand_id);).

Isn't there any way to make PostgreSQL have a better estimation here:
->  Index Scan using models_brands_brand on models_brands
(cost=0.00..216410.97 rows=92372 width=0) (actual time=0.008..0.008
rows=0 loops=303)
           Index Cond: (brand = $0)

I suppose it's because the planner estimates that there will be 92372
result rows that it chooses the seqscan instead of the index scan.
ALTER STATISTICS didn't change anything.
IIRC, there were already a few threads about the same sort of
estimation problem and there wasn't any solution to solve this
problem. Do you have any hint/ideas?

--
Guillaume

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bad Planner Statistics for Uneven distribution.
Next
From: Robert Lor
Date:
Subject: Re: Sun Donated a Sun Fire T2000 to the PostgreSQL