Re: Optimization idea - Mailing list pgsql-performance

From: Robert Haas
Subject: Re: Optimization idea
Msg-id: q2w603c8f071004271746s43f8669cz45bec5914b1fa9e0@mail.gmail.com
In response to: Re: Optimization idea (Cédric Villemain <cedric.villemain.debian@gmail.com>)

On Mon, Apr 26, 2010 at 5:33 AM, Cédric Villemain
<cedric.villemain.debian@gmail.com> wrote:
> In the first query, the planner doesn't use the information about the
> values 2,3,4. It just says: I'll bet I'll have 2 rows in t1 (I think it
> should say 3, but it doesn't).
> So it divides the estimated number of rows in the t2 table by 5
> (distinct values) and multiplies by 2 (rows): 40040.

I think it's doing something more complicated.  See scalararraysel().

> In the second query the planner behaves differently: it expands the
> value of t1.t to t2.t for each join relation and finds a less costly
> plan (than the one using a seqscan on t2).

I think the problem here is one we've discussed before: if the query
planner knows that something is true of x (like, say, x =
ANY('{2,3,4}')) and it also knows that x = y, it doesn't infer that
the same thing holds of y (i.e. y = ANY('{2,3,4}')) unless the thing
that is known to be true of x is that x is equal to some constant.
Tom doesn't think it would be worth the additional CPU time that it
would take to make these sorts of deductions.  I'm not sure I believe
that, but I haven't tried to write the code, either.
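
To make the missing deduction concrete, here is a toy sketch in Python.
It is not PostgreSQL's planner code and says nothing about how such a
feature would actually be implemented or what it would cost; the
function name implied_predicates and the string representation of
restrictions are hypothetical, chosen only for illustration. It groups
columns that are known to be equal and lets every column in a group
inherit the restrictions known for any other member:

```python
def implied_predicates(equalities, predicates):
    """Toy model: copy restrictions across columns known to be equal.

    equalities: list of (col_a, col_b) pairs, meaning col_a = col_b
    predicates: list of (col, restriction) pairs, restriction is a string
    Returns a dict mapping each column to the set of restrictions that
    hold for it, including the ones deduced through the equalities.
    """
    parent = {}  # minimal union-find over column names

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path halving
            col = parent[col]
        return col

    # Merge the two sides of every equality into one class.
    for a, b in equalities:
        parent[find(a)] = find(b)

    # Gather the restrictions known for any member of each class.
    by_class = {}
    for col, restriction in predicates:
        by_class.setdefault(find(col), set()).add(restriction)

    # Every column inherits all restrictions of its class.
    return {col: set(by_class.get(find(col), set())) for col in list(parent)}


# The case from this thread: t1.t = t2.t and t1.t = ANY('{2,3,4}').
derived = implied_predicates(
    equalities=[("t1.t", "t2.t")],
    predicates=[("t1.t", "= ANY('{2,3,4}')")],
)
print(derived["t2.t"])
```

In this toy model the restriction on t1.t also shows up for t2.t, which
is the inference the planner does not make today unless the restriction
is equality to a constant.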

...Robert
