Home > mailing lists

Re: Optimization idea - Mailing list pgsql-performance

From	Vlad Arkhipov
Subject	Re: Optimization idea
Date	April 26, 2010 00:26:27
Msg-id	4BD4FFE7.9090305@dc.baikal.ru Whole thread Raw
In response to	Re: Optimization idea (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Optimization idea
List	pgsql-performance

Tree view

> On Thu, Apr 22, 2010 at 10:37 PM, Vlad Arkhipov <arhipov@dc.baikal.ru> wrote:
>
>> I don't think this is just an issue with statistics, because the same
>> problem arises when I try executing a query like this:
>>
>
> I'm not sure how you think this proves that it isn't a problem with
> statistics, but I think what you should be focusing on here, looking
> back to your original email, is that the plans that are actually much
> faster have almost as much estimated cost as the slower one.  Since
> all your data is probably fully cached, at a first cut, I might try
> setting random_page_cost and seq_page_cost to 0.005 or so, and
> adjusting effective_cache_size to something appropriate.
>
> ...Robert
>
>

Ok. I thougth it's quite obvious because of these two queries. I can't
understand why the estimated rows count is 40040 in the first plan.

test=# explain analyze select * from t2 join t1 on t1.t = t2.t where
t1.t in (2,3,4);
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Hash Join  (cost=1.09..2319.87 rows=40040 width=32) (actual
time=0.050..356.269 rows=400 loops=1)
   Hash Cond: (t2.t = t1.t)
   ->  Seq Scan on t2  (cost=0.00..1543.00 rows=100100 width=16) (actual
time=0.013..176.087 rows=100100 loops=1)
   ->  Hash  (cost=1.07..1.07 rows=2 width=16) (actual time=0.023..0.023
rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  Seq Scan on t1  (cost=0.00..1.07 rows=2 width=16) (actual
time=0.006..0.014 rows=3 loops=1)
               Filter: (t = ANY ('{2,3,4}'::bigint[]))
Total runtime: 356.971 ms
(8 rows)

test=# explain analyze select * from t2 join t1 on t1.t = t2.t where
t1.t = 2 union all select * from t2 join t1 on t1.t = t2.t where t1.t =
3 union all select * from t2 join t1 on t1.t = t2.t where t1.t = 4;
                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
Append  (cost=0.00..112.42 rows=407 width=32) (actual time=0.048..3.487
rows=400 loops=1)
   ->  Nested Loop  (cost=0.00..47.51 rows=197 width=32) (actual
time=0.045..1.061 rows=200 loops=1)
         ->  Seq Scan on t1  (cost=0.00..1.06 rows=1 width=16) (actual
time=0.011..0.014 rows=1 loops=1)
               Filter: (t = 2)
         ->  Index Scan using t_idx on t2  (cost=0.00..44.48 rows=197
width=16) (actual time=0.026..0.382 rows=200 loops=1)
               Index Cond: (pg_temp_2.t2.t = 2)
   ->  Nested Loop  (cost=0.00..32.67 rows=117 width=32) (actual
time=0.019..0.599 rows=100 loops=1)
         ->  Seq Scan on t1  (cost=0.00..1.06 rows=1 width=16) (actual
time=0.003..0.006 rows=1 loops=1)
               Filter: (t = 3)
         ->  Index Scan using t_idx on t2  (cost=0.00..30.43 rows=117
width=16) (actual time=0.010..0.211 rows=100 loops=1)
               Index Cond: (pg_temp_2.t2.t = 3)
   ->  Nested Loop  (cost=0.00..28.17 rows=93 width=32) (actual
time=0.017..0.534 rows=100 loops=1)
         ->  Seq Scan on t1  (cost=0.00..1.06 rows=1 width=16) (actual
time=0.005..0.008 rows=1 loops=1)
               Filter: (t = 4)
         ->  Index Scan using t_idx on t2  (cost=0.00..26.18 rows=93
width=16) (actual time=0.007..0.187 rows=100 loops=1)
               Index Cond: (pg_temp_2.t2.t = 4)
Total runtime: 4.190 ms
(17 rows)

pgsql-performance by date:

From: Merlin Moncure
Date: 24 April 2010, 12:20:26
Subject: Re: Replacing Cursors with Temporary Tables

From: Cédric Villemain
Date: 26 April 2010, 06:34:02
Subject: Re: Optimization idea

Re: Optimization idea - Mailing list pgsql-performance

Previous

Next