New Optimizer Behaviour In 7.0b1 - Mailing list pgsql-sql
From | Mark Kirkwood |
---|---|
Subject | New Optimizer Behaviour In 7.0b1 |
Date | |
Msg-id | 38B77942.8B5B6A85@ihug.co.nz Whole thread Raw |
Responses |
Re: [SQL] New Optimizer Behaviour In 7.0b1
|
List | pgsql-sql |
I have been playing a bit with this new release. There are lots of new possible plans, which is really great. I have been using the query shown below to study optimizer changes. select d0.d0f1, count(f.f1) from dim0 d0, fact1 f where d0.d0key = f.d0key and d0.d0f1 between '1999-11-01' and '1999-12-01' group by d0.d0f1 Table setup is : Table "fact1" size 300000 rowsAttribute | Type | Modifier -----------+---------+----------d0key | integer |d1key | integer |f1 | integer | Index: fact1_q1 on d0key Table "dim0" size 900 rowsAttribute | Type | Modifier -----------+-------------+----------d0key | integer |d0f1 | timestamp |d0f2 | varchar(20) |d0f3 | varchar(20) | Indices: dim0_pk on d0key, dim0_q1 on d0f1 Explain is : Aggregate (cost=12205.78..12372.44 rows=3333 width=20) -> Group (cost=12205.78..12289.11 rows=33333 width=20) -> Sort (cost=12205.78..12205.78 rows=33333 width=20) -> Hash Join (cost=21.75..9371.33 rows=33333 width=20) -> Seq Scan on fact1 f (cost=0.00..4765.00 rows=300000 width=8) -> Hash (cost=21.50..21.50 rows=100 width=12) -> Seq Scanon dim0 d0 (cost=0.00..21.50 rows=100 width=12) Initially this ran fairly slowly : 8-10s , the query scans about 9000 out the 300000 in the big table(fact1). A bit of tweeking with the set variables : ( these are new -see src/backend/commands/variable.c ) set cpu_tuple_cost = '0.6'; set enable_hashjoin = 'off'; set enable_mergejoin = 'off'; gave a new plan : Aggregate (cost=0.00..18476945.83 rows=3333 width=20) -> Group (cost=0.00..18476862.50 rows=33333 width=20) -> Nested Loop (cost=0.00..18476779.16 rows=33333 width=20) -> Index Scan using dim0_q1 on dim0 d0 (cost=0.00..81.98 rows=100 width=12) -> Index Scan using fact1_q1 on fact1 f (cost=0.00..4016.97 rows=1500 width=8) which is devestatiingly fast... about 1 s. Note that the table order is reversed and that the index on the big table ( fact1) is used. However it seems a bit on the brutal side to have to coerce the optimizer this way ( after all hash joins are generally good), is there any way to get a reasonably sensible use of indexes without such desperate measures ? P.s : I realize that this is beta 1..... I am impressed, I have had no problems relinking php4 and subsequently apache for use with this release - seems like a very good quality beta 1. well done guys! Mark (markir@ihug.co.nz,mark.kirkwood@hnz.co.nz ) P.p.s : hopefully this is not going to appear twice on this list, my first send bounced.