Re: On Differing Optimizer Choices (Again) - Mailing list pgsql-sql
From | Oleg Lebedev
---|---
Subject | Re: On Differing Optimizer Choices (Again)
Date |
Msg-id | 3BB90E21.F8E6CE80@waterford.org
Responses | Re: Calling Functions
List | pgsql-sql
Hi, this is a very silly question, but how do I run a PL/pgSQL function from SQL? I created a function test(varchar, varchar) and I am trying to invoke it with the parameters 'hello' and 'world' using an SQL statement. I tried SELECTs and EXECUTEs - nothing works.

Please help,
Oleg

Mark Kirkwood wrote:
> Dear all,
>
> Tom's comments on my previous posting encouraged me to think some more
> about this... and now I believe I have got to the heart of what I was
> attempting to get across before.
>
> I have a fresh and hopefully clearer example.
>
> Ok, let's start with a small table called 'dim0' that has a unique key
> called 'd0key' (10000 unique values). Add to this a large table called
> 'fact2', which has 1000 of these 'd0key' values, with 3000 duplicates of
> each value uniformly distributed throughout it (a total of 3000000 rows).
>
> Consider the query:
>
> SELECT
>     f.d0key,
>     count(f.val)
> FROM fact2 f
> WHERE f.d0key BETWEEN 270 AND 350
> GROUP BY f.d0key
>
> which has the execution plan:
>
> Aggregate (cost=0.00..102500.80 rows=2592 width=8)
>   -> Group (cost=0.00..102436.00 rows=25920 width=8)
>     -> Index Scan using fact2_pk on fact2 f (cost=0.00..102371.20 rows=25920 width=8)
>
> If we use 351 instead of 350 we get a sequential scan.
>
> Now examine a similar query, but with 'dim0' joined in:
>
> SELECT
>     f.d0key,
>     count(f.val)
> FROM dim0 d0,
>     fact2 f
> WHERE d0.d0key = f.d0key
> AND f.d0key BETWEEN 270 AND 350
> GROUP BY f.d0key
>
> This has the plan:
>
> Aggregate (cost=0.00..103127.60 rows=2592 width=12)
>   -> Group (cost=0.00..103062.80 rows=25920 width=12)
>     -> Merge Join (cost=0.00..102998.00 rows=25920 width=12)
>       -> Index Scan using dim0_pk on dim0 d0 (cost=0.00..213.00 rows=10000 width=4)
>       -> Index Scan using fact2_pk on fact2 f (cost=0.00..102371.20 rows=25920 width=8)
>
> No surprises there.
>
> (If we use 351, again we get a sequential scan used instead.)
>
> So far this is all as one would expect.
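[For anyone wanting to reproduce Mark's setup, here is a minimal sketch. Table, column, and index names are taken from the post; the data layout matches his description. Note that generate_series() is an assumption - it only exists in PostgreSQL 8.0 and later, so on the 7.x versions discussed here you would populate the tables with a loop or COPY instead:]

```sql
-- Hypothetical reconstruction of the test schema described above.
CREATE TABLE dim0 (d0key integer PRIMARY KEY);   -- 10000 unique key values
CREATE TABLE fact2 (d0key integer, val integer);

INSERT INTO dim0 SELECT g FROM generate_series(1, 10000) AS g;

-- 1000 distinct d0key values, each repeated 3000 times (3000000 rows total),
-- distributed uniformly through the table as the post describes.
INSERT INTO fact2
SELECT (g % 1000) + 1, g
FROM generate_series(1, 3000000) AS g;

-- The plans above call this index fact2_pk; it is an ordinary index here.
CREATE INDEX fact2_pk ON fact2 (d0key);

ANALYZE dim0;
ANALYZE fact2;
```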
> However, suppose we substitute 'd0.d0key' for 'f.d0key' in the 'AND'
> clause to obtain:
>
> SELECT
>     f.d0key,
>     count(f.val)
> FROM dim0 d0,
>     fact2 f
> WHERE d0.d0key = f.d0key
> AND d0.d0key BETWEEN 270 AND 350
> GROUP BY f.d0key
>
> Suddenly the plan is:
>
> Aggregate (cost=103530.27..104293.15 rows=2624 width=12)
>   -> Group (cost=103530.27..104227.54 rows=26244 width=12)
>     -> Merge Join (cost=103530.27..104161.93 rows=26244 width=12)
>       -> Index Scan using dim0_pk on dim0 d0 (cost=0.00..213.00 rows=10000 width=4)
>       -> Sort (cost=103530.27..103530.27 rows=26244 width=8)
>         -> Seq Scan on fact2 f (cost=0.00..101604.00 rows=26244 width=8)
>
> Now this is interesting - I would have expected an index scan to still be
> used... This behaviour was what I was seeing (in disguised form) in the
> queries of the previous posting.
>
> (In fact, to encourage an index scan, changing 350 down to 313 is required.)
>
> I wondered how 7.1.x behaves when faced with this situation... A build of
> an extra 7.1.3 database reveals that the corresponding plan for this query
> is (note that for 7.1.3 the magic number for the index -> sequential scan
> switch is 369 instead of 350, but bear with me):
>
> Aggregate (cost=0.00..118850.17 rows=2970 width=12)
>   -> Group (cost=0.00..118775.91 rows=29703 width=12)
>     -> Nested Loop (cost=0.00..118701.66 rows=29703 width=12)
>       -> Index Scan using dim0_pk on dim0 d0 (cost=0.00..67.99 rows=99 width=4)
>       -> Index Scan using fact2_pk on fact2 f (cost=0.00..1194.45 rows=300 width=8)
>
> So this version is using an index scan for this query (in fact it will
> keep using one until after d0key=445 - so in some sense the opposite
> behaviour to 7.2dev is being exhibited).
>
> Now the obvious question to ask here is "why are you griping about using
> a seq scan...?". Timing the queries reveals that the index scan is
> considerably faster: specifically 10s against 60s.
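[When comparing the two plans by hand like this, the planner's choice can be overridden per session to time the alternative directly. enable_seqscan is a real run-time setting in these PostgreSQL versions; the query below is the one from the post:]

```sql
-- Discourage sequential scans for this session only, so the index-scan
-- timing of the same query can be measured against the planner's choice.
SET enable_seqscan = off;

EXPLAIN
SELECT f.d0key, count(f.val)
FROM dim0 d0, fact2 f
WHERE d0.d0key = f.d0key
  AND d0.d0key BETWEEN 270 AND 350
GROUP BY f.d0key;

-- Restore the default afterwards.
SET enable_seqscan = on;
```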
> Additionally, 7.1.3 performs the above query in 10s as well - and even
> "out" at the "extreme end", using d0.d0key=445, the elapsed time is just
> 15s.
>
> Why am I pointing this out? Well, I hope that "field testing" the
> optimizer will occasionally provide food for thought!
>
> regards
>
> Mark
>
> P.S.: I am using 7.2dev (2001-08-17) and all parameters are at their
> defaults apart from shared_buffers and sort_mem.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
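[As for the question at the top of the thread: a PL/pgSQL function is invoked from SQL with a plain SELECT. A sketch, assuming a function of Oleg's signature that returns varchar - the body is an assumption, and the quoted-string form (rather than dollar-quoting) is what the 7.x versions in this thread accept:]

```sql
-- A function like the one Oleg describes (the body here is hypothetical):
CREATE FUNCTION test(varchar, varchar) RETURNS varchar AS '
BEGIN
    RETURN $1 || '' '' || $2;
END;
' LANGUAGE 'plpgsql';

-- Invoking it from SQL:
SELECT test('hello', 'world');
```

(EXECUTE is a PL/pgSQL statement for dynamic SQL inside function bodies, which is why it does not work as a top-level call.)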