Home > mailing lists

Re: 8.4 optimization regression? - Mailing list pgsql-performance

From	Mark Kirkwood
Subject	Re: 8.4 optimization regression?
Date	September 1, 2011 20:55:02
Msg-id	4E601B2F.30107@catalyst.net.nz Whole thread Raw
In response to	Re: 8.4 optimization regression? (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>)
Responses	Re: 8.4 optimization regression?
List	pgsql-performance

Tree view

On 02/09/11 11:18, Mark Kirkwood wrote:
> On 02/09/11 11:13, Tom Lane wrote:
>> I wrote:
>>> Mark Kirkwood<mark.kirkwood@catalyst.net.nz>  writes:
>>>> [ assorted examples showing that commit
>>>> 7f3eba30c9d622d1981b1368f2d79ba0999cdff2 has got problems ]
>>> ...
>>> So, not only are you correct that we should revert the changes to
>>> eqjoinsel_inner, but what's happening in eqjoinsel_semi is wrong too.
>> I've retested these examples with the patches I committed yesterday.
>> Six of the eight examples are estimated pretty nearly dead on, while the
>> other two are estimated about 50% too high (still a lot better than
>> before).  AFAICT there's no easy way to improve those estimates further;
>> eqjoinsel_semi just plain hasn't got enough information to know how many
>> matches there will be.
>>
>>
>
> Just noticed your two commits this morning and ran them through the
> examples too - results look really good! Not only are the plain join
> queries looking way better but that last semi join that was way off is
> now being estimated pretty close. Should be interesting to see how
> much this improves more complex queries!
>
>

While this is still fresh in your mind, a couple of additional anti join
queries are still managing to sneak past estimation:

EXPLAIN ANALYZE SELECT 1 FROM nodekeyword nk WHERE nk.keywordid < 100000
AND NOT EXISTS (SELECT 1 FROM node n WHERE n.nodeid = nk.nodeid AND
n.updated > '2011-01-01'::timestamp );
                                                              QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
  Hash Anti Join  (cost=501666.88..851597.05 rows=1 width=0) (actual
time=29956.971..50933.702 rows=5914 loops=1)
    Hash Cond: (nk.nodeid = n.nodeid)
    ->  Seq Scan on nodekeyword nk  (cost=0.00..297414.03 rows=1991560
width=4) (actual time=13.352..13765.749 rows=1999780 loops=1)
          Filter: (keywordid < 100000)
    ->  Hash  (cost=419643.00..419643.00 rows=4999510 width=4) (actual
time=29345.238..29345.238 rows=4985269 loops=1)
          Buckets: 4096  Batches: 256  Memory Usage: 699kB
          ->  Seq Scan on node n  (cost=0.00..419643.00 rows=4999510
width=4) (actual time=0.010..22731.316 rows=4985269 loops=1)
                Filter: (updated > '2011-01-01 00:00:00'::timestamp
without time zone)

EXPLAIN ANALYZE SELECT 1 FROM nodekeyword nk WHERE nk.keywordid < 10000
AND NOT EXISTS (SELECT 1 FROM node n WHERE n.nodeid = nk.nodeid AND
n.updated > '2011-01-01'::timestamp );
                                                             QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------
  Hash Anti Join  (cost=501666.88..821806.96 rows=1 width=0) (actual
time=46497.231..49196.057 rows=566 loops=1)
    Hash Cond: (nk.nodeid = n.nodeid)
    ->  Seq Scan on nodekeyword nk  (cost=0.00..297414.03 rows=192921
width=4) (actual time=19.916..16250.224 rows=199616 loops=1)
          Filter: (keywordid < 10000)
    ->  Hash  (cost=419643.00..419643.00 rows=4999510 width=4) (actual
time=29901.178..29901.178 rows=4985269 loops=1)
          Buckets: 4096  Batches: 256  Memory Usage: 699kB
          ->  Seq Scan on node n  (cost=0.00..419643.00 rows=4999510
width=4) (actual time=0.008..23207.964 rows=4985269 loops=1)
                Filter: (updated > '2011-01-01 00:00:00'::timestamp
without time zone)

pgsql-performance by date:

From: Stefan Keller
Date: 01 September 2011, 20:26:35
Subject: Re: Summaries on SSD usage?

From: Jesper Krogh
Date: 02 September 2011, 01:15:07
Subject: Re: Summaries on SSD usage?

Re: 8.4 optimization regression? - Mailing list pgsql-performance

Previous

Next