Home > mailing lists

How to force Nested Loop plan? - Mailing list pgsql-performance

From	Rob Nagler
Subject	How to force Nested Loop plan?
Date	August 30, 2003 10:36:20
Msg-id	16208.41885.960000.12706@gargle.gargle.HOWL Whole thread Raw
Responses	Re: How to force Nested Loop plan?
List	pgsql-performance

Tree view

I'm trying to understand how I can get the planner to always do the
right thing with this query:

    EXPLAIN ANALYZE
    SELECT
    aa_t.min_date_time
    FROM
    aa_t
    , bb_t
    , cc_t
    WHERE bb_t.bb_id = aa_t.bb_id
    AND aa_t.realm_id = cc_t.realm_id
    AND aa_t.server_id = 21
    ORDER BY aa_t.min_date_time desc
    LIMIT 1
    OFFSET 674
    ;

There's a extreme elbow in the performance curve around the sum of
LIMIT and OFFSET.  The two plans follow.  First for the query above:

 Limit  (cost=21569.56..21601.56 rows=1 width=84) (actual time=59.60..59.69 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..110535.66 rows=3454 width=84) (actual time=0.19..59.20 rows=676 loops=1)
         ->  Nested Loop  (cost=0.00..93177.46 rows=3454 width=65) (actual time=0.14..44.41 rows=676 loops=1)
               ->  Index Scan Backward using aa_t20 on aa_t  (cost=0.00..76738.77 rows=3454 width=46) (actual
time=0.10..31.30rows=676 loops=1) 
                     Filter: (server_id = 21::numeric)
               ->  Index Scan using cc_t1 on cc_t  (cost=0.00..4.75 rows=1 width=19) (actual time=0.01..0.01 rows=1
loops=676)
                     Index Cond: ("outer".realm_id = cc_t.realm_id)
         ->  Index Scan using bb_t1 on bb_t  (cost=0.00..5.01 rows=1 width=19) (actual time=0.02..0.02 rows=1
loops=676)
               Index Cond: (bb_t.bb_id = "outer".bb_id)
 Total runtime: 59.89 msec
(10 rows)

Setting OFFSET to 675 in the above query, results in this 100 times
slower plan:

 Limit  (cost=21614.48..21614.48 rows=1 width=84) (actual time=4762.39..4762.39 rows=1 loops=1)
   ->  Sort  (cost=21612.79..21621.42 rows=3454 width=84) (actual time=4761.45..4761.92 rows=677 loops=1)
         Sort Key: aa_t.min_date_time
         ->  Merge Join  (cost=21139.96..21409.80 rows=3454 width=84) (actual time=4399.80..4685.24 rows=41879 loops=1)
               Merge Cond: ("outer".bb_id = "inner".bb_id)
               ->  Sort  (cost=8079.83..8184.53 rows=41879 width=19) (actual time=936.99..967.37 rows=41879 loops=1)
                     Sort Key: bb_t.bb_id
                     ->  Seq Scan on bb_t  (cost=0.00..4864.79 rows=41879 width=19) (actual time=0.06..729.60
rows=41879loops=1) 
               ->  Sort  (cost=13060.13..13068.76 rows=3454 width=65) (actual time=3462.76..3493.97 rows=41879 loops=1)
                     Sort Key: aa_t.bb_id
                     ->  Merge Join  (cost=12794.42..12857.14 rows=3454 width=65) (actual time=2923.62..3202.78
rows=41879loops=1) 
                           Merge Cond: ("outer".realm_id = "inner".realm_id)
                           ->  Sort  (cost=12762.78..12771.41 rows=3454 width=46) (actual time=2920.78..2950.87
rows=41879loops=1) 
                                 Sort Key: aa_t.realm_id
                                 ->  Index Scan using aa_t5 on aa_t  (cost=0.00..12559.79 rows=3454 width=46) (actual
time=0.18..2589.22rows=41879 loops=1) 
                                       Index Cond: (server_id = 21::numeric)
                           ->  Sort  (cost=31.64..32.78 rows=455 width=19) (actual time=2.54..33.12 rows=42163 loops=1)
                                 Sort Key: cc_t.realm_id
                                 ->  Seq Scan on cc_t  (cost=0.00..11.55 rows=455 width=19) (actual time=0.04..0.86
rows=455loops=1) 
 Total runtime: 4792.84 msec
(20 rows)

Twiddling effective_cache_size and random_page_cost allows for a large
LIMIT+OFFSET number but not enough.  These tests are made with 400000
effective_cache_size and random_page_cost of 4.

I can increase the LIMIT+OFFSET elbow to 1654 by changing the
query thusly:

< AND aa_t.server_id = 21
---
> AND aa_t.server_id IN (21, 0)

The value 0 is an invalid server_id, so I know it won't be returned.
However, I've got 41K rows that could be returned by this query and
growing, and 1654 is obviously not enough.  (aa is 690K rows, bb is
41K rows, and cc is 500 rows.)

If I drop the ORDER BY, the query goes much faster, but the query is
useless without the ORDER BY.

I've figured out that the second plan is slow, because it is writing a
huge result set to disk (+200MB).  This doesn't make sense to me,
since sort_mem is 32000.

Is there a way to tell the optimizer to use Nested Loop plan always
instead of the Merge/Join plan?  Turning off enable_mergejoin is
obviously not an option.

Thanks,
Rob

pgsql-performance by date:

From: Richard Jones
Date: 30 August 2003, 10:32:05
Subject: Re: Selecting random rows efficiently

From: Ken Geis
Date: 30 August 2003, 10:45:51
Subject: Re: bad estimates

How to force Nested Loop plan? - Mailing list pgsql-performance

Previous

Next