Strange planner decision on quite simple select - Mailing list pgsql-performance
From | Markus Wollny
Subject | Strange planner decision on quite simple select
Msg-id | 2266D0630E43BB4290742247C891057508383E8E@dozer.computec.de
Responses | Re: Strange planner decision on quite simple select
List | pgsql-performance
Hello!

I've got a table BOARD_MESSAGE (message_id int8, thread_id int8, ...) with a primary key on message_id and a non-unique not-null index on thread_id. A count(*) on BOARD_MESSAGE currently yields a total of 1231171 rows; the planner estimates a total of 1232530 rows for this table. I've got pg_autovacuum running on the database and run an additional VACUUM ANALYZE over it every night.

I've got a few queries of the following type:

select * from PUBLIC.BOARD_MESSAGE
 where THREAD_ID = 3354253
 order by MESSAGE_ID asc
 limit 20 offset 0;

There are currently roughly 4500 rows with this thread_id in BOARD_MESSAGE. The explain output looks like this:

                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3927.22 rows=20 width=1148)
   ->  Index Scan using pk_board_message on board_message  (cost=0.00..1100800.55 rows=5606 width=1148)
         Filter: (thread_id = 3354253)
(3 rows)

I didn't have the patience to actually complete an explain analyze on that one - I cancelled the query on several attempts after more than 40 minutes of runtime. So I fiddled a little with the statement and tried nudging the planner in the right direction like so:

explain analyze select * from (
    select * from PUBLIC.BOARD_MESSAGE
     where THREAD_ID = 3354253
     order by MESSAGE_ID asc
) as foo limit 20 offset 0;

                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=8083.59..8083.84 rows=20 width=464) (actual time=1497.455..1498.466 rows=20 loops=1)
   ->  Subquery Scan foo  (cost=8083.59..8153.67 rows=5606 width=464) (actual time=1497.447..1498.408 rows=20 loops=1)
         ->  Sort  (cost=8083.59..8097.61 rows=5606 width=1148) (actual time=1497.326..1497.353 rows=20 loops=1)
               Sort Key: message_id
               ->  Index Scan using nidx_bm_thread_id on board_message  (cost=0.00..7734.54 rows=5606 width=1148) (actual time=0.283..1431.752 rows=4215 loops=1)
                     Index Cond: (thread_id = 3354253)
 Total runtime: 1502.138 ms

Now this is much more like it. As far as I interpret the explain output, in the former case the planner walks the whole table in message_id order via the primary-key index and filters for the requested thread_id as it goes, betting that it will hit 20 matching rows early; in the latter case, it retrieves the few thousand rows with the matching thread_id _first_ and _then_ sorts them by message_id. The former plan may have to read through a large part of the 1.2m rows before it finds 20 matches, while the latter just uses the thread_id index to fetch a few thousand rows and sorts those - which is much more efficient here.
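For reference, the relevant part of the schema looks roughly like this, reconstructed from the description above (the full column list is trimmed; the index names are the ones that appear in the plans):

    CREATE TABLE public.board_message (
        message_id int8 NOT NULL,
        thread_id  int8 NOT NULL,
        -- ... many further columns omitted ...
        CONSTRAINT pk_board_message PRIMARY KEY (message_id)
    );

    -- the non-unique index on the not-null thread_id column
    CREATE INDEX nidx_bm_thread_id ON public.board_message (thread_id);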
What's more puzzling is that the results vary somewhat with the overall load situation. Without the subselect, the planner sometimes chooses exactly the same plan as with the second approach, with equally satisfying results in regard to total execution time; sometimes it uses the first plan and still completes within a very acceptable time. But sometimes (when overall load is sufficiently high, I presume) it just runs and runs for minutes on end - I've had this thing running for more than an hour on several occasions, until I made some changes to my app which limit the maximum execution time for a query to no more than 55 seconds.

With this IMHO quite ugly subselect workaround, performance is reproducibly stable and sufficiently good under any load, so I chose to stick with it for the time being - but I'd still like to know whether I could have done anything to make the planner choose the evidently better plan for the first query without such a workaround.

Kind regards

   Markus
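P.S.: One idea I haven't actually tried yet: a combined index on (thread_id, message_id) might let the planner satisfy both the filter and the ORDER BY with a single ordered index scan, so neither the big filtered index walk nor the explicit sort would be needed. An untested sketch (the index name is made up):

    -- untested: within this index, the rows of one thread are already
    -- stored in message_id order, so the first 20 matches come cheaply
    CREATE INDEX nidx_bm_thread_msg
        ON public.board_message (thread_id, message_id);

    -- possibly the ORDER BY has to be spelled to match the index columns
    -- for the planner to recognize the ordering:
    select * from public.board_message
     where thread_id = 3354253
     order by thread_id asc, message_id asc
     limit 20 offset 0;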