Home > mailing lists

Re: slow bitmap heap scans on pg 9.2 - Mailing list pgsql-performance

From	Steve Singer
Subject	Re: slow bitmap heap scans on pg 9.2
Date	April 13, 2013 16:14:42
Msg-id	51698452.1050202@ca.afilias.info Whole thread Raw
In response to	Re: slow bitmap heap scans on pg 9.2 (Jeff Janes <jeff.janes@gmail.com>)
Responses	Re: slow bitmap heap scans on pg 9.2
List	pgsql-performance

Tree view

On 13-04-12 09:20 PM, Jeff Janes wrote:
> On Thursday, April 11, 2013, Steve Singer wrote:
>
>
>     I think the reason why it is picking the hash join based plans is
>     because of
>
>     Index Scan using table_b_1_ptid_orgid_ym_unq on table_b_1 b
>     (cost=0.00..503.86 rows=1 width=10) (actual time=0.016..0.017 rows=1
>     loops=414249)
>                          Index Cond: ((a.id <http://a.id> = a_id) AND
>     (organization_id = 2) AND (year = 2013) AND (month = 3))
>                          Filter: (product_id = 1)
>
>
>
> Trying to reason about how the planner estimates costs for the inner
> side of nested loops makes my head hurt.
> So before doing that, could you run explain (analyze,buffers) on both of
> these much simpler (but hopefully morally equivalent to this planner
> node) sql:
>
> select * from table_b_1_b where a_id = <some plausible value> and
> organization_id=2 and year=2013 and month=3
>
> select * from table_b_1_b where a_id = <some plausible value> and
> organization_id=2 and year=2013 and month=3 and product_id=1
>

table_b_1 is a partition of table_b  on product_id so when querying
table table_b_1 directly they are equivalent

explain (analyze,buffers) select * FROM table_b_1 where a_id=1128944 and
organization_id=2 and year=2013 and month=3;

QUERY PLAN


-------------------------------------------------------------------------------------------------------------------------------------------
-----------
  Index Scan using table_b_1_ptid_orgid_ym_unq on table_b_1
(cost=0.00..50.73 rows=1 width=56) (actual time=60.328..60.330 rows=
1 loops=1)
    Index Cond: ((a_id = 1128944) AND (organization_id = 2) AND (year =
2013) AND (month = 3))
    Buffers: shared hit=1 read=5
  Total runtime: 60.378 ms
(4 rows)


The plans are the same if I do or do not specify the product_id in the
where clause (if I query the parent table and neglect to query the query
clause it of course queries all the other partitions)




>
> Of particular interest here is whether the estimate of 1 row is due to
> the specificity of the filter, or if the index clauses alone are
> specific enough to drive that estimate.  (If you get many rows without
> the product_id filter, that would explain the high estimate.).

The index clauses alone , we normally expect 1 row back for a query like
that.


>
> Please run with the default cost parameters, or if you can't get the
> right plan with the defaults, specify what the used parameters were.

indexTotalCost += index->pages * spc_random_page_cost / 100000.0;

Is driving my high costs on the inner loop. The index has 2-5 million
pages depending on the partition .   If I run this against 9.2.2 with /
10000.0 the estimate is even higher.

If I try this with this with the

*indexTotalCost += log(1.0 + index->pages / 10000.0) * spc_random_page_cost;

from 9.3 and I play I can make this work I can it pick the plan on some
partitions with product_id=2 but not product_id=1.   If I remove the
fudge-factor cost adjustment line I get the nested-loop plan always.

Breaking the index into smaller partial indexes for each year seems to
be giving me the plans I want with random_page_cost=2 (I might also try
partial indexes on the month).

Even with the 9.3 log based fudge-factor we are seeing the fudge-factor
being big enough so that the planner is picking a table scan over the
index.  A lot of loop iterations can be satisfied by cached pages of the
index the fudge-factor doesn't really account for this.




>
> Cheers,
>
> Jeff

pgsql-performance by date:

From: Julian
Date: 13 April 2013, 15:37:06
Subject: Re: limit is sometimes not pushed in view with order

From: Rodrigo Barboza
Date: 13 April 2013, 17:29:40
Subject: Re: Segment best size

Re: slow bitmap heap scans on pg 9.2 - Mailing list pgsql-performance

Previous

Next