Home > mailing lists

PATCH: index-only scans with partial indexes - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	PATCH: index-only scans with partial indexes
Date	July 10, 2015 18:29:54
Msg-id	55A00F17.1020608@2ndquadrant.com Whole thread Raw
Responses	Re: PATCH: index-only scans with partial indexes Re: PATCH: index-only scans with partial indexes
List	pgsql-hackers

Tree view

Hi,

currently partial indexes end up not using index only scans in most
cases, because check_index_only() is overly conservative, as explained
in this comment:

  * XXX this is overly conservative for partial indexes, since we will
  * consider attributes involved in the index predicate as required even
  * though the predicate won't need to be checked at runtime. (The same
  * is true for attributes used only in index quals, if we are certain
  * that the index is not lossy.)  However, it would be quite expensive
  * to determine that accurately at this point, so for now we take the
  * easy way out.

In other words, unless you include columns from the index predicate to
the index, the planner will decide index only scans are not possible.
Which is a bit unfortunate, because those columns are not needed at
runtime, and will only increase the index size (and the main benefit of
partial indexes is size reduction).

The attached patch fixes this by only considering clauses that are not
implied by the index predicate. The effect is simple:

     create table t as select i as a, i as b from
                       generate_series(1,10000000) s(i);

     create index tidx_partial on t(b) where a > 1000 and a < 2000;

     vacuum freeze t;
     analyze t;

explain analyze select count(b) from t where a > 1000 and a < 2000;

                                 QUERY PLAN
-----------------------------------------------------------------------
  Aggregate  (cost=39.44..39.45 rows=1 width=4)
             (actual time=8.350..8.354 rows=1 loops=1)
    ->  Index Scan using tidx_partial on t
             (cost=0.28..37.98 rows=585 width=4)
             (actual time=0.034..4.368 rows=999 loops=1)
  Planning time: 0.197 ms
  Execution time: 8.441 ms
(4 rows)

explain analyze select count(b) from t where a > 1000 and a < 2000;

                                 QUERY PLAN
-----------------------------------------------------------------------
  Aggregate  (cost=33.44..33.45 rows=1 width=4)
             (actual time=8.019..8.023 rows=1 loops=1)
    ->  Index Only Scan using tidx_partial on t
             (cost=0.28..31.98 rows=585 width=4)
             (actual time=0.036..4.165 rows=999 loops=1)
          Heap Fetches: 0
  Planning time: 0.188 ms
  Execution time: 8.106 ms
(5 rows)


I've done a bunch of tests, and I do see small (hardly noticeable)
increase in planning time with long list of WHERE clauses, because all
those need to be checked against the index predicate. Not sure if this
is what's meant by 'quite expensive' in the comment. Moreover, this was
more than compensated by the IOS benefits (even with everything in RAM).

But maybe it's possible to fix that somehow? For example, we're
certainly doing those checks elsewhere when deciding which clauses need
to be evaluated at run-time, so maybe we could cache that somehow?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

partial-index-only-scan-v1.patch

pgsql-hackers by date:

From: Mike Blackwell
Date: 10 July 2015, 18:01:23
Subject: Re: pg_upgrade + Ubuntu

From: Smitha Pamujula
Date: 10 July 2015, 18:32:17
Subject: Re: pg_upgrade + Extensions

PATCH: index-only scans with partial indexes - Mailing list pgsql-hackers

Attachment

Previous

Next