Thread: pgsql: Consider index-only scans even when there is no matching qual or
Consider index-only scans even when there is no matching qual or ORDER BY. By popular demand. Branch ------ master Details ------- http://git.postgresql.org/pg/commitdiff/600d3206d1b3f8b540397b79905486a536ac7f78 Modified Files -------------- src/backend/optimizer/path/indxpath.c | 25 ++++++++++++------------- 1 files changed, 12 insertions(+), 13 deletions(-)
Re: pgsql: Consider index-only scans even when there is no matching qual or
From
Bruce Momjian
Date:
Tom Lane wrote: > Consider index-only scans even when there is no matching qual or ORDER BY. > > By popular demand. Is this the COUNT(*) optimization? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > Tom Lane wrote: >> Consider index-only scans even when there is no matching qual or ORDER BY. >> >> By popular demand. > Is this the COUNT(*) optimization? Yeah, among other cases. regards, tom lane
On 11 October 2011 20:11, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Bruce Momjian <bruce@momjian.us> writes: >> Tom Lane wrote: >>> Consider index-only scans even when there is no matching qual or ORDER BY. >>> >>> By popular demand. > >> Is this the COUNT(*) optimization? > > Yeah, among other cases. This is unexpected: test=# explain analyse select count(*) from stuff; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=309724.57..309724.58 rows=1 width=0) (actual time=5622.932..5622.932 rows=1 loops=1) -> Seq Scan on stuff (cost=0.00..263974.46 rows=18300046 width=0) (actual time=0.052..3960.289 rows=18300000 loops=1) Total runtime: 5623.076 ms (3 rows) -- postgres restarted here test=# set random_page_cost = 1.0; SET test=# set seq_page_cost = 5.0; SET test=# explain analyse select count(*) from stuff; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=437191.32..437191.33 rows=1 width=0) (actual time=171652.106..171652.106 rows=1 loops=1) -> Index Only Scan using idx_stuff_thing on stuff (cost=0.00..393933.31 rows=17303202 width=0) (actual time=0.248..169062.893 rows=18300000 loops=1) Total runtime: 171652.179 ms (3 rows) So an index-only scan is 30 times slower in this particular test case. If you're curious, it was set up as so: test=# create table stuff (id serial, thing int); NOTICE: CREATE TABLE will create implicit sequence "stuff_id_seq" for serial column "stuff.id" CREATE TABLE test=# insert into stuff (thing) select ceil(random()*50) from generate_series(1,900000); INSERT 0 900000 test=# insert into stuff (thing) select ceil(random()*350) from generate_series(1,1200000); INSERT 0 1200000 test=# insert into stuff (thing) select ceil(random()*50) from generate_series(1,2200000); INSERT 0 2200000 test=# create index idx_stuff_thing on stuff (thing); CREATE INDEX test=# vacuum analyse; VACUUM test=# insert into stuff (thing) select ceil(random()*50) from generate_series(1,5000000); INSERT 0 5000000 test=# insert into stuff (thing) select ceil(random()*70) from generate_series(1,9000000); INSERT 0 9000000 test=# vacuum analyse; VACUUM If I drop the index used here, and recreate it, I get: test=# explain analyse select count(*) from stuff; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=443955.17..443955.18 rows=1 width=0) (actual time=4920.709..4920.709 rows=1 loops=1) -> Index Only Scan using idx_stuff_thing on stuff (cost=0.00..398205.06 rows=18300046 width=0) (actual time=0.330..3353.140 rows=18300000 loops=1) Total runtime: 4920.846 ms (3 rows) -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Thom Brown <thom@linux.com> writes: > So an index-only scan is 30 times slower in this particular test case. Don't see why you'd find that unexpected. If you have to visit all the rows, a seqscan is usually going to be the best way. An indexscan only has a chance of winning when the index is much smaller than the table, which isn't the case in your example, even if you hadn't seen to it that the index wasn't particularly nicely physically ordered. regards, tom lane
On 11 October 2011 21:45, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thom Brown <thom@linux.com> writes: >> So an index-only scan is 30 times slower in this particular test case. > > Don't see why you'd find that unexpected. If you have to visit all the > rows, a seqscan is usually going to be the best way. An indexscan only > has a chance of winning when the index is much smaller than the table, > which isn't the case in your example, even if you hadn't seen to it that > the index wasn't particularly nicely physically ordered. Ah okay, understood. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company