Re: ORDER BY, LIMIT and indexes - Mailing list pgsql-performance
From | Mark Kirkwood |
---|---|
Subject | Re: ORDER BY, LIMIT and indexes |
Date | |
Msg-id | 52017F1F.7020700@catalyst.net.nz Whole thread Raw |
In response to | Re: ORDER BY, LIMIT and indexes (Ivan Voras <ivoras@freebsd.org>) |
Responses |
Re: ORDER BY, LIMIT and indexes
|
List | pgsql-performance |
On 06/08/13 22:46, Ivan Voras wrote: > Here are two more unexpected results. Same test table (1 mil. records, > "id" is SERIAL PRIMARY KEY, PostgreSQL 9.1, VACUUM ANALYZE performed > before the experiments): > > ivoras=# explain analyze select * from lt where id > 900000 limit 10; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..1.71 rows=10 width=9) (actual > time=142.669..142.680 rows=10 loops=1) > -> Seq Scan on lt (cost=0.00..17402.00 rows=101630 width=9) > (actual time=142.665..142.672 rows=10 loops=1) > Filter: (id > 900000) > Total runtime: 142.735 ms > (4 rows) > > Note the Seq Scan. > > ivoras=# explain analyze select * from lt where id > 900000; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on lt (cost=1683.97..7856.35 rows=101630 width=9) > (actual time=38.462..85.780 rows=100000 loops=1) > Recheck Cond: (id > 900000) > -> Bitmap Index Scan on lt_pkey (cost=0.00..1658.56 rows=101630 > width=0) (actual time=38.310..38.310 rows=100000 loops=1) > Index Cond: (id > 900000) > Total runtime: 115.674 ms > (5 rows) > > This somewhat explains the above case - we are simply fetching 100,000 > records here, and it's slow enough even with the index scan, so > planner skips the index in the former case. BUT, if it did use the > index, it would have been expectedly fast: > > ivoras=# set enable_seqscan to off; > SET > ivoras=# explain analyze select * from lt where id > 900000 limit 10; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..1.74 rows=10 width=9) (actual time=0.081..0.112 > rows=10 loops=1) > -> Index Scan using lt_pkey on lt (cost=0.00..17644.17 > rows=101630 width=9) (actual time=0.078..0.100 rows=10 loops=1) > Index Cond: (id > 900000) > Total runtime: 0.175 ms > (4 rows) > > It looks like the problem is in the difference between what the > planner expects and what the Filter or Index operations deliver: > (cost=0.00..17402.00 rows=101630 width=9) (actual > time=142.665..142.672 rows=10 loops=1). > > Hmm - I wonder if the lack or ORDER BY is part of the problem here. Consider a similar query on pgbench_accounts: bench=# explain analyze select aid from pgbench_accounts where aid > 100000 limit 20; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..0.91 rows=20 width=4) (actual time=0.005..0.464 rows=20 loops=1) -> Seq Scan on pgbench_accounts (cost=0.00..499187.31 rows=10994846 width=4) (actual time=0.005..0.463 rows=20 loops=1) Filter: (aid > 100000) Total runtime: 0.474 ms (4 rows) bench=# explain analyze select aid from pgbench_accounts where aid > 10000000 limit 20; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018 rows=20 loops=1) -> Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.017 rows=20 loops=1) Index Cond: (aid > 10000000) Total runtime: 0.030 ms (4 rows) So at some point you get index scans. Now add an ORDER BY: bench=# explain analyze select aid from pgbench_accounts where aid > 100000 order by aid limit 20; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- -- Limit (cost=0.00..2.25 rows=20 width=4) (actual time=0.008..0.012 rows=20 loops=1) -> Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..1235355.34 rows=10994846 width=4) (actual time=0.008..0.011 rows=20 loops=1 ) Index Cond: (aid > 100000) Total runtime: 0.023 ms (4 rows) bench=# explain analyze select aid from pgbench_accounts where aid > 10000000 order by aid limit 20; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018 rows=20 loops=1) -> Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.016 rows=20 loops=1) Index Cond: (aid > 10000000) Total runtime: 0.029 ms (4 rows) ...and we have index scans for both cases. Cheers Mark
pgsql-performance by date: