Re: BUG #11441: Weird (and seems wrong) behavior of partial indexes with order by/limit - Mailing list pgsql-bugs
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: BUG #11441: Weird (and seems wrong) behavior of partial indexes with order by/limit |
Date | |
Msg-id | 20140919.171544.140857218.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: BUG #11441: Weird (and seems wrong) behavior of partial indexes with order by/limit (Maxim Boguk <maxim.boguk@gmail.com>) |
Responses |
Re: BUG #11441: Weird (and seems wrong) behavior of partial
indexes with order by/limit
Re: BUG #11441: Weird (and seems wrong) behavior of partial indexes with order by/limit |
List | pgsql-bugs |
Hello, I think this is a behavior as desinged. An index scans having scalar array operator on a non-first index column are considered as 'not ordered'. This is because scalar array operator on non-first index columns breaks ordering of the result tuples. Btree is the the only index capable of accepting scalar-array operators so far. > =# select amname from pg_am where amsearcharray; > amname > -------- > btree > (1 row) If my understnding is correct, it repeats scanning the index using non-array restrictions for every array element, or every possible combination of elements of multiple scalar arrays, so the index-order generally won't be preserved in the result tuples. The one obvious exception is the case of the scalar-array operation on the first index column. The value in the array is sorted before the iterations mentioned above, so the the planner can determine it to be ordered *only* for this case. The result could be ordered if the all restrictions on all index columns before scalar-array-op column are equal conditions, but the case is judged to be abandoned from the viewpoint of cost and modularitly. Therefore, the planner eliminates the sort for the following example, even though no meaning in itself. create table test as (select g.i as id, (random()*100)::integer as is_finished from generate_series(1,1000000) as g(i)); create index test_2_key on test(is_finished, id) where is_finished = ANY (ARRAY[0, 5]); vacuum analyze test; explain (costs off) select * from test where is_finished IN (0,5) order by is_finished, id limit 1; QUERY PLAN -------------------------------------------------------------- Limit -> Index Only Scan using test_2_key on test Index Cond: (is_finished = ANY ('{0,5}'::integer[])) regards, At Thu, 18 Sep 2014 21:34:07 +1000, Maxim Boguk <maxim.boguk@gmail.com> wrote in <CAK-MWwS2=8iE=BV-vPORd-+PL76HZsgC9PVzydkAUgnXXntkyQ@mail.gmail.com> > Some update now with full reproducible test case (and some surprising > results): > > Test case initialization: > > drop table if exists test; > create table test as (select g.i as id, (random()*100)::integer as > is_finished from generate_series(1,1000000) as g(i)); > create index test_1_key on test(id, is_finished) where is_finished = ANY > (ARRAY[0, 5]); > vacuum analyze test; > > Good (but not expected in that case) plan: > > explain analyze select * from test where is_finished=0 or is_finished=5 > order by id limit 1; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..0.24 rows=1 width=8) (actual time=0.052..0.052 rows=1 > loops=1) > -> Index Only Scan using test_1_key on test (cost=0.00..4493.05 > rows=18921 width=8) (actual time=0.052..0.052 rows=1 loops=1) > Heap Fetches: 1 > Total runtime: 0.066 ms > (i'm very surprised than the PostgreSQL managed deduct is_finished = ANY > (ARRAY[0, 5]) from (is_finished=0 or is_finished=5)) > > > > Bad plan (techically the same query and even better suitable for the > partial index and should have the same plan but no luck): > > explain analyze select * from test where is_finished IN (0,5) order by id > limit 1; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=4809.18..4809.19 rows=1 width=8) (actual time=15.410..15.410 > rows=1 loops=1) > -> Sort (cost=4809.18..4999.44 rows=19026 width=8) (actual > time=15.408..15.408 rows=1 loops=1) > Sort Key: id > Sort Method: top-N heapsort Memory: 25kB > -> Index Only Scan using test_1_key on test (cost=0.00..4428.66 > rows=19026 width=8) (actual time=0.051..12.277 rows=15222 loops=1) > Index Cond: (is_finished = ANY ('{0,5}'::integer[])) > Heap Fetches: 15222 > Total runtime: 15.469 ms -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-bugs by date: