Re: Use unique index for longer pathkeys. - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: Use unique index for longer pathkeys. |
Date | |
Msg-id | 20140728.184757.261051744.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Use unique index for longer pathkeys. (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Use unique index for longer pathkeys.
|
List | pgsql-hackers |
Hello, > > > I think there is one more disadvantage in the way current patch is > > > done which is that you need to collect index path keys for all relations > > > irrespective of whether they will be of any use to eliminate useless > > > pathkeys from query_pathkeys. One trivial case that comes to mind is > > > when there are multiple relations involved in query and ORDER BY is > > > base on columns of only part of the tables involved in query. > > > > Like this? > > > > select x.a, x.b, y.b from x, y where x.a = y.a order by x.a, x.b; > > > > Equivalent class consists of (x.a=y.a) and (x.b), so index > > pathkeys for i_y is (y.a.=x.a). As a result, no common primary > > pathkeys found. > > I think it will find common pathkey incase you have an unique index > on x.a (please see the example below), but currently I am not clear > why there is a need for a common index path key in such a case to > eliminate useless keys in ORDER BY, why can't we do it based > on individual table's path key. > > Example: > > create table t (a int not null, b int not null, c int, d text); > create unique index i_t_pkey on t(a, b); > insert into t (select a % 10, a / 10, a, 't' from generate_series(0, > 100000) a); > analyze; > > create table t1 (a int not null, b int not null, c int, d text); > create unique index i_t1_pkey_1 on t1(a); > create unique index i_t1_pkey_2 on t1(a, b); > insert into t1 (select a * 2, a / 10, a, 't' from generate_series(0, > 100000) a); > explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by > t1.a,t1.b,t1.c,t1.d; > > QUERY PLAN > ------------------------------------------ > Merge Join > Merge Cond: (t.a = t1.a) > -> Index Scan using i_t_pkey on t > -> Index Scan using i_t1_pkey_1 on t1 > (4 rows) > > Here we can notice that there is no separate sort key in plan. Sure, > Now drop the i_t1_pkey_1 and check the query plan again. > > drop index i_t1_pkey_1; > explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by > t1.a,t1.b,t1.c,t1.d; > QUERY PLAN > ------------------------------------------------ > Sort > Sort Key: t.a, t1.b, t1.c, t1.d > -> Merge Join > Merge Cond: (t.a = t1.a) > -> Index Scan using i_t_pkey on t > -> Index Scan using i_t1_pkey_2 on t1 > (6 rows) > > Can't above plan eliminate Sort Key even after dropping index > (i_t1_pkey_1)? My patch doesn't so since there no longer a 'common primary pathkeys' in this query. Perhaps the query doesn't allow the sort eliminated. Since a is no more a pkey, t1 can have dulicate rows for the same a, so the joined relation also may have duplicte values in the column a. Therefore the joined relation may be half sorted only by the column a so the sort pathkeys cannot be trimmed. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: