Re: not using my GIN index in JOIN expression - Mailing list pgsql-performance
From | Heikki Linnakangas |
---|---|
Subject | Re: not using my GIN index in JOIN expression |
Date | |
Msg-id | 530F498B.6010004@vmware.com Whole thread Raw |
In response to | not using my GIN index in JOIN expression (Jean-Max Reymond <jmreymond@free.fr>) |
Responses |
Re: not using my GIN index in JOIN expression
|
List | pgsql-performance |
On 02/27/2014 04:06 PM, Jean-Max Reymond wrote: > I am running the last version of PostgreSQL 9.3.3 > I have two tables detm and corm and a lot of datas in the column > cormdata of corm table (1.4 GB). > > I have a GIN index on cormdata: > CREATE INDEX ix_corm_fulltext_cormdata ON corm > USING gin (to_tsvector('french'::regconfig, cormdata)) > WHERE cormishtml IS FALSE AND length(cormdata) < 20000; > > select distinct b.detmmailid from corm b where > (to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and > b.cormishtml is false and length(b.cormdata) < 20000) > is very fast and use the GIN index. > > "HashAggregate (cost=2027.72..2031.00 rows=328 width=52)" > " -> Bitmap Heap Scan on corm b (cost=24.25..2026.35 rows=548 width=52)" > " Recheck Cond: ((to_tsvector('french'::regconfig, cormdata) @@ > to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE) AND > (length(cormdata) < 20000))" > " -> Bitmap Index Scan on ix_corm_fulltext_cormdata > (cost=0.00..24.11 rows=548 width=0)" > " Index Cond: (to_tsvector('french'::regconfig, cormdata) > @@ to_tsquery('mauritanie'::text))" > > > With a join an another table detm, GIN index is not used > > > explain select distinct a.detmmailid from detm a JOIN corm b on > a.detmmailid = b.detmmailid where ((to_tsvector('french',b.cormdata) @@ > to_tsquery('mauritanie') and b.cormishtml is false and > length(b.cormdata) < 20000) OR ( detmobjet ~* 'mauritanie' )) > > "HashAggregate (cost=172418.27..172423.98 rows=571 width=52)" > " -> Hash Join (cost=28514.92..172416.85 rows=571 width=52)" > " Hash Cond: (b.detmmailid = a.detmmailid)" > " Join Filter: (((to_tsvector('french'::regconfig, b.cormdata) @@ > to_tsquery('mauritanie'::text)) AND (b.cormishtml IS FALSE) AND > (length(b.cormdata) < 20000)) OR (a.detmobjet ~* 'mauritanie'::text))" > " -> Seq Scan on corm b (cost=0.00..44755.07 rows=449507 > width=689)" > " -> Hash (cost=19322.74..19322.74 rows=338574 width=94)" > " -> Seq Scan on detm a (cost=0.00..19322.74 rows=338574 > width=94)" > > > If I remove OR ( detmobjet ~* 'mauritanie' ) in the select, the GIN > index is used > explain select distinct a.detmmailid from detm a JOIN corm b on > a.detmmailid = b.detmmailid where ((to_tsvector('french',b.cormdata) @@ > to_tsquery('mauritanie') and b.cormishtml is false and > length(b.cormdata) < 20000)) > > "HashAggregate (cost=4295.69..4301.17 rows=548 width=52)" > " -> Nested Loop (cost=24.67..4294.32 rows=548 width=52)" > " -> Bitmap Heap Scan on corm b (cost=24.25..2026.35 rows=548 > width=52)" > " Recheck Cond: ((to_tsvector('french'::regconfig, > cormdata) @@ to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE) > AND (length(cormdata) < 20000))" > " -> Bitmap Index Scan on ix_corm_fulltext_cormdata > (cost=0.00..24.11 rows=548 width=0)" > " Index Cond: (to_tsvector('french'::regconfig, > cormdata) @@ to_tsquery('mauritanie'::text))" > " -> Index Only Scan using pkey_detm on detm a (cost=0.42..4.13 > rows=1 width=52)" > " Index Cond: (detmmailid = b.detmmailid)" > > How can i force the use of the GIN index ? > thanks for your tips, The problem with the OR detmobject ~* 'mauritanie' restriction is that the rows that match that condition cannot be found using the GIN index. I think you'd want the system to fetch all the rows that match the other condition using the GIN index, and do something else to find the other rows. The planner should be able to do that if you rewrite the query as a UNION: select a.detmmailid from detm a JOIN corm b on a.detmmailid = b.detmmailid where (to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and b.cormishtml is false and length(b.cormdata) < 20000) union select a.detmmailid from detm a JOIN corm b on a.detmmailid = b.detmmailid where detmobjet ~* 'mauritanie' Note that that will not return rows in 'detm' that have no matching rows in 'corm' table, even if they match the "detmobjet ~* 'mauritanie" condition. That's what your original query also did, but if that's not what you want, leave out the JOIN from the second part of the union. - Heikki
pgsql-performance by date: