Thread: Multicolumn index including tsvector.
Hi everyone,
In the documentation and past messages to this list, it sounds like the btree_gist and btree_gin modules included in contrib/ in 8.4 should give me the functionality I'm looking for, but I don't seem to be getting the behavior I want.
I've made an example table representing something like a simplified version of a web discussion board.
Table "public.example"
Column | Type | Modifiers
---------+----------+-----------
body | text |
vectors | tsvector |
user_id | bigint |
I've got btree_gin and btree_gist installed, so I can make a composite index on vectors and user_id (which is a bigint).
create index index_examples_gin on example using gist (user_id,vectors);
create index index_examples_gist on example using gist (user_id,vectors);
So what I'm expecting here is that it'll be able to use one of those composite indexes to satisfy both the user_id and the vectors constraints. That doesn't seem to be the case, based on this query plan:
explain analyze select body from example where user_id=1 and vectors @@ to_tsquery('simple', 'when') limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1786.87..10510.47 rows=10 width=5) (actual time=203.155..244.987 rows=10 loops=1)
-> Bitmap Heap Scan on example (cost=1786.87..38425.99 rows=42 width=5) (actual time=203.153..244.980 rows=10 loops=1)
Recheck Cond: (vectors @@ '''when'''::tsquery)
Filter: (user_id = 1)
-> Bitmap Index Scan on index_examples_gist (cost=0.00..1786.86 rows=26535 width=0) (actual time=186.711..186.711 rows=27477 loops=1)
Index Cond: (vectors @@ '''when'''::tsquery)
Total runtime: 245.062 ms
(7 rows)
So it seems to be using the index only to satisfy the tsquery part of the where clause, and then applying the user_id filter to the rows it fetches. Ideally, I'd want to see it using both columns as part of the index condition, and not using a filter at all. Is what I'm trying to achieve possible?
Thanks in advance.
Kris
In the documentation and past messages to this list, it sounds like the btree_gist and btree_gin modules included in contrib/ in 8.4 should give me the functionality I'm looking for, but I don't seem to be getting the behavior I want.
I've made an example table representing something like a simplified version of a web discussion board.
Table "public.example"
Column | Type | Modifiers
---------+----------+-----------
body | text |
vectors | tsvector |
user_id | bigint |
I've got btree_gin and btree_gist installed, so I can make a composite index on vectors and user_id (which is a bigint).
create index index_examples_gin on example using gist (user_id,vectors);
create index index_examples_gist on example using gist (user_id,vectors);
So what I'm expecting here is that it'll be able to use one of those composite indexes to satisfy both the user_id and the vectors constraints. That doesn't seem to be the case, based on this query plan:
explain analyze select body from example where user_id=1 and vectors @@ to_tsquery('simple', 'when') limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1786.87..10510.47 rows=10 width=5) (actual time=203.155..244.987 rows=10 loops=1)
-> Bitmap Heap Scan on example (cost=1786.87..38425.99 rows=42 width=5) (actual time=203.153..244.980 rows=10 loops=1)
Recheck Cond: (vectors @@ '''when'''::tsquery)
Filter: (user_id = 1)
-> Bitmap Index Scan on index_examples_gist (cost=0.00..1786.86 rows=26535 width=0) (actual time=186.711..186.711 rows=27477 loops=1)
Index Cond: (vectors @@ '''when'''::tsquery)
Total runtime: 245.062 ms
(7 rows)
So it seems to be using the index only to satisfy the tsquery part of the where clause, and then applying the user_id filter to the rows it fetches. Ideally, I'd want to see it using both columns as part of the index condition, and not using a filter at all. Is what I'm trying to achieve possible?
Thanks in advance.
Kris
Kris Gale <krisgale@gmail.com> writes: > So what I'm expecting here is that it'll be able to use one of those > composite indexes to satisfy both the user_id and the vectors constraints. > That doesn't seem to be the case, based on this query plan: > explain analyze select body from example where user_id=1 and vectors @@ > to_tsquery('simple', 'when') limit 10; Try coercing the '1' to a bigint. I don't believe the btree_gist opclasses have any support for cross-type operators. regards, tom lane
> explain analyze select body from example where user_id=1 and vectors @@
> to_tsquery('simple', 'when') limit 10;Try coercing the '1' to a bigint. I don't believe the btree_gist
opclasses have any support for cross-type operators.
Perfect! Thanks, Tom. The query plan now shows it considering both columns in the index condition.
Kris
On 24 Nov 2009, at 24:08, Kris Gale wrote: > Table "public.example" > Column | Type | Modifiers > ---------+----------+----------- > body | text | > vectors | tsvector | > user_id | bigint | > > I've got btree_gin and btree_gist installed, so I can make a composite index on vectors and user_id (which is a bigint). A bigint for userid? How many billions of users do you expect to get? A bigint is 8 bytes or 64 bits. I think a normal int (32-bits) would be quite sufficient; it allows up to 2 billion users.It uses less storage space for both the data and the indices too. And as Tom already said casting can be a problem with bigints, not only if certain operators aren't defined for comparisonbetween int and bigint, but you'll also see a performance hit if table data you compare to needs to be upcastedto a bigint; joins come to mind. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4b0bc36e11738279827033!