Re: TSearch queries with multiple languages - Mailing list pgsql-general

From Oleg Bartunov
Subject Re: TSearch queries with multiple languages
Msg-id Pine.LNX.4.64.0902130948510.1247@sn.sai.msu.ru
In response to Re: TSearch queries with multiple languages  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TSearch queries with multiple languages
List pgsql-general
On Thu, 12 Feb 2009, Tom Lane wrote:

> Gordon Callan <gordon_callan@hotmail.com> writes:
>> Next we create an index on the ts_vector column:
>>  CREATE INDEX node_ts_body on node USING gin(ts_body);
>
>> From the documentation, it seems this index will know what config each row has.
>
> No, actually the index doesn't know and doesn't care.  The tsvector
> representation is language-independent --- it contains "just strings".
> All the language-dependent processing happens during reduction of the
> document text to tsvector (or reduction of a search string to tsquery).
> So if words from different languages happen to reduce to the same
> string, searches in both languages will find that entry.
>
> Usually this works the way people want; but if not, you could add an
> additional WHERE condition to your queries to match only documents in
> the desired language.

contrib/btree_gin, which is under review for 8.4, will allow creating a
composite index such as (ts_config, tsvector), so queries that specify
ts_config (language) will be able to use this index.
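
A rough sketch of how that could look, assuming btree_gin is accepted and
installed. The table layout below is hypothetical: the thread only names
the table "node" and the column "ts_body", so "node_demo", "ts_config"
(stored here as plain text, e.g. 'english') and "body" are made-up names
for illustration.

```sql
-- Assumes PostgreSQL with contrib/btree_gin available.
-- CREATE EXTENSION exists from 9.1 on; on 8.4 the contrib SQL script
-- would have to be loaded instead.
CREATE EXTENSION IF NOT EXISTS btree_gin;

-- Hypothetical table: one text search configuration name per row.
CREATE TABLE node_demo (
    id        serial PRIMARY KEY,
    ts_config text NOT NULL,   -- assumed column, e.g. 'english', 'french'
    body      text,
    ts_body   tsvector
);

-- The language-dependent work happens here, when the document text is
-- reduced to a tsvector using the row's own configuration.
UPDATE node_demo
   SET ts_body = to_tsvector(ts_config::regconfig, body);

-- Composite GIN index over (configuration, tsvector).
CREATE INDEX node_demo_cfg_ts_body
    ON node_demo USING gin (ts_config, ts_body);

-- A query that names the language can use both index columns, and it
-- only returns documents that were indexed with that configuration.
SELECT id
  FROM node_demo
 WHERE ts_config = 'english'
   AND ts_body @@ to_tsquery('english', 'search & term');
```

Without btree_gin, the same WHERE clause still works against the plain
gin(ts_body) index; the ts_config condition is then applied as an
ordinary filter on the rows the tsvector search returns.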


     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
