Re: questions about tsearch2 (for czech language) - Mailing list pgsql-general
From | Oleg Bartunov |
---|---|
Subject | Re: questions about tsearch2 (for czech language) |
Date | |
Msg-id | Pine.GSO.4.58.0312221401080.14104@ra.sai.msu.su |
In response to | questions about tsearch2 (for czech language) (Pavel Stehule <stehule@kix.fsv.cvut.cz>) |
Responses |
Re: questions about tsearch2 (for czech language)
|
List | pgsql-general |
On Mon, 22 Dec 2003, Pavel Stehule wrote:

> Hello
>
> I am trying tsearch2 in a Czech environment. It works fine, but I have two
> questions.
>
> 1. I have the words "se" and "ve" in my Czech stop words, but I still get
> these words in the result. Why? Is there a problem with my configuration?

Did you specify the stop words in the dictionary configuration? Check with:

select * from pg_ts_dict;

> tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
>     ts_name    | tok_type | description |  token  |  dict_name  | tsvector
> ---------------+----------+-------------+---------+-------------+-----------
>  default_czech | lword    | Latin word  | jmenuji | {cz_ispell} | 'jmenuji'
>  default_czech | lword    | Latin word  | se      | {cz_ispell} | 'se'
>  default_czech | lword    | Latin word  | Pavel   | {cz_ispell} | 'pavel'
>  default_czech | word     | Word        | Stěhule | {cz_ispell} |
>  default_czech | lword    | Latin word  | a       | {cz_ispell} |
>  default_czech | word     | Word        | bydlím  | {cz_ispell} | 'bydlet'
>  default_czech | lword    | Latin word  | ve      | {cz_ispell} | 've'
>  default_czech | lword    | Latin word  | Skalici | {cz_ispell} | 'skalici'
> (8 řádek)
>
> tsearch2=# select * from pg_ts_cfgmap where ts_name='default_czech';
>     ts_name    |  tok_alias   |  dict_name
> ---------------+--------------+-------------
>  default_czech | email        | {simple}
>  default_czech | file         | {simple}
>  default_czech | float        | {simple}
>  default_czech | host         | {simple}
>  default_czech | hword        | {cz_ispell}
>  default_czech | int          | {simple}
>  default_czech | lhword       | {cz_ispell}
>  default_czech | lpart_hword  | {cz_ispell}
>  default_czech | lword        | {cz_ispell}
>  default_czech | nlhword      | {cz_ispell}
>  default_czech | nlpart_hword | {cz_ispell}
>  default_czech | nlword       | {cz_ispell}
>  default_czech | part_hword   | {simple}
>  default_czech | sfloat       | {simple}
>  default_czech | uint         | {simple}
>  default_czech | uri          | {simple}
>  default_czech | url          | {simple}
>  default_czech | version      | {simple}
>  default_czech | word         | {cz_ispell}
> (19 řádek)
>
> 2. I use a small Czech dictionary. I need to keep words which aren't in
> the dictionary (in my sample, Stěhule) instead of having them erased. Can I
> set this somewhere? I tried adding the simple dictionary to the cfg map,
> but without success.

Example, please! What do you mean by 'erase words'?

> tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
>     ts_name    | tok_type | description |  token  |     dict_name      | tsvector
> ---------------+----------+-------------+---------+--------------------+-----------
>  default_czech | word     | Word        | Stěhule | {cz_ispell,simple} |
>  default_czech | lword    | Latin word  | a       | {cz_ispell,simple} |
>  default_czech | word     | Word        | bydlím  | {cz_ispell,simple} | 'bydlet'
>
> Thank You
> Pavel Stehule

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
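The check Oleg asks for, and the dictionary chain Pavel describes, can be sketched in SQL roughly as follows. This is only an illustration under assumptions: the three file paths are hypothetical placeholders, the exact statements Pavel ran are not shown in the thread, and the sketch assumes the contrib tsearch2 catalog tables pg_ts_dict and pg_ts_cfgmap in their usual layout with an ispell-style dictionary that reads its stop words from a StopFile option.

```sql
-- Sketch only: all file paths below are hypothetical placeholders.

-- 1. Inspect how cz_ispell is initialised and look for a StopFile entry.
SELECT dict_name, dict_initoption
FROM pg_ts_dict
WHERE dict_name = 'cz_ispell';

-- 2. If no StopFile is listed, the dictionary has no stop words to apply.
--    Assumed fix: name the stop word file next to the dictionary files.
UPDATE pg_ts_dict
SET dict_initoption = 'DictFile="/path/to/czech.dict",'
                      'AffFile="/path/to/czech.aff",'
                      'StopFile="/path/to/czech.stop"'
WHERE dict_name = 'cz_ispell';

-- 3. Chain the simple dictionary after cz_ispell for word tokens, which
--    matches the {cz_ispell,simple} mapping in Pavel's second ts_debug output.
UPDATE pg_ts_cfgmap
SET dict_name = '{cz_ispell,simple}'
WHERE ts_name = 'default_czech'
  AND tok_alias IN ('word', 'lword', 'nlword');

-- 4. Re-run the test sentence, preferably from a fresh session in case the
--    configuration is cached by the current one.
SELECT * FROM ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
```

If "se" and "ve" still appear after step 2, or "Stěhule" still yields an empty tsvector after step 3, the output of steps 1 and 4 is the kind of example that would help narrow the problem down on the list.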