Re: making tsearch2 dictionaries - Mailing list pgsql-general
From | Oleg Bartunov |
---|---|
Subject | Re: making tsearch2 dictionaries |
Date | |
Msg-id | Pine.GSO.4.58.0402171337160.3452@ra.sai.msu.su |
In response to | Re: making tsearch2 dictionaries (Ben <bench@silentmedia.com>) |
Responses | Re: making tsearch2 dictionaries |
List | pgsql-general |
On Mon, 16 Feb 2004, Ben wrote:

> So I noticed. ;) The dictionary's working, and I'd be happy to expand
> upon the documentation. Just point me at something to work on.
>

I think you could just write a paper, "How I did a custom dictionary for
tsearch2". From what I've read, your dictionary could be interesting to
people, especially if you describe the motivation and usage.
Do you want '100' and 'hundred' to be fully equivalent? Then if you search
for '100' you will find documents with 'hundred'. Interestingly, you will
also find '123', because '123' will be 'one hundred twenty three'.

> But, like I said, I really want to figure out a way to pipe the output
> of my dictionary through another dictionary. If I can't do that, it
> doesn't seem as useful, because "100" (handled by my dictionary) and
> "one hundred" (handled by en_stem) currently don't generate the same
> ts_vector.

What's the problem? You can configure which dictionaries are used, and in
what order, for a given type of token (pg_ts_cfgmap table).
Aha, I got your problem:

www=# select * from ts_debug('one hundred');
     ts_name     | tok_type | description |  token  | dict_name | tsvector
-----------------+----------+-------------+---------+-----------+----------
 default_russian | lword    | Latin word  | one     | {en_stem} | 'one'
 default_russian | lword    | Latin word  | hundred | {en_stem} | 'hundr'

'hundred' becomes 'hundr'. You could use a synonym dictionary, which is
rather simple (see http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes
for details). Once a word is recognized by the synonym dictionary, it is not
passed on to the next dictionary! This is how tsearch2 works with any
dictionary.

>
> Once I figure out how to tweak the parser to parse things the way I
> want, I can expand upon those docs too. Looks like I'm going to need to
> reach waaaay back into my brain and dust off my flex knowledge for that,
> though....

What do you want from the parser?

> On Mon, 2004-02-16 at 10:33, Oleg Bartunov wrote:
> > btw, Ben, if you get your dictionary working, could you describe the process
> > of developing it so other people will appreciate your work. This part of
> > tsearch2 documentation is very weak.
> >
> > Oleg
> >
> > On Mon, 16 Feb 2004, Teodor Sigaev wrote:
> >
> > > Ben wrote:
> > > > Thanks for the replies. Just to clarify what I was doing, pseudocode
> > > > looked something like:
> > > >
> > > > phrase = palloc(8);
> > > > phrase = "foo\0bar\0";
> > > > res = palloc(3);
> > > > res[0] = phrase[0];
> > > > res[1] = phrase[5];
> > > > res[2] = 0;
> > > >
> > > > That crashed. Once I changed it to:
> > > >
> > > > res = palloc(3);
> > > > res[0] = palloc(4);
> > > > res[0] = "foo\0";
> > > > res[1] = palloc(4);
> > > > res[2] = "bar\0";
> > > > res[3] = 0;
> > > >
> > > > it worked.
> > > >
> > >
> > > :)
> > > I hope you mean:
> > > res = palloc(3);
> > > res[0] = palloc(4);
> > > memcpy(res[0] ,"foo", 4);
> > > res[1] = palloc(4);
> > > memcpy(res[1] ,"bar", 4);
> > > res[2] = 0;
> > >
> > > Look at indexes of res.
> > >
> >
> > Regards,
> > Oleg
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83