Re: Using a german affix file for compound words - Mailing list pgsql-general
From | Artur Zakirov |
---|---|
Subject | Re: Using a german affix file for compound words |
Date | |
Msg-id | 56AB2F20.6030604@postgrespro.ru |
In response to | Re: Using a german affix file for compound words (Wolfgang Winkler <wolfgang.winkler@digital-concepts.com>) |
Responses | Re: Using a german affix file for compound words |
List | pgsql-general |
On 28.01.2016 20:36, Wolfgang Winkler wrote:
> I'm using 9.4.5 as well and I used exactly the same iconv lines as you
> posted below.
>
> Are there any encoding options that have to be set right? The database
> encoding is set to UTF8.
>
> ww

What output does this command show:

-> SHOW LC_CTYPE;

?

Did you try a dictionary from
http://extensions.openoffice.org/en/project/german-de-de-frami-dictionaries
?

You need to extract the de_DE_frami.aff and de_DE_frami.dic files from the
downloaded archive, rename them and convert them to UTF-8.

>
> On 2016-01-28 at 17:34, Artur Zakirov wrote:
>> On 28.01.2016 18:57, Oleg Bartunov wrote:
>>>
>>>
>>> On Thu, Jan 28, 2016 at 6:04 PM, Wolfgang Winkler
>>> <wolfgang.winkler@digital-concepts.com> wrote:
>>>
>>>     Hi!
>>>
>>>     We have a problem with importing a compound dictionary file for
>>>     German.
>>>
>>>     I downloaded the files here:
>>>
>>> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
>>>
>>>     and converted them to UTF-8 with iconv. The affix file seems OK when
>>>     opened with an editor.
>>>
>>>     When I try to create or alter a dictionary to use this affix file, I
>>>     get the following error:
>>>
>>>     alter TEXT SEARCH DICTIONARY german_ispell (
>>>        DictFile = german,
>>>        AffFile = german,
>>>        StopWords = german
>>>     );
>>>     ERROR:  syntax error
>>>     CONTEXT:  line 224 of configuration file
>>>     "/usr/local/pgsql/share/tsearch_data/german.affix": "   ABE >
>>>     -ABE,äBIN
>>>     "
>>>
>>>     This is the first occurrence of an umlaut character in the file.
>>>     I've found a few postings where the same file is used, e.g.:
>>>
>>> http://www.postgresql.org/message-id/flat/556C1411.4010608@tbz-pariv.de#556C1411.4010608@tbz-pariv.de
>>>
>>>     This user has been able to import the file. Am I missing something
>>>     obvious?
>>>
>>
>> What version of PostgreSQL do you use?
>>
>> I tested this dictionary on PostgreSQL 9.4.5. I downloaded the files
>> from the link and executed these commands:
>>
>> iconv -f ISO-8859-1 -t UTF-8 german.aff -o german2.affix
>> iconv -f ISO-8859-1 -t UTF-8 german.dict -o german2.dict
>>
>> I renamed them to german.affix and german.dict and moved them to the
>> tsearch_data directory. I then executed these commands without errors:
>>
>> -> create text search dictionary german_ispell (
>>     Template = ispell,
>>     DictFile = german,
>>     AffFile = german,
>>     Stopwords = german
>> );
>> DROP TEXT SEARCH DICTIONARY
>>
>> -> select ts_lexize('german_ispell', 'test');
>>  ts_lexize
>> -----------
>>  {test}
>> (1 row)
>>
>
>
> --
>
> *Wolfgang Winkler*
> Geschäftsführung
> wolfgang.winkler@digital-concepts.com
> mobil +43.699.19971172
>
> dc:*büro*
> digital concepts Novak Winkler OG
> Software & Design
> Landstraße 68, 5. Stock, 4020 Linz
> www.digital-concepts.com <http://www.digital-concepts.com>
> tel +43.732.997117.72
> tel +43.699.1997117.2
>
> Firmenbuchnummer: 192003h
> Firmenbuchgericht: Landesgericht Linz
>
>


-- 
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
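
The extract, rename and convert steps suggested above for the OpenOffice dictionary could be sketched roughly as follows. This is only an illustration: the function name convert_dict and its argument order are made up here, and it assumes the source files are ISO-8859-1 encoded, as the iconv commands quoted in this thread do for the ispell files. Check the actual encoding of the files you downloaded before relying on it.

```shell
#!/bin/sh
# Sketch only: convert an .aff/.dic pair to UTF-8 and write the results with
# the .affix/.dict extensions used in the tsearch_data directory.
# Assumption: the source files are ISO-8859-1 encoded.
# convert_dict is a hypothetical helper name, not a PostgreSQL tool.

convert_dict() {
    src_base=$1    # source path without extension, e.g. ./de_DE_frami
    dest_dir=$2    # target directory, e.g. /usr/local/pgsql/share/tsearch_data
    dest_name=$3   # dictionary base name, e.g. german

    # Shell redirection is used instead of "iconv -o" because -o is not
    # available in every iconv implementation.
    iconv -f ISO-8859-1 -t UTF-8 "$src_base.aff" > "$dest_dir/$dest_name.affix" || return 1
    iconv -f ISO-8859-1 -t UTF-8 "$src_base.dic" > "$dest_dir/$dest_name.dict" || return 1
}
```

A call might look like `convert_dict ./de_DE_frami /usr/local/pgsql/share/tsearch_data german`, after which DictFile = german and AffFile = german would refer to the converted files. On other installations the tsearch_data directory sits under the path printed by `pg_config --sharedir`.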