Re: Flexible configuration for full-text search - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: Flexible configuration for full-text search
Msg-id 196303f9-b456-bd23-fcd7-f4bfe6119115@sigaev.ru
In response to Re: Flexible configuration for full-text search  (Aleksandr Parfenov <a.parfenov@postgrespro.ru>)
Responses Re: Flexible configuration for full-text search
Re: Flexible configuration for full-text search
List pgsql-hackers
Some notes:

0) The patch conflicts with the latest changes in gram.y; the conflicts are trivial.

1) jsonb in catalog. I'm ok with it, any opinions?

2) In pg_ts_config_map.h, "jsonb       mapdicts" isn't wrapped in #ifdef 
CATALOG_VARLEN like other varlena columns in the catalog. If that's right, 
please explain why and add a comment.

3) I see changes in pg_catalog, including dropping a column, changing its type, 
changing an index, changing a function, etc. Did you pay attention to 
pg_upgrade? I don't see it in the patch.

4) It seems this could work:
ALTER TEXT SEARCH CONFIGURATION russian
   ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                                           word, hword, hword_part
         WITH english_stem union (russian_stem, simple);
                                 ^^^^^^^^^^^^^^^^^^^^^^ simple way instead of
WITH english_stem union (case russian_stem when match then keep else simple end);

5) The initial approach suggested distinguishing three states of a dictionary 
result: null (unknown word), stopword, and usual word. Now there are only two, 
and we have lost the ability to catch stopwords. One way to use stopwords is 
this: suppose we have two identical FTS configurations, except that one skips 
stopwords and the other doesn't. The second configuration is used for indexing, 
and the first one for search by default. But if we can't find anything ('to be 
or not to be' - the phrase contains only stopwords), then we can use the second 
configuration. For now, we need to keep two variants of each dictionary, with 
and without stopwords. But if it's possible to distinguish stop and nonstop 
words in the configuration, then we don't need duplicated dictionaries.
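
To illustrate the duplication described in the last point, here is a rough 
sketch of the two-configuration workaround using only syntax that exists 
today, not the syntax from the patch. The names english_stem_nostop and 
en_index are made up for the example, and the built-in english configuration 
plays the role of the "search" configuration that skips stopwords. Falling 
back when the default query comes out empty is one possible policy; falling 
back when the first search returns no rows would be done in the application.

```sql
-- Hypothetical second variant of the stemmer: same snowball template,
-- but no StopWords parameter, so stopwords are kept as lexemes.
CREATE TEXT SEARCH DICTIONARY english_stem_nostop (
    TEMPLATE = snowball,
    Language = english
);

-- Hypothetical "indexing" configuration: identical to english,
-- except that it uses the dictionary without stopwords.
CREATE TEXT SEARCH CONFIGURATION en_index (COPY = pg_catalog.english);

ALTER TEXT SEARCH CONFIGURATION en_index
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH english_stem_nostop;

-- Search with the default configuration first; if the query consists
-- of stopwords only it comes out empty (numnode() = 0), so fall back
-- to the configuration that keeps stopwords.
WITH q AS (
    SELECT plainto_tsquery('english',  'to be or not to be') AS strict_q,
           plainto_tsquery('en_index', 'to be or not to be') AS full_q
)
SELECT to_tsvector('en_index', 'To be, or not to be: that is the question')
       @@ CASE WHEN numnode(strict_q) > 0 THEN strict_q ELSE full_q END
       AS matches
FROM q;
```

The extra dictionary and configuration exist only to keep stopwords visible, 
which is the duplication that a third "stopword" state in the mapping 
expression would make unnecessary.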


Aleksandr Parfenov wrote:
> On Fri, 30 Mar 2018 14:43:30 +0000
> Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
> 
>> The following review has been posted through the commitfest
>> application: make installcheck-world:  tested, passed
>> Implements feature:       tested, passed
>> Spec compliant:           tested, passed
>> Documentation:            tested, passed
>>
>> LGTM.
>>
>> The new status of this patch is: Ready for Committer
> 
> It seems that after d204ef6 (MERGE SQL Command) in master the patch
> doesn't apply due to a conflict in keywords lists (grammar and header).
> The new version of the patch without conflicts is attached.
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

