Proposal: Remove "no" from the default english.stop word list

From Peter Marreck
Subject Proposal: Remove "no" from the default english.stop word list
Date
Msg-id CAC3UHA0Vc36CRXSk8k5t7C+9MfEL33rSvC4u=tjMnpJ4iDCx2g@mail.gmail.com
List pgsql-hackers
I recently ran into an issue after implementing full-text search on my site: a user searching real estate listings for "no pets" also got results for "pets OK"! This is obviously a problem. After investigating, I found that the word "no" is treated as a stop word by default (it is in the english.stop word list) and is therefore not indexed, so the query effectively becomes a search for "pets" alone. I propose removing "no" from that list, for the following reasons:

1) The word "yes" is not in this stop word list, which is a bizarre omission if "no" was included for lack of significance (although I would recommend leaving both out of the list, since I would argue that both are significant).
2) The word "no" IS significant as a qualifier (in my case "no pets", or more usefully "no<->pets" when using to_tsquery instead of plainto_tsquery), especially for boolean-like data that is brought into full-text search scope. For example, if some attribute "balcony" is false or not checked, you could index it as "no balcony", which makes both the presence AND the absence of the attribute searchable.
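
To make the effect concrete, here is a small toy model in Python. It is only a sketch of the idea, not how PostgreSQL implements text search: it skips stemming and positions entirely, and the stop word set is a tiny illustrative subset I picked by hand. The names lexemes, matches, search and LISTINGS are all made up for this example.

```python
# Toy model of stop-word filtering. NOT PostgreSQL's implementation:
# no stemming, no positions, and only a tiny hand-picked stop word subset.
STOP_WORDS_DEFAULT = {"no", "the", "a", "and"}

# Hypothetical listing snippets, including the boolean-style "no balcony" idea.
LISTINGS = ["pets OK", "no pets", "balcony", "no balcony"]


def lexemes(text, stop_words):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return {word for word in text.lower().split() if word not in stop_words}


def matches(query, document, stop_words):
    """True if every query term that survives filtering is in the document."""
    query_terms = lexemes(query, stop_words)
    return bool(query_terms) and query_terms <= lexemes(document, stop_words)


def search(query, stop_words):
    """Return the listings that match the query, in their original order."""
    return [doc for doc in LISTINGS if matches(query, doc, stop_words)]


# With "no" as a stop word, the query "no pets" degrades to just "pets".
with_no_as_stop_word = search("no pets", STOP_WORDS_DEFAULT)

# With "no" removed from the stop words, the qualifier is preserved.
without_no_as_stop_word = search("no pets", STOP_WORDS_DEFAULT - {"no"})

print(with_no_as_stop_word)     # ['pets OK', 'no pets']
print(without_no_as_stop_word)  # ['no pets']
```

In this model, dropping "no" from the stop words is also what makes a search for "no balcony" distinguishable from a search for "balcony".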

That's basically it. Thoughts?

-Peter
