Use of "token" vs "lexeme" in text search documentation - Mailing list pgsql-docs

From	Tom Lane
Subject	Use of "token" vs "lexeme" in text search documentation
Date	October 15, 2007 16:22:22
Msg-id	20167.1192476132@sss.pgh.pa.us
Responses	Re: Use of "token" vs "lexeme" in text search documentation
List	pgsql-docs

The current documentation is somewhat inconsistent in its use of the
terms "token" and "lexeme".  Most of the text uses "lexeme"
exclusively, even though the term "token" is exposed by
ts_token_type() and friends.  But a few places seem to use "lexeme"
specifically to mean something returned by a dictionary.

I was considering trying to adopt these conventions:

* What a parser returns is a "token".

* When a dictionary recognizes a token, what it returns is a "lexeme".

This would make the phrase "normalized lexeme" redundant, since we
don't call it a lexeme at all unless it's been normalized.
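
To make the proposed distinction concrete, here is a minimal sketch,
assuming a server with the built-in text search functions (PostgreSQL
8.3 or later), the built-in "default" parser, and the "english_stem"
dictionary.  The sample strings are only illustrative.

```sql
-- What the parser returns: tokens, each tagged with a token type id.
SELECT * FROM ts_parse('default', 'The stars');

-- The token types that the "default" parser can report.
SELECT * FROM ts_token_type('default');

-- What a dictionary returns when it recognizes a token: lexemes.
-- With the stock english_stem dictionary this should yield {star}.
SELECT ts_lexize('english_stem', 'stars');
```

Under the proposed convention, the rows from ts_parse() are tokens,
and only the array elements returned by ts_lexize() are lexemes.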

Comments?

            regards, tom lane
