tsearch patch and namespace pollution - Mailing list pgsql-hackers
| From | Tom Lane |
|---|---|
| Subject | tsearch patch and namespace pollution |
| Date | |
| Msg-id | 25419.1187312966@sss.pgh.pa.us |
| Responses | Re: tsearch patch and namespace pollution |
| List | pgsql-hackers |
I find the following additions to pg_proc in the current tsearch2 patch:
                   proc                   | prorettype
------------------------------------------+------------
 pg_ts_parser_is_visible(oid)             | boolean
 pg_ts_dict_is_visible(oid)               | boolean
 pg_ts_template_is_visible(oid)           | boolean
 pg_ts_config_is_visible(oid)             | boolean
 tsvectorin(cstring)                      | tsvector
 tsvectorout(tsvector)                    | cstring
 tsvectorsend(tsvector)                   | bytea
 tsqueryin(cstring)                       | tsquery
 tsqueryout(tsquery)                      | cstring
 tsquerysend(tsquery)                     | bytea
 gtsvectorin(cstring)                     | gtsvector
 gtsvectorout(gtsvector)                  | cstring
 tsvector_lt(tsvector,tsvector)           | boolean
 tsvector_le(tsvector,tsvector)           | boolean
 tsvector_eq(tsvector,tsvector)           | boolean
 tsvector_ne(tsvector,tsvector)           | boolean
 tsvector_ge(tsvector,tsvector)           | boolean
 tsvector_gt(tsvector,tsvector)           | boolean
 tsvector_cmp(tsvector,tsvector)          | integer
 length(tsvector)                         | integer
 strip(tsvector)                          | tsvector
 setweight(tsvector,"char")               | tsvector
 tsvector_concat(tsvector,tsvector)       | tsvector
 vq_exec(tsvector,tsquery)                | boolean
 qv_exec(tsquery,tsvector)                | boolean
 tt_exec(text,text)                       | boolean
 ct_exec(character varying,text)          | boolean
 tq_exec(text,tsquery)                    | boolean
 cq_exec(character varying,tsquery)       | boolean
 tsquery_lt(tsquery,tsquery)              | boolean
 tsquery_le(tsquery,tsquery)              | boolean
 tsquery_eq(tsquery,tsquery)              | boolean
 tsquery_ne(tsquery,tsquery)              | boolean
 tsquery_ge(tsquery,tsquery)              | boolean
 tsquery_gt(tsquery,tsquery)              | boolean
 tsquery_cmp(tsquery,tsquery)             | integer
 tsquery_and(tsquery,tsquery)             | tsquery
 tsquery_or(tsquery,tsquery)              | tsquery
 tsquery_not(tsquery)                     | tsquery
 tsq_mcontains(tsquery,tsquery)           | boolean
 tsq_mcontained(tsquery,tsquery)          | boolean
 numnode(tsquery)                         | integer
 querytree(tsquery)                       | text
 rewrite(tsquery,tsquery,tsquery)         | tsquery
 rewrite(tsquery,text)                    | tsquery
 rewrite_accum(tsquery,tsquery[])         | tsquery
 rewrite_finish(tsquery)                  | tsquery
 rewrite(tsquery[])                       | tsquery
 stat(text)                               | record
 stat(text,text)                          | record
 rank(real[],tsvector,tsquery,integer)    | real
 rank(real[],tsvector,tsquery)            | real
 rank(tsvector,tsquery,integer)           | real
 rank(tsvector,tsquery)                   | real
 rank_cd(real[],tsvector,tsquery,integer) | real
 rank_cd(real[],tsvector,tsquery)         | real
 rank_cd(tsvector,tsquery,integer)        | real
 rank_cd(tsvector,tsquery)                | real
 token_type(oid)                          | record
 token_type(text)                         | record
 parse(oid,text)                          | record
 parse(text,text)                         | record
 lexize(oid,text)                         | text[]
 lexize(text,text)                        | text[]
 headline(oid,text,tsquery,text)          | text
 headline(oid,text,tsquery)               | text
 headline(text,text,tsquery,text)         | text
 headline(text,text,tsquery)              | text
 headline(text,tsquery,text)              | text
 headline(text,tsquery)                   | text
 to_tsvector(oid,text)                    | tsvector
 to_tsvector(text,text)                   | tsvector
 to_tsquery(oid,text)                     | tsquery
 to_tsquery(text,text)                    | tsquery
 plainto_tsquery(oid,text)                | tsquery
 plainto_tsquery(text,text)               | tsquery
 to_tsvector(text)                        | tsvector
 to_tsquery(text)                         | tsquery
 plainto_tsquery(text)                    | tsquery
 tsvector_update_trigger()                | trigger
 get_ts_config_oid(text)                  | oid
 get_current_ts_config()                  | oid
(82 rows)
(This list omits functions with INTERNAL arguments, as those are of
no particular concern to users.)
While most of these are probably OK, I'm disturbed by the prospect
that we are commandeering names as generic as "parse" or "stat"
with argument types as generic as "text". I think we need to put
a "ts_" prefix on some of these. Specifically, I find these names
totally unacceptable without a ts_ prefix:
 stat(text)                               | record
 stat(text,text)                          | record
 token_type(oid)                          | record
 token_type(text)                         | record
 parse(oid,text)                          | record
 parse(text,text)                         | record
 lexize(oid,text)                         | text[]
 lexize(text,text)                        | text[]
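As a rough illustration of what a blanket prefix would do to the four names above, here is a minimal Python sketch. The helper `add_ts_prefix` is hypothetical, and the resulting names are simply what mechanical prefixing produces, not names settled anywhere in this message.

```python
# Names singled out above as unacceptable without a prefix.
GENERIC_NAMES = ["stat", "token_type", "parse", "lexize"]


def add_ts_prefix(name, prefix="ts_"):
    """Prefix a function name unless it already carries the prefix.

    Hypothetical helper for illustration only.
    """
    return name if name.startswith(prefix) else prefix + name


# Map each generic name to its prefixed form.
renamed = {name: add_ts_prefix(name) for name in GENERIC_NAMES}

for old, new in renamed.items():
    print(f"{old} -> {new}")
```

The check for an existing prefix keeps the helper idempotent, so names that already start with `ts_` would pass through unchanged.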
These guys might be all right given that some of their arguments are
tsvector or tsquery, but it's not completely convincing --- think about
the case where an argument is given as an undecorated literal string.
It's also not all that clear that they are related to text searching.
I'm for putting a ts_ prefix on them too:
 rank(real[],tsvector,tsquery,integer)    | real
 rank(real[],tsvector,tsquery)            | real
 rank(tsvector,tsquery,integer)           | real
 rank(tsvector,tsquery)                   | real
 rank_cd(real[],tsvector,tsquery,integer) | real
 rank_cd(real[],tsvector,tsquery)         | real
 rank_cd(tsvector,tsquery,integer)        | real
 rank_cd(tsvector,tsquery)                | real
 rewrite(tsquery,tsquery,tsquery)         | tsquery
 rewrite(tsquery,text)                    | tsquery
 rewrite_accum(tsquery,tsquery[])         | tsquery
 rewrite_finish(tsquery)                  | tsquery
 rewrite(tsquery[])                       | tsquery
 headline(oid,text,tsquery,text)          | text
 headline(oid,text,tsquery)               | text
 headline(text,text,tsquery,text)         | text
 headline(text,text,tsquery)              | text
 headline(text,tsquery,text)              | text
 headline(text,tsquery)                   | text
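The undecorated-literal concern can be made concrete with a toy model in Python. This is a deliberate simplification and not PostgreSQL's actual resolution algorithm, which also considers implicit casts, type-category preferences, and the search path. The `rank(integer,integer)` entry is a hypothetical user function invented for the example; the other signatures are taken from the listing above.

```python
# Toy model only: a bare string literal has no known type yet, so it is
# treated here as compatible with any declared argument type.
UNKNOWN = None

# A few signatures from the listing above.
CATALOG = [
    ("stat", ("text",)),
    ("stat", ("text", "text")),
    ("parse", ("text", "text")),
    ("rank", ("tsvector", "tsquery")),
    ("headline", ("text", "tsquery")),
]


def candidates(catalog, name, arg_types):
    """Return signatures with the same name and arity whose declared types
    agree with every argument whose type is actually known."""
    return [
        (fname, sig)
        for fname, sig in catalog
        if fname == name
        and len(sig) == len(arg_types)
        and all(a is UNKNOWN or a == s for a, s in zip(arg_types, sig))
    ]


# Hypothetical user-defined function that existed before the patch.
user_catalog = CATALOG + [("rank", ("integer", "integer"))]

# Call with two bare literals: both rank() variants remain candidates.
print(candidates(user_catalog, "rank", (UNKNOWN, UNKNOWN)))

# Call with one typed argument: only the tsearch signature remains.
print(candidates(user_catalog, "rank", ("tsvector", UNKNOWN)))
```

In this model the distinctive argument types protect the generic name only while at least one argument has a known type. With a `ts_` prefix the two functions would not compete at all.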
These guys are just plain badly named, as it's completely unobvious that
they have anything to do with tsearch (or what they do at all, actually).
Furthermore the "varchar" variants seem entirely redundant with the
"text" ones:
 vq_exec(tsvector,tsquery)                | boolean
 qv_exec(tsquery,tsvector)                | boolean
 tt_exec(text,text)                       | boolean
 ct_exec(character varying,text)          | boolean
 tq_exec(text,tsquery)                    | boolean
 cq_exec(character varying,tsquery)       | boolean
Comments, suggestions?
regards, tom lane