Thread: C function to create tsquery not working
I'm still having trouble making this work:

http://pgsql.privatepaste.com/14a6d3075e

    CREATE OR REPLACE FUNCTION tsvector_to_tsquery(
        IN tsv tsvector, op IN char(1),
        weights IN varchar(4), maxpos IN smallint
    ) RETURNS tsquery
    AS 'MODULE_PATHNAME'
    LANGUAGE C STRICT;

What I expect is:

    tsvector_to_tsquery('java tano', '&', 'ABCD', 100)
        -> java & tano
    tsvector_to_tsquery('java:1A,2B tano:3C,4D', '&', 'ABC', 100)
        -> java:A & java:B & tano:C
    tsvector_to_tsquery('java:1A,2B tano:3C,4D', '|', 'ABC', 100)
        -> java:AB | tano:C

I've made some improvements compared to the previous version I posted, but it still returns an empty tsquery.

Things that work:
- tsvector_tsquery_size returns a reasonable total length of strings and total number of (operands + operators)
- curout is actually filled with a lexeme
- the filters (wf, posmax) work

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it
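For readers following along, the weight and position filtering described in the expected results above can be sketched in plain C. This is only an illustration under stated assumptions: PosEntry, weight_bit, weight_mask and surviving_weights are hypothetical names, not part of PostgreSQL or of the posted code, and treating maxpos as an inclusive bound is a guess. Lexemes that carry no positions at all (the 'java tano' case) are not covered here and would need separate handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one tsvector position entry. */
typedef struct {
    unsigned pos;    /* 1-based word position */
    char     weight; /* 'A'..'D' */
} PosEntry;

/* Map a weight letter to a bit: A -> 1, B -> 2, C -> 4, D -> 8, else 0. */
static unsigned weight_bit(char w)
{
    return (w >= 'A' && w <= 'D') ? 1u << (w - 'A') : 0u;
}

/* Turn a weights argument such as "AB" into a bitmask. */
static unsigned weight_mask(const char *weights)
{
    unsigned mask = 0;

    for (; *weights != '\0'; weights++)
        mask |= weight_bit(*weights);
    return mask;
}

/*
 * Return the weights (as a bitmask) of the entries that pass both filters.
 * Assumption: a position passes when pos <= maxpos.
 * A result of 0 means no entry of this lexeme survived.
 */
static unsigned surviving_weights(const PosEntry *entries, size_t n,
                                  unsigned allowed, unsigned maxpos)
{
    unsigned kept = 0;
    size_t   i;

    assert(entries != NULL || n == 0);
    for (i = 0; i < n; i++) {
        unsigned bit = weight_bit(entries[i].weight);

        if ((allowed & bit) != 0 && entries[i].pos <= maxpos)
            kept |= bit;
    }
    return kept;
}
```

With 'java:1A,2B tano:3C,4D' and weights 'ABC', this model keeps A and B for java and only C for tano, which matches the java:AB | tano:C result expected for the '|' case.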
On Thu, 11 Feb 2010 20:11:54 +0100 Ivan Sergio Borgonovo <mail@webthatworks.it> wrote:

> I'm still having trouble making this work:
> http://pgsql.privatepaste.com/14a6d3075e

Finally I got it working, though not the above version...

    CREATE OR REPLACE FUNCTION tsvector_to_tsquery(
        IN tsv tsvector, op IN char(1),
        weights IN varchar(4), maxpos IN smallint
    ) RETURNS tsquery
    AS 'MODULE_PATHNAME'
    LANGUAGE C STRICT;

There were some small errors, but the main one was calling SET_VARSIZE with a pointer to the query instead of the query itself.

I'll need some minor help to polish everything. It is a small piece of work, but someone on the list showed some interest and it may be a nice, simple helper for tsearch.

What would be the right place to advertise it and make it available?

To sum it up, I wrote 2 functions:
1. takes a tsvector and returns it as a setof record (text, int[], int[])
2. takes a tsvector, filters it according to weights and maximum position, and returns a | or & tsquery

The first is just for "debugging", or for building more complicated tsqueries in your preferred language. The second can come in handy when looking for similar text, since it avoids computing tsvectors twice.

    create or replace function similar(_id int, out id int, out title text)
    returns setof record as
    $$
    declare
      tsvin tsvector;
      tsq tsquery;
    begin
      -- "table" stands in for the real table name
      select t.tsv into tsvin from table t where t.id = _id;
      tsq := tsvector_to_tsquery(tsvin, '|', 'AB', 100);
      return query
        select t.id, t.title from table t where t.tsv @@ tsq;
      return;
    end;
    $$ language plpgsql stable;

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it
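The SET_VARSIZE mistake mentioned above is easy to make because a size-setting helper accepts any address. The following is a plain C mock of that situation, not PostgreSQL code: MockQueryData, MockQuery, mock_set_size and mock_query_alloc are hypothetical names, and the length word here is a plain integer rather than PostgreSQL's real varlena encoding. It assumes the query handle is itself a pointer type, which is what makes "pointer to the query" and "the query" differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mock of a variable-length value: length word, then payload. */
typedef struct {
    uint32_t total_len; /* plays the role of the length header */
    int32_t  n_items;   /* number of operands + operators */
} MockQueryData;

typedef MockQueryData *MockQuery; /* the handle is already a pointer */

/* Stand-in for a SET_VARSIZE-style helper: writes len at the given address. */
static void mock_set_size(void *ptr, uint32_t len)
{
    memcpy(ptr, &len, sizeof len);
}

/* Allocate a zeroed mock query and record its total size in the header. */
static MockQuery mock_query_alloc(int32_t n_items, size_t item_size,
                                  size_t string_bytes)
{
    size_t    total;
    MockQuery query;

    assert(n_items >= 0);
    total = sizeof(MockQueryData) + (size_t) n_items * item_size + string_bytes;
    query = calloc(1, total);
    if (query == NULL)
        return NULL;

    /* Correct: pass the query itself, so the header of the value is written. */
    mock_set_size(query, (uint32_t) total);

    /*
     * Wrong (left as a comment on purpose): mock_set_size(&query, ...)
     * also compiles, but it overwrites the local pointer variable and
     * leaves the header of the allocated value at zero.
     */
    query->n_items = n_items;
    return query;
}
```

In this mock, a header left at zero describes a zero-length value, which is at least consistent with a function that appears to return an empty result even though the payload was filled in.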
On Thu, 11 Feb 2010 20:11:54 +0100 Ivan Sergio Borgonovo <mail@webthatworks.it> wrote:

> I'm still having trouble making this work:
>
> http://pgsql.privatepaste.com/14a6d3075e

I tried to play with

    item->operator.left

to see if reshuffling the expression could make any difference:

    item->operator.left = 2 * lexeme - 2 (1 + i)

or

    item->operator.left = lexemes

But the result seems pretty indifferent to what I put in operator.left. That makes me think the error is here.

But I still get these 2 kinds of error:

    ERROR: unrecognized operator type: 50 (first run)

or

    ERROR: stack depth limit exceeded

They show up only at the 3rd returned row, and only for certain queries (see previous email).

It doesn't look as if I palloc'ed too little memory: I tried allocating 3x the memory I estimated and I still get the errors.

The function is actually returning correct results, so it seems the tsquery object is well formed.

But still it looks like infix() is trying to read more operators than I thought I had put in... but only for certain queries, and only at the 3rd row returned.

Should I use something different than palloc? Should I return the query differently? Am I supposed to free something?

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it
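As background for the operator.left experiments above, here is a small stand-alone C model of a prefix-ordered item array in which an operator's right operand is the next item and its left operand sits at the operator's index plus left. That convention is how the comment in PostgreSQL's ts_type.h describes QueryOperator, as far as I can tell; everything else (Item, build_chain, to_infix, the right-to-left fill order) is a hypothetical simplification and not the code from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified item; not PostgreSQL's real QueryItem. */
typedef struct {
    int         is_operator; /* 1 = operator, 0 = operand */
    char        op;          /* '&' or '|' for operators */
    unsigned    left;        /* offset from an operator to its left operand */
    const char *lexeme;      /* text for operands */
} Item;

/* n operands joined by n - 1 binary operators need 2n - 1 items. */
static size_t chain_items(size_t n)
{
    return n == 0 ? 0 : 2 * n - 1;
}

/* Fill out[] so that the infix form reads words[0] op words[1] op ... */
static size_t build_chain(Item *out, const char *const *words, size_t n, char op)
{
    size_t i, pos = 0;

    assert(op == '&' || op == '|');
    if (n == 0)
        return 0;
    for (i = n - 1; i > 0; i--) {
        /* operator: right operand at pos + 1, rest of the chain at pos + 2 */
        out[pos].is_operator = 1;
        out[pos].op = op;
        out[pos].left = 2;
        out[pos].lexeme = NULL;
        pos++;
        out[pos].is_operator = 0;
        out[pos].op = 0;
        out[pos].left = 0;
        out[pos].lexeme = words[i];
        pos++;
    }
    out[pos].is_operator = 0;
    out[pos].op = 0;
    out[pos].left = 0;
    out[pos].lexeme = words[0];
    return pos + 1;
}

/* Append s to the NUL-terminated string in buf, truncating if needed. */
static void append(char *buf, size_t cap, const char *s)
{
    size_t used = strlen(buf);

    if (used + 1 < cap)
        snprintf(buf + used, cap - used, "%s", s);
}

/* Print the subtree rooted at items[idx] in infix form into buf. */
static void to_infix(const Item *items, size_t idx, char *buf, size_t cap)
{
    const Item *it = &items[idx];
    char        opstr[4];

    if (!it->is_operator) {
        append(buf, cap, it->lexeme);
        return;
    }
    to_infix(items, idx + it->left, buf, cap); /* left operand */
    snprintf(opstr, sizeof opstr, " %c ", it->op);
    append(buf, cap, opstr);
    to_infix(items, idx + 1, buf, cap);        /* right operand */
}
```

In this model every operator gets left = 2 because its right operand is always a single item. Note that a reader which instead walks the array strictly in sequence would never consult left at all, and that in this model a left of 0 would make to_infix recurse without end; whether either point applies to the errors reported above is not something this sketch can confirm.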
2010/2/25 Ivan Sergio Borgonovo <mail@webthatworks.it>:

> Should I use something different than palloc? Should I return the
> query differently? Am I supposed to free something?

Have you tried building with the --enable-cassert configure flag?

You may be using memory in the wrong context. You allocate valid memory, but when you leave the function the whole memory context is freed, and that can cause problems.

Regards
Pavel Stehule
On Thu, 25 Feb 2010 11:41:58 +0100 Pavel Stehule <pavel.stehule@gmail.com> wrote:

> Have you tried building with the --enable-cassert configure flag?
>
> You may be using memory in the wrong context. You allocate valid
> memory, but when you leave the function the whole memory context
> is freed, and that can cause problems.

Meanwhile I've run into some new strange behaviour.

I created a table in the same DB containing some tsvectors, to test the function on a better known, easier to control set of data. The tsvectors it contains aren't that different from the ones in the "real" table; there are just fewer of them.

I finally downloaded the whole pg source, compiled it, and compiled my extension inside contrib. I restored the "real" DB and tested on the "real" table... and no problem at all.

That's not really conclusive, since the two setups aren't identical: one was hand-compiled on sid, the other is a stock Debian lenny install.

I'll try to compile the Debian lenny version in a new virtual machine.

Meanwhile, if someone could take a look at the source, it would be much appreciated.

http://www.webthatworks.it/d1/files/ts_utilities.tar.bz2

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it