Thread: GiST indexing question
Hi, I'm attempting to expand an existing postgresql extension and I've run into a wall with the way operator classes should be defined for GiST indices. What I have that works is the following two operators: CREATE OPERATOR <@ ( LEFTARG = mol, RIGHTARG = mol, PROCEDURE = rsubstruct(mol, mol), COMMUTATOR = '@>', RESTRICT = contsel, JOIN = contjoinsel ); CREATE OPERATOR @> ( LEFTARG = mol, RIGHTARG = mol, PROCEDURE = substruct(mol, mol), COMMUTATOR = '<@', RESTRICT = contsel, JOIN = contjoinsel ); combined into an operator class for use in indexing : CREATE OPERATOR CLASS mol_ops DEFAULT FOR TYPE mol USING gist AS OPERATOR 3 @> (mol, mol), OPERATOR 4 <@ (mol, mol), FUNCTION 1 gmol_consistent (bytea, internal, int4), FUNCTION 2 gmol_union (bytea, internal), FUNCTION 3 gmol_compress (internal), FUNCTION 4 gmol_decompress (internal), FUNCTION 5 gmol_penalty (internal, internal, internal), FUNCTION 6 gmol_picksplit (internal, internal), FUNCTION 7 gmol_same (bytea, bytea, internal), STORAGE bytea; I'm now trying to add an equality operator (==) as follows: CREATE OPERATOR == ( LEFTARG = mol, RIGHTARG = mol, PROCEDURE = mol_eq, COMMUTATOR = '=', NEGATOR = '<>', RESTRICT = eqsel, JOIN = eqjoinsel ); and I want to use this to extend the operator class: CREATE OPERATOR CLASS mol_ops DEFAULT FOR TYPE mol USING gist AS OPERATOR 3 @> (mol, mol), OPERATOR 4 <@ (mol, mol), OPERATOR 6 == (mol, mol), FUNCTION 1 gmol_consistent (bytea, internal, int4), FUNCTION 2 gmol_union (bytea, internal), FUNCTION 3 gmol_compress (internal), FUNCTION 4 gmol_decompress (internal), FUNCTION 5 gmol_penalty (internal, internal, internal), FUNCTION 6 gmol_picksplit (internal, internal), FUNCTION 7 gmol_same (bytea, bytea, internal), STORAGE bytea; I made something of a guess as to which strategy I should use. I can now do basic tests with my == operator: moltest=# select 'c1nnccc1'::mol=='c1ccnnc1'::mol; ?column? ---------- t (1 row) moltest=# select 'c1nnccc1'::mol=='c1cnnnc1'::mol; ?column? ---------- f (1 row) (yes, those are both correct). I can create an index using these definitions: moltest=# create index molidx on mols using gist(m); CREATE INDEX And as far as I can tell it looks like the index should be working: moltest=# explain select count(*) from mols where m=='c1ncnnc1'; QUERY PLAN ------------------------------------------------------------------------- Aggregate (cost=8.27..8.28 rows=1 width=0) -> Index Scan using molidx on mols (cost=0.00..8.27 rows=1 width=0) Index Cond: (m == 'c1cnncn1'::mol) (3 rows) But when I actually run a query I get an error: moltest=# select count(*) from mols where m=='c1ncnnc1'; ERROR: Unknown strategy: 6 The index isn't totally screwed up though, because an @> query still works: moltest=# select count(*) from mols where m@>'c1ncnnc1'; count ------- 5 (1 row) I guess I must be doing something stupid, but neither the docs nor the internet have been particularly helpful in letting me know what. These tests are all being done using postgresql 9.0.1 built on an ubuntu 10.10 box. In case it's helpful, the full extension code (without the non-functioning attempts at adding == to the index) is here: http://rdkit.svn.sourceforge.net/viewvc/rdkit/trunk/Code/PgSQL/rdkit/ Thanks in advance for any help, -greg
On 2010-12-11 06:09, Greg Landrum wrote: > I'm now trying to add an equality operator (==) as follows: > > CREATE OPERATOR == ( > LEFTARG = mol, > RIGHTARG = mol, > PROCEDURE = mol_eq, > COMMUTATOR = '=', > NEGATOR = '<>', > RESTRICT = eqsel, > JOIN = eqjoinsel > ); Are you sure the commutator and negator exist and are correct? > and I want to use this to extend the operator class: > > CREATE OPERATOR CLASS mol_ops > DEFAULT FOR TYPE mol USING gist > AS > OPERATOR 3 @> (mol, mol), > OPERATOR 4 <@ (mol, mol), > OPERATOR 6 == (mol, mol), > FUNCTION 1 gmol_consistent (bytea, internal, int4), > FUNCTION 2 gmol_union (bytea, internal), > FUNCTION 3 gmol_compress (internal), > FUNCTION 4 gmol_decompress (internal), > FUNCTION 5 gmol_penalty (internal, internal, internal), > FUNCTION 6 gmol_picksplit (internal, internal), > FUNCTION 7 gmol_same (bytea, bytea, internal), > STORAGE bytea; This seems to be ok. > But when I actually run a query I get an error: > > moltest=# select count(*) from mols where m=='c1ncnnc1'; > ERROR: Unknown strategy: 6 I could not find that error exact message in the code, but something similar gave a hit. $ grep -r 'Unknown strategy' * $ grep -ri 'Unknown strategy' * contrib/intarray/_int_gin.c: elog(ERROR, "ginint4_consistent: unknown strategy number: %d", src/backend/access/gin/ginarrayproc.c: elog(ERROR, "ginarrayconsistent: unknown strategy number: %d", src/backend/access/gist/gistproc.c: elog(ERROR, "unknown strategy number: %d", strategy); src/backend/access/gist/gistproc.c: elog(ERROR, "unknown strategy number: %d", strategy); both the gistproc hits are in built-in consistent functions. This makes me believe that the error might be thrown in the rdkit's mol consistent function. > I guess I must be doing something stupid, but neither the docs nor the > internet have been particularly helpful in letting me know what. Yeah, a hint might help for that error message. But since it doesn't, a quick way for problems like this is to increase log_error_verbosity to verbose, trigger the error again. Then the sourcefile and linenumber are logged as well. If it is then still unclear what's causing the error, you can use that information in gdb to set a breakpoint on that source location. > In case it's helpful, the full extension code (without the > non-functioning attempts at adding == to the index) is here: > http://rdkit.svn.sourceforge.net/viewvc/rdkit/trunk/Code/PgSQL/rdkit/ If you take a look at http://rdkit.svn.sourceforge.net/viewvc/rdkit/trunk/Code/PgSQL/rdkit/rdkit_gist.c?revision=1481&view=markup and then line 651: that's the error being thrown. You need to extend gmol_consistent to handle the strategy 6 number. regards, Yeb Havinga
Dear Yeb, Thanks for the answer. On Sat, Dec 11, 2010 at 3:10 PM, Yeb Havinga <yebhavinga@gmail.com> wrote: > On 2010-12-11 06:09, Greg Landrum wrote: >> I guess I must be doing something stupid, but neither the docs nor the >> internet have been particularly helpful in letting me know what. > > Yeah, a hint might help for that error message. But since it doesn't, a > quick way for problems like this is to increase log_error_verbosity to > verbose, trigger the error again. Then the sourcefile and linenumber are > logged as well. If it is then still unclear what's causing the error, you > can use that information in gdb to set a breakpoint on that source location. >> >> In case it's helpful, the full extension code (without the >> non-functioning attempts at adding == to the index) is here: >> http://rdkit.svn.sourceforge.net/viewvc/rdkit/trunk/Code/PgSQL/rdkit/ > > If you take a look at > http://rdkit.svn.sourceforge.net/viewvc/rdkit/trunk/Code/PgSQL/rdkit/rdkit_gist.c?revision=1481&view=markup > and then line 651: that's the error being thrown. You need to extend > gmol_consistent to handle the strategy 6 number. Doh! That was it. No big wonder I couldn't find anything helpful on the 'net. Now I'm really glad I included the link to the code. Thanks again for both the answer and for providing a description of how to track down future problems. Best Regards, -greg