Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit |
Date | |
Msg-id | 200803051553.m25Frct09843@momjian.us |
Responses |
Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit
|
List | pgsql-patches |
Euler Taveira de Oliveira wrote:
> Edwin Groothuis wrote:
>
> > Ouch. But... since very long words are already not indexed (is the length
> > configurable anywhere because I don't mind setting it to 50 characters), I
> > don't think that it should bomb out of this but print a similar warning like
> > "String only partly indexed".
> >
> This is not a bug. I would say it's a limitation. Look at
> src/include/tsearch/ts_type.h. You could decrease len in WordEntry to 9
> (512 characters) and increase pos to 22 (4 Mb). Don't forget to update
> MAXSTRLEN and MAXSTRPOS accordingly.
>
> > I'm still trying to determine how big the message it failed on was...
> >
> Maybe we should change the "string is too long for tsvector" to "string
> is too long (%ld bytes, max %ld bytes) for tsvector".

Good idea. I have applied the following patch to report in the error
message the string length and maximum, like we already do for long
words:

Old:
    test=> select repeat('a', 3000)::tsvector;
    ERROR:  word is too long (3000 bytes, max 2046 bytes)

New:
    test=> select repeat('a ', 3000000)::tsvector;
    ERROR:  string is too long for tsvector (1048576 bytes, max 1048575 bytes)

I did not backpatch this to 8.3 because it would require translation
string updates.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Index: src/backend/tsearch/to_tsany.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/tsearch/to_tsany.c,v
retrieving revision 1.8
diff -c -c -r1.8 to_tsany.c
*** src/backend/tsearch/to_tsany.c	1 Jan 2008 19:45:52 -0000	1.8
--- src/backend/tsearch/to_tsany.c	5 Mar 2008 15:41:36 -0000
***************
*** 163,169 ****
      if (lenstr > MAXSTRPOS)
          ereport(ERROR,
                  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                  errmsg("string is too long for tsvector")));

      totallen = CALCDATASIZE(prs->curwords, lenstr);
      in = (TSVector) palloc0(totallen);
--- 163,169 ----
      if (lenstr > MAXSTRPOS)
          ereport(ERROR,
                  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                  errmsg("string is too long for tsvector (%d bytes, max %d bytes)", lenstr, MAXSTRPOS)));

      totallen = CALCDATASIZE(prs->curwords, lenstr);
      in = (TSVector) palloc0(totallen);
Index: src/backend/utils/adt/tsvector.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/tsvector.c,v
retrieving revision 1.11
diff -c -c -r1.11 tsvector.c
*** src/backend/utils/adt/tsvector.c	1 Jan 2008 19:45:53 -0000	1.11
--- src/backend/utils/adt/tsvector.c	5 Mar 2008 15:41:36 -0000
***************
*** 224,230 ****
          if (cur - tmpbuf > MAXSTRPOS)
              ereport(ERROR,
                      (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                      errmsg("string is too long for tsvector")));

          /*
           * Enlarge buffers if needed
--- 224,230 ----
          if (cur - tmpbuf > MAXSTRPOS)
              ereport(ERROR,
                      (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                      errmsg("string is too long for tsvector (%d bytes, max %d bytes)", cur - tmpbuf, MAXSTRPOS)));

          /*
           * Enlarge buffers if needed
***************
*** 273,279 ****
      if (buflen > MAXSTRPOS)
          ereport(ERROR,
                  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                  errmsg("string is too long for tsvector")));

      totallen = CALCDATASIZE(len, buflen);
      in = (TSVector) palloc0(totallen);
--- 273,279 ----
      if (buflen > MAXSTRPOS)
          ereport(ERROR,
                  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                  errmsg("string is too long for tsvector (%d bytes, max %d bytes)", buflen, MAXSTRPOS)));

      totallen = CALCDATASIZE(len, buflen);
      in = (TSVector) palloc0(totallen);
Index: src/backend/utils/adt/tsvector_op.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/tsvector_op.c,v
retrieving revision 1.12
diff -c -c -r1.12 tsvector_op.c
*** src/backend/utils/adt/tsvector_op.c	1 Jan 2008 19:45:53 -0000	1.12
--- src/backend/utils/adt/tsvector_op.c	5 Mar 2008 15:41:36 -0000
***************
*** 488,494 ****
      if (dataoff > MAXSTRPOS)
          ereport(ERROR,
                  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                  errmsg("string is too long for tsvector")));

      out->size = ptr - ARRPTR(out);
      SET_VARSIZE(out, CALCDATASIZE(out->size, dataoff));
--- 488,494 ----
      if (dataoff > MAXSTRPOS)
          ereport(ERROR,
                  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
!                  errmsg("string is too long for tsvector (%d bytes, max %d bytes)", dataoff, MAXSTRPOS)));

      out->size = ptr - ARRPTR(out);
      SET_VARSIZE(out, CALCDATASIZE(out->size, dataoff));
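For readers following the arithmetic in the quoted message, here is a minimal standalone sketch of how a 1 Mb limit can fall out of a packed bitfield. It is not the PostgreSQL source. The 11-bit and 20-bit field widths are an assumption inferred from the numbers in the thread (the reported maximum of 1048575 bytes, and the suggestion to trade len down to 9 bits for pos at 22 bits), and every `Sketch`/`sketch_` name is hypothetical.

```c
#include <stdio.h>

/*
 * Field widths assumed from the figures in the thread; they are not
 * copied from src/include/tsearch/ts_type.h.
 */
#define SKETCH_LEN_BITS 11
#define SKETCH_POS_BITS 20

/* Hypothetical stand-in for WordEntry: three fields packed into 32 bits. */
typedef struct
{
    unsigned int haspos:1;
    unsigned int len:SKETCH_LEN_BITS;   /* length of one word */
    unsigned int pos:SKETCH_POS_BITS;   /* offset of the word in the string data */
} SketchWordEntry;

/* Largest values that fit in the len and pos fields. */
#define SKETCH_MAXSTRLEN ((1 << SKETCH_LEN_BITS) - 1)
#define SKETCH_MAXSTRPOS ((1 << SKETCH_POS_BITS) - 1)

/*
 * Returns 0 if lenstr fits in the pos field, -1 otherwise. On failure,
 * buf receives a message in the style of the patched errmsg() call.
 */
static int
sketch_check_strlen(int lenstr, char *buf, size_t buflen)
{
    if (lenstr > SKETCH_MAXSTRPOS)
    {
        snprintf(buf, buflen,
                 "string is too long for tsvector (%d bytes, max %d bytes)",
                 lenstr, SKETCH_MAXSTRPOS);
        return -1;
    }
    if (buflen > 0)
        buf[0] = '\0';
    return 0;
}
```

Under the same assumption, setting the widths to 9 and 22 as suggested in the quoted message would give maxima of 511 and 4194303, which is where the "512 characters" and "4 Mb" figures come from.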