Re: PATCH: CITEXT 2.0 - Mailing list pgsql-hackers
From | David E. Wheeler |
---|---|
Subject | Re: PATCH: CITEXT 2.0 |
Date | |
Msg-id | 55AA0795-AA11-4D44-B7F6-90B3A933365F@kineticode.com Whole thread Raw |
In response to | Re: PATCH: CITEXT 2.0 (Zdenek Kotala <Zdenek.Kotala@Sun.COM>) |
Responses |
Re: PATCH: CITEXT 2.0
Re: PATCH: CITEXT 2.0 |
List | pgsql-hackers |
On Jul 2, 2008, at 09:13, Zdenek Kotala wrote: > I went through your code and I have following comments/questions: Thanks. I'm on a short family vacation, so it will probably be Monday before I can submit a new patch. I got a few replies below, though. > 1) formating > > Please, do not type space before asterix: > char * lcstr, * rcstr -> char *lcstr, *rcstr > > and do not put extra space in a brackets > citextcmp( left, right ) -> citextcmp(left, right) Okay. > Other seems to me OK (remove 8.2 version mention at the bottom) Um, are you referring to this (at the top): +// PostgreSQL 8.2 Magic. +#ifdef PG_MODULE_MAGIC +PG_MODULE_MAGIC; +#endif That's gotta be there (though I can of course remove the comment). > 2) citextcmp function returns int but citext_cmp returns int32. I > think citextcmp should returns int32 as well. Yeah, I'm a bit confused by this. I followed what's in varlena.c on this. The comparison functions are documented supposed to return 1, 0, or -1, which of course would be covered by int. But varstr_cmp(), which ultimately returns the value, returns all kinds of numbers. So, perhaps someone could say what it's *supposed* to be? > 3) citext_larger, citext_smaller function have memory leak. You > don't call PG_FREE_IF_COPY on unused text. > > I'm not sure if return value in these function should be duplicated > or not. These I also duplicated from varlena.c, and I asked about a potential memory leak in a previous email. If there's a leak here, would there not also be a leak in varlena.c? > 4) Operator = citext_eq is not correct. See comment http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac So should citextcmp() call strncmp() instead of varst_cmp()? The latter is what I saw in varlena.c. > There must be difference between equality and collation for example > in Czech language 'láska' and 'laská' are different word it means > that 'láska' != 'laská'. But there is no difference in collation > order. See Unicode Universal Collation Algorithm for detail. I'll leave the collation stuff to the functions I call (*far* from my specialty), but I'll add a test for this and make sure it works as expected. Um, although, with what collation should it be tested? The tests I wrote assume en_US.UTF-8. > 5) There are several commented out lines in CREATE OPERATOR > statement mostly related to NEGATOR. Is there some reason for that? I copied it from the original citext.sql. Not sure what effect it has. > Also OPERATOR || has probably wrong negator. Right, good catch. > 6) You use pgTAP for unit test. It is something what should be > widely discussed. It is new approach and I'm not correct person to > say if it good or not. Sure. I created a pgFoundry project for it here: http://pgtap.projects.postgresql.org/ > I see there two problems: > At first point there is not clear licensing (https://svn.kineticode.com/pgtap/trunk/README.pgtap > ). That's a standard revised BSD license. > And, It should be implemented as a general framework by my opinion > not only as a part of citext contrib module. It is. I just put the file in citext so that the tests can use it. It's distributed on pgFoundry and maintained at the SVN URL quoted in the file. > Please, let me know when you will have updated code and I will start > deep testing. Will do. Best, David
pgsql-hackers by date: