Re: PATCH: CITEXT 2.0 v3 - Mailing list pgsql-hackers
From | David E. Wheeler |
---|---|
Subject | Re: PATCH: CITEXT 2.0 v3 |
Date | |
Msg-id | A2AA4B32-A916-497C-9C2B-DC69B8BF952A@kineticode.com Whole thread Raw |
In response to | Re: PATCH: CITEXT 2.0 v3 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: PATCH: CITEXT 2.0 v3
Re: PATCH: CITEXT 2.0 v3 Re: PATCH: CITEXT 2.0 v3 |
List | pgsql-hackers |
On Jul 12, 2008, at 14:57, Tom Lane wrote: > There was some discussion earlier about whether the proposed > regression > tests for citext are suitable for use in contrib or not. After > playing > with them for awhile, I have to come down very firmly on the side of > "not". I have these gripes: You're nothing if not thorough, Tom. > 1. The style is gratuitously different from every other regression > test > in the system. This is not a good thing. If it were an amazingly > better way to do things, then maybe, but as far as I can tell the > style > pgTAP forces on you is really pretty darn poorly suited for SQL tests. > You have to contort what could naturally be expressed in SQL as a > table > result into a scalar. Plus it's redundant with the expected-output > file. Sure. I wasn't aware of the existing regression test methodology when I wrote pgTAP and those tests. I'm fine to change them and use pgTAP for testing my app code, rather than PostgreSQL itself. > 2. It's ridiculously slow; at least a factor of ten slower than doing > equivalent tests directly in SQL. This is a very bad thing. Speed of > regression tests matters a lot to those of us who run them a dozen > times > per day --- and I do not wish to discourage any developers who don't > work that way from learning better habits ;-) Hrm. I'm wonder why it's so slow? The test functions don't really do a lot. Anyway, I agree that they should perform well. > Because of #1 and #2 I find the use of pgTAP to be a nonstarter. > I have a couple of gripes about the substance of the tests as well: > > 3. What you are mostly testing is not the behavior of citext itself, > but the behavior of the underlying strcoll function. This is pretty > much doomed to failure, first because the regression tests are > intended > to work in multiple locales (and we are *not* giving that up for > citext), and second because the behavior of strcoll isn't all that > portable across platforms even given allegedly similar locale settings > (as we already found out yesterday). Yes, I *just* ran the tests on Ubuntu and opened my mail to ask about the bizarre differences when I saw this message. Thanks for answering my question before I asked it. :-) > Sadly, I think you have to give up > attempts to check the interesting multibyte cases and confine yourself > to tests using ASCII strings. Grr. Kind of defeats the purpose. Is there no infrastructure for testing multibyte functionality? Are test database clusters all built using SQL_ASCII and the C locale? > 4. A lot of the later test cases are equally uselessly testing whether > piggybacking over text functions works. The odds of ever finding > anything with those tests are not distinguishable from zero (unless > the > underlying text function is busted, which is not your responsibility > to > test). So I don't see any point in putting them into the standard > regression package. (What maybe *would* be useful to test, but you > didn't, is whether the result of a function is considered citext > rather > than text.) Well, I was doing test-driven development: I wrote the tests to ensure that I had added the functions for CITEXT properly, and when they passed, I could move on. As a unit tester, it'd feel weird for me to drop these. I'm not testing the underlying functions; I'm making sure that they work properly with CITEXT. > I suggest something more like the attached as a suitable regression > test. I got bored about halfway through and started to skim, so there > might be a few tests toward the end of the original set that would be > worth transposing into this one, but nothing jumped out at me. Thanks! I'll work this in. Best, David PS: I confirmed late yesterday that the memory leak I saw was with my version for 8.3, where I had my own str_tolower(). It clearly has a leak that I'll have to fix, but it has no bearing on the contribution to 8.4, which has no such leak. Thanks for running the test yourself to confirm.
pgsql-hackers by date: