Re: BUG #5469: regexp_matches() has poor behaviour and more poor documentation - Mailing list pgsql-bugs
From | Robert Haas |
---|---|
Subject | Re: BUG #5469: regexp_matches() has poor behaviour and more poor documentation |
Date | |
Msg-id | AANLkTilIkP5cqXLiVahSCOYuyU6gMu8Bdcany2yHtzOH@mail.gmail.com Whole thread Raw |
In response to | BUG #5469: regexp_matches() has poor behaviour and more poor documentation ("Daniele Varrazzo" <daniele.varrazzo@gmail.com>) |
Responses |
Re: BUG #5469: regexp_matches() has poor behaviour and more
poor documentation
Re: BUG #5469: regexp_matches() has poor behaviour and more poor documentation |
List | pgsql-bugs |
On Mon, May 24, 2010 at 9:16 AM, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote: > regexp_matches() has been recently discussed > (http://archives.postgresql.org/pgsql-bugs/2010-04/msg00026.php): it is a > setof function and as such it can drop results. > > Unfortunately it is an useful function to newcomers who use SQL, use regexps > but don't arrive to read the chapter 34.4.7. (i.e. Extending SQL -> Query > Language (SQL) Functions -> SQL Functions Returning Sets) and are not so > enlightened to know that "setof text[]" means "if it doesn't match, it drops > the record". They just expect the function to be a LIKE on steroids. > > Please describe the behavior in the documentation of the function (i.e. > table 9-6. and section 9.7.3), possibly provide a function with a saner > interface, i.e. returning a text[] of the first match or NULL on no match, > or document a workaround (suitable for an user knowing regexps but not > setof-returning functions) to make the function not dropping record (e.g. I > fixed the "bug" adding a "|" at the end of the pattern, so that the function > returns an array of NULL in case of no match: I don't think it is a trivial > workaround). I'm not sure that it's very productive to refer to the behavior of our code as insane. We do document this in section 9.7.3, pretty explicitly: "The regexp_matches function returns all of the captured substrings resulting from matching a POSIX regular expression pattern. It has the syntax regexp_matches(string, pattern [, flags ]). If there is no match to the pattern, the function returns no rows. If there is a match, the function returns a text array whose n'th element is the substring matching the n'th parenthesized subexpression of the pattern (not counting "non-capturing" parentheses; see below for details)." I think that's pretty clear. Your mileage may vary, of course. I'm less confident than what we have in table 9-6 (other string functions, in section 9.4, string functions and operators) is clear on first reading, but neither do I immediately know how to improve it. Perhaps instead of critiquing our insanity you could provide some specific suggestions for improvement. Similarly, if you think we should have another function besides regexp_matches(), rather than just complaining that we don't, it would be more useful to suggest a name and a specific behavior and ideally maybe even provide a patch (or just the docs portion of a patch) - especially if you can point to a specific use-case that is hard to do with the SRF but would be easier with a function with a different interface. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
pgsql-bugs by date: