Re: REVIEW Range Types - Mailing list pgsql-hackers
From | Erik Rijkers |
---|---|
Subject | Re: REVIEW Range Types |
Date | |
Msg-id | a0804f90179263648c96f89d481cd85e.squirrel@webmail.xs4all.nl Whole thread Raw |
In response to | Range Types (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: REVIEW Range Types
|
List | pgsql-hackers |
On Sun, February 6, 2011 07:41, Jeff Davis wrote: > New patch. All known TODO items are closed, although I should do a > cleanup pass over the code and docs. > > Fixed in this patch: > > * Many documentation improvements > * Added INT8RANGE > * Renamed PERIOD[TZ] -> TS[TZ]RANGE > * Renamed INTRANGE -> INT4RANGE > * Improved parser's handling of whitespace and quotes > * Support for PL/pgSQL functions with ANYRANGE arguments/returns > * Make "subtype_float" function no longer a requirement for GiST, > but it should still be supplied for the penalty function to be > useful. > I'm afraid even the review is WIP, but I thought I'd post what I have. Context: At the moment we use postbio (see below) range functionality, to search ranges and overlap in large DNA databases ('genomics'). We would be happy if a core data type could replace that. It does look like the present patch is ready to do those same tasks, of which the main one for us is gist-indexed ranges. We also use btree_gist with that, so to include that in core would make sense in this regard. test config: ./ configure \ --prefix=/var/data1/pg_stuff/pg_installations/pgsql.range_types \ --with-pgport=6563 \ --enable-depend \ --enable-cassert\ --enable-debug \ --with-perl \ --with-openssl \ --with-libxml \ --enable-dtrace compile, make, check, install all OK. ------------------ Submission review: ------------------ * Is the patch in context diff format? Yes. * Does it apply cleanly to the current git master? It applied cleanly. (after the large serialisation commit 2011.02.08 it will need some changes/rebase) * Does it include reasonable tests, necessary doc patches, etc? Yes, there are many tests; the documentation is good. Small improvements below. ----------------- Usability review ----------------- Read what the patch is supposed to do, and consider: * Does the patch actually implement that? Yes. * Do we want that? Yes. * Do we already have it? contrib/seg has some similar functionalities: "seg is a data type for representing line segments, or floating point intervals". And on pgFoundry there is a seg spin-off "postbio", tailored to genomic data (with gist-indexing). (see postbio manual: http://postbio.projects.postgresql.org/ ) * Does it follow SQL spec, or the community-agreed behavior? I don't know - I couldn't find much in the SQL-spec on a range datatype. The ranges behaviour has been discussed on -hackers. * Does it include pg_dump support (if applicable)? dump/restore were fine in the handful of range-tables which I moved between machines. * Are there dangers? Not that I could find. * Have all the bases been covered? I think the functionality looks fairly complete. ------------- Feature test: ------------- * Does the feature work as advertised? The patch seems very stable. My focus has been mainly on the intranges. I tested by parsing documentation- and regression examples, and parametrising them in a perl harness, to generate many thousands of range combinations. I found only a single small problem (posted earlier - Jeff Davis solved it already apparently). see: http://archives.postgresql.org/pgsql-hackers/2011-02/msg00387.php * Are there corner cases the author has failed to consider? No. * Are there any assertion failures or crashes? I haven't seen a single one. -------------- Documentation: -------------- Section 9.18: table 9-42. range functions: The following functions are missing (I encountered them in the regression tests): contained_by() range_eq() section 'Constructing Ranges' (8.16.6): In the code example, remove the following line: "-- the int4range result will appearin the canonical format" it doesn't make sense there. At this place "canonical format" has not been discussed; maybe it is not even discussed anywhere. also (same place): 'where "_" is used to mean "exclusive" and "" is used to mean "inclusive".' should be: 'where "_" is used to mean "exclusive" and "i" is used to mean "inclusive".' And btw: it should mention here that the range*inf* functions, an underscore to separate 'range' from the rest of the function name, e.g.: range_linfi_() => infinite lower bound, inclusiveupper bound I still want to do Performance review and Coding review. FWIW, I would like to repeat that my impression is that the patch is very stable, especially with regard to the intranges (tested extensively). regards, Erik Rijkers
pgsql-hackers by date: