Hi,
I'm looking for a data type to store numerically labelled hierarchical data, such as section.subsection.paragraph
numbers(e.g. '1.5.3', '1.10.2', '6.30').
The closest data type that I have found is ltree. However, the collation order is inappropriate: it would put '1.10.2'
before'1.5.3', since it performs a naïve memcmp() at each level.[1]
One way to get the desired sort order would be to use the semver extension. However, that's not really appropriate, as
Idon't want to store version numbers, and my data do not fit server's mandatory X.Y.Z three-level scheme.
Of course, I could define a hierarchy-of-integers data type and implement my own comparison functions. I'm reluctant
tocause a proliferation of data types, though, as ltree is semantically the type I want. I'm just unhappy with its
sortorder.
Therefore, I would like to suggest that ltree be modified to use a smart comparator that recognizes numbers within
stringsand sorts them in a human-friendly way. Apple[2] and recent versions of Windows[3] handle filenames this way.
Onesample implementation of such a comparator is natsort.[4]
The performance impact of the enhanced comparator would probably be negligible, compared to I/O bottlenecks. A bigger
issuewould be backwards compatibility, especially for ltrees with existing btree indexes.
Feedback? Suggestions?
Derek
[1]: http://doxygen.postgresql.org/ltree__op_8c.html#a635600ad7aad78addf3c14a6e2d67fed
[2]:
https://developer.apple.com/LIBRARY/IOS/#documentation/FileManagement/Conceptual/FileSystemProgrammingGUide/FileSystemDetails/FileSystemDetails.html
[3]: http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html
[4]: http://sourcefrog.net/projects/natsort/