Re: Fixing row comparison semantics - Mailing list pgsql-hackers
From | Martijn van Oosterhout |
---|---|
Subject | Re: Fixing row comparison semantics |
Date | |
Msg-id | 20051225131005.GA23081@svana.org |
In response to | Re: Fixing row comparison semantics (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Fixing row comparison semantics |
List | pgsql-hackers |
On Sat, Dec 24, 2005 at 09:38:23AM -0500, Tom Lane wrote:
> Are you suggesting that COLLATE will impose comparison semantics on
> all datatypes including non-string types? If so, I'd be interested
> to know what you have in mind. If not, claiming that it makes the
> issue go away is nonsensical.

Well, yes, on all data types. It needs to be done for string types and it would be nice for user-defined data types, so you may as well do it for all types. It avoids adding special cases, which is a good thing, IMHO.

Every data type has at least two collations, ascending and descending. So instead of all the current stuff with reverse operator classes, you'll just be able to declare your index as:

CREATE INDEX blah ON foo (a, b COLLATE DESC);

And it'll be able to be used for queries using ORDER BY a, b DESC.

String data types are just the obvious example of types that have many different collations, and they do have the most possibilities. But I think that user-defined collations would be a powerful idea. All users need to do is create a btree operator class that describes the basic order, and then they can use this as a collation anywhere they like.

I hope you are not thinking of restricting collations to just string types, because the special cases would be dreadful. Doing it this way just means that most places dealing with order (e.g. pathkeys) only need to worry about the collation and not the implementation.

Technically speaking, NULLS FIRST/LAST are also a form of collation, but I'm not going to touch those until I can at least replicate current functionality, and they are not relevant for row comparisons anyway.

Collations to operator classes are a many-to-one relationship. I can see situations where you would have 20 collations using a single operator class. Locale-specific ordering is really just a subset of collation.
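
To make the idea above concrete, here is a rough sketch in Python rather than backend code. It assumes a collation can be modelled as a name that maps onto a shared three-way comparison (standing in for a btree operator class) plus a direction. The names `btree_cmp`, `COLLATIONS`, `compare_rows` and `sort_rows` are made up for illustration and do not correspond to anything in the PostgreSQL source.

```python
from functools import cmp_to_key


def btree_cmp(x, y):
    """Hypothetical stand-in for a btree operator class: returns -1, 0 or 1."""
    return (x > y) - (x < y)


# Hypothetical collation table. Several collation names share one comparison
# function (many-to-one); here they differ only in the sign applied to it.
COLLATIONS = {
    "asc": (btree_cmp, 1),
    "desc": (btree_cmp, -1),
}


def compare_rows(row_a, row_b, collations):
    """Compare two rows column by column, left to right.

    The first column that differs under its collation decides the result.
    """
    for x, y, name in zip(row_a, row_b, collations):
        cmp, sign = COLLATIONS[name]
        result = sign * cmp(x, y)
        if result != 0:
            return result
    return 0


def sort_rows(rows, collations):
    """Sort rows by the given per-column collations."""
    return sorted(
        rows, key=cmp_to_key(lambda r, s: compare_rows(r, s, collations))
    )


# Roughly analogous to ORDER BY a, b DESC:
rows = [(1, "x"), (2, "a"), (1, "z"), (2, "c")]
ordered = sort_rows(rows, ["asc", "desc"])
```

In this sketch the caller only names a collation per column, and the comparison function behind it stays hidden, which is the separation between collation and implementation argued for above.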

At the moment my implementation just uses the xlocale support present in glibc/MacOS X/Win32, but my hope is that it will be pluggable in the sense that you'll be able to say:

CREATE LOCALE hungarian AS 'hu_HU' USING glibc;
CREATE LOCALE serbian_us AS 'sr_Latn_YU_REVISED@currency=USD' USING icu;

(The latter being: Serbian (Latin, Yugoslavia, Revised Orthography, Currency=US Dollar). Example taken from the ICU website.)

Then you can use these in column declarations and have them automatically use that locale for comparisons.

It isn't as hard as it looks, but it does touch a lot of different places in the backend. If you want technical details I can do that too (the summary on pg-patches a while ago is now wildly out of date). Currently I'm trying to get up to speed on pathkeys and indexes before the tree drifts too far...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.