Re: Per-column collation, work in progress - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Per-column collation, work in progress
Date
Msg-id AANLkTimb3+_E7=3o9u6_GEV7V3w=FmRMroSgw60SNWrJ@mail.gmail.com
In response to Re: Per-column collation, work in progress  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Per-column collation, work in progress
List pgsql-hackers
On Sun, Sep 26, 2010 at 1:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Is there any reason why you prohibit a different encodings per one
> database? Actually people expect from collate per column a possibility
> to store a two or more different encodings per one database.

These are two completely separate problems that only look related. The
main difference is that collation is a property of the comparison or
sort you're performing, while encoding is a property of the string
itself. It doesn't make sense to specify a different encoding than the
one the string's bytes are actually in.
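
To make the distinction concrete, here is a minimal sketch in Python
rather than in PostgreSQL itself. The word list and the two ordering
rules are made up for illustration and stand in for real collations;
nothing here reflects how the per-column collation patch works.

```python
# Illustrative data only; any list of strings would do.
words = ["banana", "apple", "Cherry"]

# Collation belongs to the comparison: the same strings come out in a
# different order depending on the rule requested at sort time.
by_codepoint = sorted(words)                    # uppercase sorts before lowercase
by_casefold = sorted(words, key=str.casefold)   # case-insensitive ordering

# Encoding belongs to the data: these bytes are UTF-8 no matter what
# label we attach to them afterwards.
raw = "\u00e9".encode("utf-8")    # the two bytes 0xC3 0xA9
right = raw.decode("utf-8")       # the original one-character string
wrong = raw.decode("latin-1")     # two unrelated characters: the label was wrong

print(by_codepoint)
print(by_casefold)
print(repr(right), repr(wrong))
```

Choosing a different sort rule is a legitimate request; choosing a
different encoding label just misreads the bytes.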

You can already do what you want by using bytea columns and
convert_to/convert_from. Building the support into text wouldn't make
it much easier, since you would still have to keep track of the
encoding each string is in and the encoding you want. We could have an
encoded_text data type which stores both the encoding and the string,
and for which any comparison function automatically handles conversion
based on the encoding of the collation requested -- but I wouldn't
want that to be the default text datatype. It would impose a lot of
overhead on the basic text operations and magnify the effects of
choosing the wrong collation.
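
A rough sketch of what such an encoded_text type might look like,
again in Python purely for illustration. The EncodedText class, the
compare function, and the use of str.casefold as a stand-in collation
are all hypothetical names and choices of this sketch; no such type
exists in PostgreSQL.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EncodedText:
    """Hypothetical value that carries its own encoding next to its bytes."""
    encoding: str
    data: bytes

    def to_str(self) -> str:
        # Conversion happens on every access, which is where the
        # overhead on basic text operations would come from.
        return self.data.decode(self.encoding)


def compare(a: EncodedText, b: EncodedText,
            collation_key: Callable[[str], object]) -> int:
    """Decode both operands to a common form, then apply the requested ordering."""
    ka = collation_key(a.to_str())
    kb = collation_key(b.to_str())
    return (ka > kb) - (ka < kb)


# The same name stored in two different encodings and two letter cases.
latin1 = EncodedText("latin-1", "\u00c9mile".encode("latin-1"))
utf8 = EncodedText("utf-8", "\u00e9mile".encode("utf-8"))

bytes_equal = latin1.data == utf8.data            # raw bytes differ
casefold_cmp = compare(latin1, utf8, str.casefold)  # equal under this ordering
codepoint_cmp = compare(latin1, utf8, lambda s: s)  # uppercase sorts first

print(bytes_equal, casefold_cmp, codepoint_cmp)
```

Every comparison has to decode both operands before the collation is
even consulted, and the result depends entirely on which ordering the
caller asked for.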

-- 
greg

