Re: Built-in CTYPE provider - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Built-in CTYPE provider
Date
Msg-id 27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
In response to Re: Built-in CTYPE provider  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Built-in CTYPE provider
List pgsql-hackers
On 13.02.24 03:01, Jeff Davis wrote:
> 1. The SQL spec mentions the capitalization of "ß" as "SS"
> specifically. Should UCS_BASIC use the unconditional mappings in
> SpecialCasing.txt? I already have some code to do that (not posted
> yet).

It is my understanding that "correct" Unicode case conversion needs to
use at least some parts of SpecialCasing.txt.  The header of the file says:

"For compatibility, the UnicodeData.txt file only contains simple case 
mappings for characters where they are one-to-one and independent of 
context and language. The data in this file, combined with the simple 
case mappings in UnicodeData.txt, defines the full case mappings [...]"

I read this as saying that using UnicodeData.txt by itself is incomplete.

I think we need to use the "Unconditional" mappings and the "Conditional
Language-Insensitive" mappings (which is just the Greek final sigma rule).
Obviously, we should skip the "Language-Sensitive" mappings.
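
To make the distinction concrete, here is a minimal Python sketch of what I mean, not a proposal for the actual implementation.  The table is a hand-picked excerpt of the "Unconditional" section, the function names are made up for illustration, the per-character fallback only approximates the simple mappings in UnicodeData.txt, and the final sigma test is simplified (the real Final_Sigma condition looks at cased letters and skips case-ignorable characters).

```python
# Hand-picked excerpt of the "Unconditional" section of SpecialCasing.txt:
# character -> full uppercase mapping. Illustrative subset, not the whole file.
UNCONDITIONAL_UPPER = {
    "\u00DF": "SS",       # LATIN SMALL LETTER SHARP S
    "\uFB01": "FI",       # LATIN SMALL LIGATURE FI
    "\u0149": "\u02BCN",  # LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
}

CAPITAL_SIGMA = "\u03A3"
SMALL_SIGMA = "\u03C3"
FINAL_SIGMA = "\u03C2"


def simple_upper(ch):
    """Approximate the one-to-one uppercase mapping for a single character.

    Assumption: a one-character result of str.upper() stands in for the
    simple mapping; characters whose result is longer are left unchanged.
    """
    up = ch.upper()
    return up if len(up) == 1 else ch


def simple_lower(ch):
    """Approximate the one-to-one lowercase mapping (same assumption)."""
    low = ch.lower()
    return low if len(low) == 1 else ch


def full_upper(s):
    """Uppercase using the unconditional mappings, else the simple mapping."""
    return "".join(UNCONDITIONAL_UPPER.get(ch, simple_upper(ch)) for ch in s)


def full_lower(s):
    """Lowercase with a simplified language-insensitive final sigma rule."""
    out = []
    for i, ch in enumerate(s):
        if ch == CAPITAL_SIGMA:
            # Simplification: isalpha() stands in for "cased letter", and
            # case-ignorable characters are not skipped.
            preceded = i > 0 and s[i - 1].isalpha()
            followed = i + 1 < len(s) and s[i + 1].isalpha()
            out.append(FINAL_SIGMA if preceded and not followed else SMALL_SIGMA)
        else:
            out.append(simple_lower(ch))
    return "".join(out)
```

With only the simple mappings, "ß" has no uppercase form and stays as it is; with the unconditional table, full_upper("straße") gives "STRASSE".  Likewise, full_lower() turns a word-final capital sigma into the final form and any other capital sigma into the regular small sigma.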


