Re: Supporting SJIS as a database encoding - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Supporting SJIS as a database encoding
Date
Msg-id 20160907.161304.112519789.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: Supporting SJIS as a database encoding  ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
Responses Re: Supporting SJIS as a database encoding
Re: Supporting SJIS as a database encoding
List pgsql-hackers
Hello,

At Tue, 6 Sep 2016 03:43:46 +0000, "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> wrote in
<0A3221C70F24FB45833433255569204D1F5E66CE@G01JPEXMBYT05>
> > From: pgsql-hackers-owner@postgresql.org
> > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kyotaro
> > HORIGUCHI
> > Implementing radix tree code, then redefining the format of mapping table
> > to support radix tree, then modifying mapping generator script are needed.
> > 
> > If no one opposes this, I'll do that.
> 
> +100
> Great analysis and your guts.  I very much appreciate your trial!

Thanks. By the way, there's another issue related to SJIS
conversion.  MS932 has several characters that have multiple code
points. Converting text in this encoding to Unicode and back
causes a round-trip problem. For example,

8754 (ROMAN NUMERAL ONE in NEC specials) => U+2160 (ROMAN NUMERAL ONE) => FA4A (ROMAN NUMERAL ONE in IBM extension)

By my count, 398 characters are affected by this kind of
replacement. In addition, "GAIJI" (private use area) characters
are not allowed. Does this meet your purpose?
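
To make the round-trip problem concrete, here is a minimal sketch in Python. The two tables are toy data containing only the single example above; they are an assumption for illustration and are not the actual conversion maps. The helper names (round_trip, changed_by_round_trip) are hypothetical as well.

```python
# Toy mapping tables built only from the one example in this message.
# They are NOT the real SJIS <-> Unicode conversion maps.
SJIS_TO_UCS = {
    0x8754: 0x2160,  # ROMAN NUMERAL ONE, NEC specials
    0xFA4A: 0x2160,  # ROMAN NUMERAL ONE, IBM extension
}
UCS_TO_SJIS = {
    0x2160: 0xFA4A,  # only one SJIS code can be chosen on the way back
}


def round_trip(code, sjis_to_ucs=SJIS_TO_UCS, ucs_to_sjis=UCS_TO_SJIS):
    """Convert an SJIS code to Unicode and back to SJIS."""
    return ucs_to_sjis[sjis_to_ucs[code]]


def changed_by_round_trip(sjis_to_ucs, ucs_to_sjis):
    """Return the SJIS codes that do not come back as themselves."""
    return sorted(
        code
        for code, ucs in sjis_to_ucs.items()
        if ucs_to_sjis.get(ucs) != code
    )


if __name__ == "__main__":
    for code in changed_by_round_trip(SJIS_TO_UCS, UCS_TO_SJIS):
        ucs = SJIS_TO_UCS[code]
        print(f"{code:04X} => U+{ucs:04X} => {round_trip(code):04X}")
```

With these toy tables the script prints "8754 => U+2160 => FA4A". Running the same check over complete mapping tables would be one way to count the affected characters.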

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




