RE: About Unicode IVS - Mailing list pgsql-admin
From | Graham Myers |
---|---|
Subject | RE: About Unicode IVS |
Date | |
Msg-id | d60efdf8caa7379a7483cd530ba5098e@mail.gmail.com |
In response to | RE: About Unicode IVS (荒井元成 <n2029@ndensan.co.jp>) |
Responses | RE: About Unicode IVS |
List | pgsql-admin |
Thank you for the explanation; Unicode always blows my mind 😊 The problem is that postgres counts code points, and there are two of them in your example.
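To make the code point count concrete, here is a minimal sketch in Python rather than SQL. It is only an illustration and does not describe anything PostgreSQL does internally. The helper names `is_variation_selector` and `visible_length` are made up for this example. The approach of skipping variation selectors and combining marks is a rough approximation, not full grapheme cluster segmentation as defined in Unicode UAX #29.

```python
import unicodedata


def is_variation_selector(ch: str) -> bool:
    """Return True for U+FE00..U+FE0F and U+E0100..U+E01EF (hypothetical helper).

    Assumption: only these two variation selector blocks matter here; other
    selectors (e.g. the Mongolian ones) are deliberately ignored.
    """
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def visible_length(text: str) -> int:
    """Count code points that are neither variation selectors nor combining marks.

    This approximates "characters as displayed"; it is not a complete
    grapheme cluster count.
    """
    return sum(
        1
        for ch in text
        if not is_variation_selector(ch) and unicodedata.combining(ch) == 0
    )


# The sequence from the thread: U+8FBA followed by variation selector U+E0102.
ivs = "\u8FBA\U000E0102"

print(len(ivs))             # number of code points: 2
print(visible_length(ivs))  # variation selector skipped: 1
```

`len()` here counts code points, just as `length()` and `char_length()` do in the queries quoted below. The second function shows one possible way a custom count could treat the base character and its selector as a single unit.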
From: 荒井元成 <n2029@ndensan.co.jp>
Sent: 29 March 2022 09:21
To: 'Graham Myers' <gmyers@retailexpress.com>; 'David G. Johnston' <david.g.johnston@gmail.com>
Cc: pgsql-admin@lists.postgresql.org
Subject: RE: About Unicode IVS
Thank you for your reply.
It is because the two code points are displayed as a single character.
This is the case for Unicode variation selectors and combining characters.
Moto.
From: Graham Myers <gmyers@retailexpress.com>
Sent: Tuesday, March 29, 2022 4:46 PM
To: 荒井元成 <n2029@ndensan.co.jp>; David G. Johnston <david.g.johnston@gmail.com>
Cc: pgsql-admin@lists.postgresql.org
Subject: RE: About Unicode IVS
Why do you expect the concatenation of two characters to return a length of one?
From: 荒井元成 <n2029@ndensan.co.jp>
Sent: 29 March 2022 05:35
To: 'David G. Johnston' <david.g.johnston@gmail.com>
Cc: pgsql-admin@lists.postgresql.org
Subject: RE: About Unicode IVS
Thank you for your reply.
char_length also returns 2 characters:
select char_length(U&'\+008FBA' || U&'\+0E0102');
char_length
-------------
2
(1 row)
select length('辺󠄂');
length
--------
2
(1 row)
select char_length('辺󠄂');
char_length
-------------
2
(1 row)
$ psql -l
                           List of databases
   Name    |  Owner  | Encoding | Collate | Ctype |  Access privileges
-----------+---------+----------+---------+-------+---------------------
 D209007   | D209007 | UTF8     | C       | C     |
 postgres  | D209007 | UTF8     | C       | C     |
 template0 | D209007 | UTF8     | C       | C     | =c/D209007         +
           |         |          |         |       | D209007=CTc/D209007
 template1 | D209007 | UTF8     | C       | C     | =c/D209007         +
           |         |          |         |       | D209007=CTc/D209007
(4 rows)
$ cat pgdata/PG_VERSION
13
Moto.
From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Tuesday, March 29, 2022 12:38 PM
To: 荒井元成 <n2029@ndensan.co.jp>
Cc: pgsql-admin@lists.postgresql.org
Subject: Re: About Unicode IVS
On Monday, March 28, 2022, 荒井元成 <n2029@ndensan.co.jp> wrote:
Hi,
The length() function returns 2 characters where I want it to return 1.
Can this be handled by changing a setting, such as the collation, as in SQL Server?
Also, if you know of a way to deal with it (e.g., creating my own function), I would appreciate as much information as you can provide.
Try char_length(text) instead.
David J.