Thread: initdb fails on Traditional Chinese machine when postgres install directory contains Chinese characters.

Hi,

I downloaded postgres 8.4 in zip format and installed it under c:\postgres�裘� on a Traditional Chinese Windows 2003 server. Note the Chinese characters in the folder name. Then I tried to create a database using initdb. I ran the following command:

 

initdb.exe --encoding UTF-8 -D  c:\mydb\db --username user1 -W -L c:\postgres�裘�\share

 

The command failed with the following output, while loading the system objects' descriptions. The command works fine if I install postgres in a folder whose name contains only ASCII characters.

 

The files belonging to this database system will be owned by user "Administrator".

This user must also own the server process.

 

The database cluster will be initialized with locale Chinese_Taiwan.950.

initdb: could not find suitable text search configuration for locale Chinese_Taiwan.950

The default text search configuration will be set to "simple".

 

creating directory c:/BCA-Networks-Data/db ... ok

creating subdirectories ... ok

selecting default max_connections ... 100

selecting default shared_buffers/max_fsm_pages ... 32MB/204800

creating configuration files ... ok

creating template1 database in c:/BCA-Networks-Data/db/base/1 ... ok

initializing pg_authid ... ok

Enter new superuser password:

Enter it again:

setting password ... ok

initializing dependencies ... ok

creating system views ... ok

loading system objects' descriptions ... FATAL:  invalid byte sequence for encoding "UTF8": 0xa5

HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

child process exited with exit code 1

initdb: removing data directory "c:/mydb/db"

 

Can someone please help me to solve this issue?

Thanks

Sudipta

On 23/01/2010 5:19 AM, Sarkar, Sudipta wrote:
> Hi,
>
> I downloaded postgres 8.4 in zip format and installed it under
> c:\postgres用�裘� on a traditional Chinese windows 2003 server. Note the
> Chinese characters in the folder name. Then I tried to create a database
> using initdb. I specified the following command:
>
> initdb.exe --encoding UTF-8 -D c:\mydb\db --username user1 -W -L
> c:\postgres用�裘�\share

Hi

I'd like to try to reproduce this issue, but as I don't have a Chinese
localized Windows install I can't use the appropriate characters on the
console.

Is there any way you know of to switch Windows' locale in a cmd.exe
(console) window so you can use other locale's charsets?  What is the
name of the encoding Windows uses on your system? I know how to do all
this stuff in Linux, but everything language/locale related seems to be
painfully hard, expensive, and complicated under Windows.

Windows (except Vista Ultimate and 7 Ultimate) doesn't offer the option
to change languages for the system (MUI), and the language interface
packs (LIP) only work on top of a particular base language and don't
support major languages. I can't really install a Chinese Windows VM, as
I *really* don't have the language skills to navigate around it and test
with it.

Anyway, what I expect is happening here is that initdb is assuming that
the path is in the database encoding (UTF-8 here), when it's actually in
the system's native encoding. If you're using a path that is valid in both
encodings (i.e. each byte means the same thing) then you get away with it,
which is why ASCII works.
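
To illustrate the byte-level point, here is a rough Python sketch. It is not
what initdb does internally; it only shows why a path encoded in the native
code page can be rejected as UTF-8. The folder name "中文" is a made-up
stand-in (not the one from the report), and cp950 is assumed to be the native
encoding because the locale in the output is Chinese_Taiwan.950.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Paths as the OS would hand them over in the native code page (assumed cp950).
# The Chinese folder name is hypothetical, chosen only for illustration.
ascii_path = "c:\\postgres\\share".encode("cp950")
chinese_path = "c:\\postgres中文\\share".encode("cp950")

# ASCII bytes are identical in cp950 and UTF-8, so this path passes.
print(is_valid_utf8(ascii_path))    # True

# The cp950 bytes for the Chinese characters are not valid UTF-8 sequences.
print(is_valid_utf8(chinese_path))  # False
```

If that is the cause, the path bytes would need converting from the native
encoding to the database encoding before the server sees them.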

Most likely initdb needs to set client_encoding at some point where it's
forgetting to.

--
Craig Ringer