Re: MSSQL to PostgreSQL : Encoding problem - Mailing list pgsql-general
From | Arnaud Lesauvage |
---|---|
Subject | Re: MSSQL to PostgreSQL : Encoding problem |
Date | |
Msg-id | 45645FFA.2040006@freesurf.fr |
In response to | Re: MSSQL to PostgreSQL : Encoding problem (Alvaro Herrera <alvherre@commandprompt.com>) |
Responses | Re: MSSQL to PostgreSQL : Encoding problem |
List | pgsql-general |
Alvaro Herrera wrote:
> Arnaud Lesauvage wrote:
>> Alvaro Herrera wrote:
>> >Arnaud Lesauvage wrote:
>> >>Tomi NA wrote:
>> >>>>I think I'll go this way... No other choice, actually!
>> >>>>The MSSQL database is in SQL_Latin1_General_CP1_CI_AS.
>> >>>>I don't really understand what this is. It supports the euro
>> >>>>symbol, so it is probably not pure LATIN1, right?
>> >>>
>> >>>I suppose you'd have to look at the latin1 codepage character table
>> >>>somewhere...I'm a UTF-8 guy so I'm not well suited to respond to the
>> >>>question. :)
>> >>
>> >>Yep, http://en.wikipedia.org/wiki/Latin-1 tells me that
>> >>LATIN1 is missing the euro sign...
>> >>Grrrrr I hate this !!!
>> >
>> >So use Latin9 ...
>>
>> Of course, but it doesn't work !!!
>> Whatever client encoding I choose in postgresql before
>> COPYing, I get the 'invalid byte sequence error'.
>
> Humm ... how are you choosing the client encoding? Is it actually
> working? I don't see how choosing Latin1 or Latin9 and feeding whatever
> byte sequence would give you an "invalid byte sequence". These charsets
> don't have any way to validate the bytes, as opposed to what UTF-8 can
> do. So you could end up with invalid bytes if you choose the wrong
> client encoding, but that's a different error.
>

mydb=# SET client_encoding TO LATIN9;
SET
mydb=# COPY statistiques.detailrecherche (log_gid, champrecherche, valeurrecherche) FROM 'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV;
ERROR: invalid byte sequence for encoding "LATIN9": 0x00
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT: COPY detailrecherche, line 9212

mydb=# SET client_encoding TO WIN1252;
SET
mydb=# COPY statistiques.detailrecherche (log_gid, champrecherche, valeurrecherche) FROM 'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV;
ERROR: invalid byte sequence for encoding "WIN1252": 0x00
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT: COPY detailrecherche, line 9212

Really, I'd rather have another error, but this is all I can get.
This is with the "ANSI" export.

With the "UNICODE" export:

mydb=# SET client_encoding TO UTF8;
SET
mydb=# COPY statistiques.detailrecherche (log_gid, champrecherche, valeurrecherche) FROM 'E:\\Production\\Temp\\detailrecherche_unicode.csv' CSV;
ERROR: invalid byte sequence for encoding "UTF8": 0xff
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT: COPY detailrecherche, line 592680

So, line 592680 is *a lot* better, but it is still not good!

--
Arnaud
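
Both "ANSI" attempts fail on the same byte, 0x00, at the same line, which suggests the export itself contains NUL bytes rather than a wrong client encoding; PostgreSQL does not accept a NUL byte in text values whichever encoding is selected. A minimal sketch for locating such bytes before running COPY, assuming Python is available on the machine that holds the export. The function name `find_suspect_lines` is hypothetical, and the codec argument is a Python codec name, not a PostgreSQL encoding name:

```python
import io
from typing import BinaryIO, Iterator, Tuple


def find_suspect_lines(stream: BinaryIO, encoding: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, reason) for each line that contains a NUL byte
    or that cannot be decoded with the given Python codec.

    Lines are split on b"\\n", so the numbers may not match COPY's exactly
    if quoted CSV fields contain embedded newlines.
    """
    for line_number, raw in enumerate(stream, start=1):
        if b"\x00" in raw:
            # Report only the first NUL on the line.
            yield line_number, "NUL byte at column %d" % (raw.index(b"\x00") + 1)
            continue
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte.
            yield line_number, "byte 0x%02x at column %d" % (raw[exc.start], exc.start + 1)


# Small in-memory sample standing in for the real export file.
sample = b"1,abc\n2,d\x00f\n3,\xff\n"
for line_number, reason in find_suspect_lines(io.BytesIO(sample), "utf-8"):
    print(line_number, reason)
# prints:
# 2 NUL byte at column 4
# 3 byte 0xff at column 3
```

To check a real file, open it in binary mode (`open(path, "rb")`) and pass the stream with the codec that matches the intended client encoding: `"cp1252"` for WIN1252, `"iso8859_15"` for LATIN9, `"utf-8"` for UTF8. The report only shows where the offending bytes are; deciding whether to strip them or re-export from MSSQL is a separate step.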