Thread: How to restore a SQL-ASCII encoded database to a new UTF-8 db?
Hi,

I have a database that was created with SQL-ASCII encoding (unfortunately). I ran pg_restore to load the structure and data into a new database with UTF-8 encoding but, no surprise, I'm seeing this error for a number of tables:

pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte sequence for encoding "UTF8"

Any idea on how I can copy the data between these databases without any data loss? For some reason I thought that a conversion to Unicode would be easy.

Thanks
> I have a database that was created with SQL-ASCII encoding
> (unfortunately). I ran pg_restore to load the struct and data into a
> new database with UTF-8 encoding but no surprise- I'm seeing this
> error for a number of tables:
>
> pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte
> sequence for encoding "UTF8"
>
> Any idea on how I can copy the data between these databases without
> any data loss? For some reason I thought that a conversion to Unicode
> would be easy.

Conversion to Unicode is easy if you know the encoding of your data and that encoding is consistent :^)

Try to figure out the encoding of your data. Then dump in text format and change the "SET client_encoding" line in the dump accordingly.

Yours,
Laurenz Albe
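For the "figure out the encoding" step, here is a minimal sketch (not anything from Laurenz's message) that lists the lines of a text-format dump which are not valid UTF-8 and shows how each would read under a few candidate encodings. The function names and the candidate list are my own assumptions. Note that Python's "latin-1" codec accepts every byte, so a clean decode proves nothing by itself; you still have to look at the result and decide whether the text makes sense.

```python
def non_utf8_lines(dump: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8."""
    bad = []
    for number, line in enumerate(dump.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad


def preview(line: bytes, candidates=("latin-1", "cp1252")):
    """Decode one suspicious line under each candidate encoding for eyeballing.

    The candidate names are Python codec names chosen for illustration only.
    """
    return {name: line.decode(name, errors="replace") for name in candidates}


if __name__ == "__main__":
    # Tiny stand-in for a real dump: one row contains the single byte 0xE9.
    sample = b"COPY t (name) FROM stdin;\nJos\xe9\nplain ascii\n\\.\n"
    for number, line in non_utf8_lines(sample):
        print(number, preview(line))
```

If the offending lines read correctly under one candidate and that holds across all tables, that is a reasonable guess for the encoding to put in the dump.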
Postgres User wrote:
> Hi,
>
> I have a database that was created with SQL-ASCII encoding
> (unfortunately). I ran pg_restore to load the struct and data into a
> new database with UTF-8 encoding but no surprise- I'm seeing this
> error for a number of tables:
>
> pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte sequence for encoding "UTF8"
>
> Any idea on how I can copy the data between these databases without
> any data loss? For some reason I thought that a conversion to Unicode
> would be easy.

Provided the dump doesn't actually contain characters from different character sets, or invalid characters, you may be able to import it just by changing the client encoding in the dump. There's probably a line saying something like

    SET client_encoding = 'SQL_ASCII';

If you change that to

    SET client_encoding = 'Whatever_encoding_your_data_is_in';

you may be able to import it. IIRC, PostgreSQL doesn't do any automatic conversion between SQL_ASCII and any other encoding, but if you put the correct encoding there, PostgreSQL will deal with the conversion automatically.

--
Tommy Gildseth
DBA, Gruppe for databasedrift
Universitetet i Oslo, USIT
m: +47 45 86 38 50
t: +47 22 85 29 39
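Editing that one line by hand is awkward when the dump is large, so here is a rough sketch of doing it programmatically. It assumes the dump is a plain-text script containing a line of the form SET client_encoding = '...'; and the helper name set_client_encoding is made up for illustration. It works on raw bytes so the data rows are left exactly as they are; only the declared encoding changes.

```python
import re

# Matches e.g.  SET client_encoding = 'SQL_ASCII';  at the start of a line.
_SET_LINE = re.compile(
    rb"^SET\s+client_encoding\s*(?:=|TO)\s*'?[\w-]+'?\s*;",
    re.IGNORECASE | re.MULTILINE,
)


def set_client_encoding(dump: bytes, encoding: str) -> bytes:
    """Return the dump with its first SET client_encoding line rewritten.

    `encoding` must be a PostgreSQL encoding name such as LATIN1 or WIN1252.
    """
    replacement = b"SET client_encoding = '" + encoding.encode("ascii") + b"';"
    new_dump, count = _SET_LINE.subn(lambda m: replacement, dump, count=1)
    if count == 0:
        raise ValueError("no SET client_encoding line found in dump")
    return new_dump


if __name__ == "__main__":
    dump = b"SET client_encoding = 'SQL_ASCII';\nCOPY t FROM stdin;\nJos\xe9\n\\.\n"
    print(set_client_encoding(dump, "LATIN1").decode("latin-1"))
```

The rewritten script can then be fed to psql against the UTF-8 database, and the server converts from the declared encoding as it loads.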
You can do this by converting the characters in the raw dump file directly. Assuming the data is actually ISO-8859-1 (Latin-1):

    iconv -f 8859_1 -t UTF-8 backup.db.psql > backup.db.psql.utf8

Then change the line in backup.db.psql.utf8 from:

    SET client_encoding = 'SQL_ASCII';

to:

    SET client_encoding = 'UTF8';
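If iconv isn't available, the same byte-level conversion can be sketched in Python. This is only an illustration under the same assumption that the whole file is in a single known encoding. The function name transcode_file is hypothetical and the default source encoding is a guess that you should replace with whatever your data really is.

```python
import shutil


def transcode_file(src_path, dst_path, source_encoding="iso-8859-1"):
    """Rewrite src_path as UTF-8, decoding its bytes as source_encoding."""
    # newline="" keeps line endings exactly as they are in the dump.
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        # Streams in chunks, so large dumps are not loaded into memory at once.
        shutil.copyfileobj(src, dst)
```

For example, transcode_file("backup.db.psql", "backup.db.psql.utf8") mirrors the iconv call. The SET client_encoding line in the output still has to be changed to 'UTF8' afterwards, as described above.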
View this message in context: Re: How to restore a SQL-ASCII encoded database to a new UTF-8 db?
Sent from the PostgreSQL - general mailing list archive at Nabble.com.