Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail - Mailing list pgsql-bugs
| From | Jan Lentfer |
|---|---|
| Subject | Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail |
| Date | |
| Msg-id | 666abab3fe03561c77dbfc661b023985@neslonek.homeunix.org Whole thread Raw |
| Responses |
Re: Encoding problems with "COMMENT ON DATABASE .." causing
pg_restore (and pg_upgrade) to fail
|
| List | pgsql-bugs |
We found a problem with comments on databases using german umlauts (=C3=A4=
=C3=B6=C3=BC)=20
- or more generally, we found encoding problems when doing "COMMENT ON=20
database foo ..." when foo is UTF-8 encoded and the database you are=20
connected to (e.g. postgres) is SQL_ASCII.
I analyzed this with the help of Andrew Gierth (and others) on IRC - it=20
seems in that constellation you can write an non-UTF-8 comment onto a=20
UTF-8 database.
This leads to the problem, that wen trying to use pg_restore -C with a=20
dump created with -Fc the restore will fail when trying to do the=20
"COMMENT ON DATABASE.."
We stumpled across this because also pg_upgrade fails on this (during=20
the schema part), but doesn't detect this situation with the -c (check)=20
option beforehand.
This is 9.4.5 on Solaris 11 - but we did have that problem already when=20
using pg_upgrade from 8.4 to 9.1 a few years back. I am sorry we didn't=20
analyze further back then, but it was "just a comment". But now it bit=20
us again....
Here is what I did so far to analyze the problem (with instrucions from=20
Andrew Gierth). I hope this makes it clear, otherwise please don't=20
hesitate to request more details.
postgres=3D# create database comment_test template template0 encoding=20
'UTF-8';
CREATE DATABASE
#####
---> use pgadmin3 on Windows to set comment on database coment_test with=20
string "f=C3=BCr", while being connected to postgres (SQL_ASCII encoded)
#####
$ pg_dump -Fc comment_test -f comment_test.pgdump
postgres=3D# alter database comment_test rename to comment_test_orig;
ALTER DATABASE
#####
$ pg_restore -C -d template1 comment_test.pgdump
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1989; 1262 641528=20
COMMENT comment_test postgres
pg_restore: [archiver (db)] could not execute query: ERROR: invalid=20
byte sequence for encoding "UTF8": 0xfc
Command was: COMMENT ON DATABASE comment_test IS 'f=C2=A6r';
###
postgres@sz-pp-pg02-dev[~] $ pg_restore comment_test.pgdump
--
-- PostgreSQL database dump
--
SET statement_timeout =3D 0;
SET lock_timeout =3D 0;
SET client_encoding =3D 'UTF8';
SET standard_conforming_strings =3D on;
SET check_function_bodies =3D false;
SET client_min_messages =3D warning;
--
-- Name: comment_test; Type: COMMENT; Schema: -; Owner: postgres
--
COMMENT ON DATABASE comment_test IS 'f=C2=A6r';
--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
--
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--
COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';
--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--
REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;
--
-- PostgreSQL database dump complete
--
#######
$ pg_restore comment_test.pgdump | head -16 | tail -1 |xxd
0000000: 434f 4d4d 454e 5420 4f4e 2044 4154 4142 COMMENT ON DATAB
0000010: 4153 4520 636f 6d6d 656e 745f 7465 7374 ASE comment_test
0000020: 2049 5320 2766 fc72 273b 0a IS 'f.r';.
######
postgres=3D# select=20
convert_to(shobj_description(oid,'pg_database'),'SQL_ASCII') from=20
pg_database where datname=3D'comment_test_orig';
convert_to
------------
\x66fc72
(1 row)
######
postgres=3D# \l
List of databases
Name | Owner | Encoding | Collate | =20
Ctype | Access privileges
--------------------+-----------+-----------+------------------+---------=
---------+-----------------------
comment_test | postgres | UTF8 | C | C =
=20
|
[...]
postgres | postgres | SQL_ASCII | C | C =
=20
|
template0 | postgres | SQL_ASCII | C | C =
=20
| =3Dc/postgres +
| | | | =
=20
| postgres=3DCTc/postgres
template1 | postgres | SQL_ASCII | C | C =
=20
| postgres=3DCTc/postgres+
| | | | =
=20
| =3Dc/postgres
pgsql-bugs by date: