Thread: pg_dump behaves differently for different archive formats

pg_dump behaves differently for different archive formats

From

Albe Laurenz

Date:

16 December 2013, 11:30:30

Restoring a "plain format" dump and a "custom format" dump of
the same database can lead to different results:

pg_dump organizes the SQL statements it creates in "TOC entries".
If a custom format dump is restored with pg_restore, all
SQL statements in a TOC entry will be executed as a single command
and thus in a single transaction.
On the other hand, each SQL statement in a plain format dump
is executed individually in its own transaction, and TOC entries
are irrelevant (except as comments for documentation).

E.g., if a table has ACL entries for several roles and one of
them is not present in the destination database, a plain format
dump will restore all privileges except the ones that pertain
to the missing user, while a custom format dump will not restore
any privileges even for existing users.
This is because all ACL related statements are in one TOC entry.

Another example is a table that you try to restore into a database
where the original table owner does not exist.
With a plain format dump, the table is created, but will belong
to the user restoring the dump, while a custom format dump will
not create the table at all.
This is because CREATE TABLE and ALTER TABLE ... OWNER TO
are in the same TOC entry.

One can argue for or against each individual behaviour, but I
am surprised by the difference.

Is there a deeper reason why it should remain like this or should
I consider it a bug that should get fixed?

Yours,
Laurenz Albe

Re: pg_dump behaves differently for different archive formats

From

Tom Lane

Date:

16 December 2013, 13:17:00

Albe Laurenz <laurenz.albe@wien.gv.at> writes:
> Restoring a "plain format" dump and a "custom format" dump of
> the same database can lead to different results:

> pg_dump organizes the SQL statements it creates in "TOC entries".
> If a custom format dump is restored with pg_restore, all
> SQL statements in a TOC entry will be executed as a single command
> and thus in a single transaction.

Yeah, this is a bug I think.  pg_dump was designed around the idea
that the output would be executed as a simple script, and in a
number of places there's an expectation that one SQL statement
can fail without affecting following ones.  So if pg_restore can't
provide that behavior it's not good.

On the other hand, I'm not sure how much enthusiasm there'd be for
complex or fragile changes to fix this.  A lot of people invariably
run restores in single-transaction mode and don't really care about
fault-tolerant restores.  Also, it's easy enough to dodge the problem
if you must: just pipe the output into psql rather than
direct-to-database.

So to me the question is can we fix this without doing something like
duplicating psql's lexer?  If we have to parse out the statements
contained in each text blob, it's probably going to be too painful.
Some cautionary history about this sort of thing can be read at
http://www.postgresql.org/message-id/flat/18006.1325700782@sss.pgh.pa.us

            regards, tom lane