Re: Scriptable way to validate a pg_dump restore ? - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: Scriptable way to validate a pg_dump restore ? |
Date | |
Msg-id | e0fa9915-9356-45e6-891f-d24b6a274e7e@aklaver.com Whole thread Raw |
In response to | Scriptable way to validate a pg_dump restore ? (Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch>) |
Responses |
Re: Scriptable way to validate a pg_dump restore ?
|
List | pgsql-general |
On 1/29/24 00:12, Laura Smith wrote: > Hi > > Let's say I've got a scenario where I'm doing a pg_dump replication rather than online streaming, e.g. due to air-gap orwhatever. > > Is there a scriptable way to validate the restore ? e.g. using doing something clever with ctid or something to ensureboth the schema and all its rows were restored to the same point at which the dump was taken ? Assuming you are using pg_restore on a non-plain text dump file then from pg_restore.c: * pg_restore.c * pg_restore is an utility extracting postgres database definitions * from a backup archive created by pg_dump using the archiver * interface. * * pg_restore will read the backup archive and * dump out a script that reproduces * the schema of the database in terms of * user-defined types * user-defined functions * tables * indexes * aggregates * operators * ACL - grant/revoke * * the output script is SQL that is understood by PostgreSQL * * Basic process in a restore operation is: * * Open the Archive and read the TOC. * Set flags in TOC entries, and *maybe* reorder them. * Generate script to stdout * Exit Then: pg_restore -l -f <output_file> <dump_file> to get the TOC mentioned above. Walk through that to verify schema is the same in the restored database. This will not tell you whether all the data was transferred. You will either have to trust from pg_dump.c: * pg_dump will read the system catalogs in a database and dump out a * script that reproduces the schema in terms of SQL that is understood * by PostgreSQL * * Note that pg_dump runs in a transaction-snapshot mode transaction, * so it sees a consistent snapshot of the database including system * catalogs. However, it relies in part on various specialized backend * functions like pg_get_indexdef(), and those things tend to look at * the currently committed state. So it is possible to get 'cache * lookup failed' error if someone performs DDL changes while a dump is * happening. The window for this sort of thing is from the acquisition * of the transaction snapshot to getSchemaData() (when pg_dump acquires * AccessShareLock on every table it intends to dump). It isn't very large, * but it can happen. Or come up with way to capture the state of the data at the time of dump and then compare to restored database. Something like Ron posted. > > Thanks ! > > Laura > > -- Adrian Klaver adrian.klaver@aklaver.com
pgsql-general by date: