Re: Proposal: More flexible backup/restore via pg_dump - Mailing list pgsql-hackers

From: Bruce Momjian
Subject: Re: Proposal: More flexible backup/restore via pg_dump
Date:
Msg-id: 200010110049.UAA18367@candle.pha.pa.us
In response to: Re: Proposal: More flexible backup/restore via pg_dump (Philip Warner <pjw@rhyme.com.au>)
Responses: Re: Proposal: More flexible backup/restore via pg_dump
List: pgsql-hackers
Can I ask for a status update?

> At 10:17 26/06/00 +0200, Zeugswetter Andreas SB wrote:
> >
> > A problem I see with an index at file end is that you will need to read
> > the file twice, and that may be very undesirable if e.g. the backup is
> > on tape or a compressed file.
>
> The proposal has actually come a fairly long way after extensive
> discussions with Tom Lane, and I have added the current plans at the end
> of this message. The TOC-at-end problem is an issue that I am trying to
> deal with; I am planning a 'custom' format that has the large parts (data
> dumps) compressed, to avoid the need of compressing the entire file. This
> means that you would not need to uncompress the entire file to get to the
> TOC, or to restore just the schema. It also allows good random access to
> defns and data. I'm also considering putting the dumped data at the end
> of the file, but this has issues when you want to restore table data
> before defining indexes, for example.
>
> I must admit that I've been working on the assumption that people using
> PostgreSQL don't have multi-GB (compressed) database dumps, so that (in
> theory) a restore can be loaded onto disk from tape before being used. I
> know this is pretty evil, but it will cover 95% of users. Those people
> with huge backups will have to suffer tapes that go backward and forward
> a bit. From the details below, you will see that this is unavoidable.
>
> Sanity check: does fseek work on tapes? If not, what is the correct way
> to read a particular block/byte from a file on a tape?
>
> -----------------------------------------------------------
> Updated Proposal
> ----------------
>
> For the sake of argument, call the new utilities pg_backup and
> pg_restore.
>
> pg_backup
> ---------
>
> Dump schema [and data] in OID order (to try to make restores sequential,
> for when tar/tape storage is used).
> Each dumped item has a TOC entry which includes the OID and description,
> and for those items for which we know some dependencies (functions for
> types & aggregates; types for tables; superclasses for classes; any
> more?), it will also dump the dependency OIDs.
>
> Each object (table defn, table data, function defn, type defn etc.) is
> dumped to a separate file/thing in the output file. The TOC entries go
> into a separate file/thing (probably only one file/thing for the whole
> TOC).
>
> The output scheme will be encapsulated, and in the initial version will
> be a custom format (since I can't see an API for tar files), and a
> dump-to-a-directory format. Future use of tar, DB, PostgreSQL or even a
> Makefile should not be excluded in the IO design. This last goal *may*
> not be achieved, but I don't see why it can't be at this stage. Hopefully
> someone with appropriate skills & motivation can do a tar archive 8-}.
>
> The result of a pg_backup should be a single file with metadata and
> optional data, along with whatever dependency and extra data is available
> to pg_backup, or provided by the DBA.
>
> pg_restore
> ----------
>
> Reads a backup file and dumps SQL suitable for sending to psql.
>
> Options will include:
>
> - No data (--no-data? -nd? -s?)
> - No metadata (--no-schema? -ns? -d?)
> - Specification of items to dump from an input file; this allows custom
>   ordering AND custom selection of multiple items. Basically, I will
>   allow the user to dump part of the TOC, edit it, and tell pg_restore
>   to use the edited partial TOC. (--item-list=<file>? -l=<file>?)
> - Dump TOC (--toc-only? -c?)
>
> [Wish list]
> - Data for a single table (--table=<name>? -t=<name>?)
> - Defn/data for a single OID (--oid=<oid>? -o=<oid>?)
> - User-defined dependencies. Allow the DB developer to specify once for
>   their DB what the dependencies are, then use that file as a guide to
>   the restore process.
>   (--deps=<file>? -D=<file>?)
>
> pg_restore will use the same custom IO routines to allow IO to
> tar/directory/custom files. In the first pass, I will do custom file IO.
>
> If a user selects to restore the entire metadata, then it will be dumped
> according to the default policy (OID order). If they select to specify
> the items from an input file, then the file ordering is used.
>
> -------
>
> Typical backup procedure:
>
> pg_backup mydb mydb.bkp
>
> or *maybe*
>
> pg_backup mydb > mydb.bkp
>
> BUT AFAIK, fseek does not work on STDOUT, and at the current time
> pg_backup will use fseek.
>
> Typical restore procedure:
>
> pg_restore mydb mydb.bkp | psql
>
> A user will be able to extract only the schema (-s), only the data (-d),
> a specific table (-t=name), or even edit the object order and selection
> via:
>
> pg_restore --dump-toc mydb.bkp > mytoc.txt
> vi mytoc.txt  {i.e. reorder TOC elements as per known dependency problems}
> pg_restore --item-list=mytoc.txt mydb.bkp | psql
>
> FWIW, I envisage the --dump-toc output to look like:
>
> ID; FUNCTION FRED(INT4)
> ID; TYPE MY_TYPE
> ID; TABLE MY_TABLE
> ID; DATA MY_TABLE
> ID; INDEX MY_TABLE_IX1
> ...etc.
>
> so editing and reordering the dump plan should not be too onerous.
>
> ----------------------------------------------------------------
> Philip Warner                    |     __---_____
> Albatross Consulting Pty. Ltd.   |----/       -  \
> (A.C.N. 008 659 498)             |          /(@)   ______---_
> Tel: (+61) 0500 83 82 81         |                 _________  \
> Fax: (+61) 0500 83 82 82         |     ___________ |
> Http://www.rhyme.com.au          |                /           \|
>                                  |    --________--
> PGP key available upon request,  |  /
> and from pgp5.ai.mit.edu:11371   |/

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026