Re: pg_dump additional options for performance - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: pg_dump additional options for performance
Date
Msg-id 1204027540.4252.251.camel@ebony.site
In response to Re: pg_dump additional options for performance  (Dimitri Fontaine <dfontaine@hi-media.com>)
Responses Re: pg_dump additional options for performance
Re: pg_dump additional options for performance
List pgsql-hackers
On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> On Tuesday 26 February 2008, Simon Riggs wrote:
> > So that would mean we would run an unload like this
> >
> > pg_dump --pre-schema-file=f1 --save-snapshot --snapshot-id=X
> > pg_dump -t bigtable --data-file=f2.1 --snapshot-id=X
> > pg_dump -t bigtable2 --data-file=f2.2 --snapshot-id=X
> > pg_dump -T bigtable -T bigtable2 --data-file=f2.3 --snapshot-id=X
> 
> As a user I'd really prefer all of this to be much more transparent, and could 
> well imagine the -Fc format to be some kind of TOC + zip of table data + post 
> load instructions (organized per table), or something like this.
> In fact just what you described, all embedded in a single file.

If it's in a single file then it won't perform as well as if it's in
separate files. We can put separate files on separate drives. We can begin
reloading one table while another is still unloading. The OS will
perform readahead for us on separate files, whereas on one file it will
look like random I/O, and so on.

I'm not proposing we change things to use separate files in all cases.
Just when you want to use separate files, you can.
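
For illustration only, here is a rough sketch of how the unload sequence
quoted at the top of this message could be generated for any list of big
tables. It assumes the options proposed in this thread (--pre-schema-file,
--save-snapshot, --snapshot-id, --data-file), which are not existing pg_dump
options, so it only builds the command lines and does not run them. The
function name and its parameters are made up for the example.

```python
def build_unload_commands(big_tables, snapshot_id,
                          schema_file="f1", data_prefix="f2"):
    """Return the pg_dump command lines for a split unload, as argument lists.

    NOTE: --pre-schema-file, --save-snapshot, --snapshot-id and --data-file
    are the options proposed in this thread, not existing pg_dump options.
    Nothing is executed here.
    """
    snap = f"--snapshot-id={snapshot_id}"

    # First command: write the pre-data schema file and save the snapshot.
    commands = [
        ["pg_dump", f"--pre-schema-file={schema_file}", "--save-snapshot", snap]
    ]

    # One command per big table, each writing to its own data file.
    for n, table in enumerate(big_tables, start=1):
        commands.append(
            ["pg_dump", "-t", table, f"--data-file={data_prefix}.{n}", snap]
        )

    # Last command: everything else, with the big tables excluded.
    rest = ["pg_dump"]
    for table in big_tables:
        rest += ["-T", table]
    rest += [f"--data-file={data_prefix}.{len(big_tables) + 1}", snap]
    commands.append(rest)

    return commands


if __name__ == "__main__":
    for cmd in build_unload_commands(["bigtable", "bigtable2"], "X"):
        print(" ".join(cmd))
```

Each data command after the first is independent of the others once the
snapshot exists, which is what would let them run at the same time and
write to different drives.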

> And I'd much prefer it if this (new?) format was trustworthy enough to be the 
> new default format of -Fc dumps. Then we could add some *simple* command line 
> parameter to control the threading behavior of the dump and reload process, 
> ala make -j. We could even support some option for the user to tell us which 
> disk arrays to use for parallel dumping.
> 
>  pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
>  pg_restore -j4 ... mydb.dump

I like the -j syntax.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com 


