Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers
From: Jeff Janes
Subject: Re: Further pg_upgrade analysis for many tables
Date:
Msg-id: CAMkU=1zLgye1V7fxHcto-Z8eTw4kPSJJi8bxYrAZ10W9KSBOVw@mail.gmail.com
In response to: Re: Further pg_upgrade analysis for many tables (Bruce Momjian <bruce@momjian.us>)
Responses: Re: Further pg_upgrade analysis for many tables
           Re: Further pg_upgrade analysis for many tables
List: pgsql-hackers
On Fri, Nov 9, 2012 at 3:06 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Nov 8, 2012 at 08:59:21PM -0800, Jeff Janes wrote:
>> On Thu, Nov 8, 2012 at 4:33 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> >
>> > I am actually now dumping git head/9.3, so I assume all the problems we
>> > know about should be fixed.
>>
>> Are you sure the server you are dumping out of is head?
>
> Well, I tested again with 9.2 dumping/loading 9.2 and the same for git
> head, and got these results:
>
>                  pg_dump                restore
>     tables     9.2      git          9.2       git
>
>          1    0.13     0.11         0.07      0.07
>       1000    4.37     3.98         4.32      5.28
>       2000   12.98    12.19        13.64     14.25
>       4000   47.85    50.14        61.31     70.97
>       8000  210.39   183.00       302.67    294.20
>      16000  901.53   769.83      1399.25   1359.09

For pg_dump, there are 4 possible combinations, not just two.  You can
use 9.2's pg_dump to dump from a 9.2 server, use git's pg_dump to dump
from a 9.2 server, use git's pg_dump to dump from a git server, or use
9.2's pg_dump to dump from a git server (although that last one isn't
very relevant).

> As you can see, there is very little difference between 9.2 and git
> head, except maybe at the 16k level for pg_dump.
>
> Is there some slowdown with a mismatched version dump/reload?  I am
> attaching my test script.

Sorry, from the script I can't really tell what versions are being
used for what.

>> Using head's pg_dump, but 9.2.1 server, it takes me 179.11 seconds to
>> dump 16,000 tables (schema only) like your example, and it is
>> definitely quadratic.
>
> Are you using a SERIAL column for the tables?  I am, and Peter
> Eisentraut reported that was a big slowdown.

Yes, I'm using the same table definition as your example.

>> But using head's pg_dump to dump tables out of head's server, it only
>> took 24.95 seconds, and the quadratic term is not yet important;
>> things still look linear.
>
> Again, using SERIAL?

Yep.

>> Is the next value, increment, etc. for a sequence stored in a catalog,
>> or are they stored in the 8kb file associated with each sequence?  If
>
> Each sequence is stored in its own 1-row 8k table:
>
>     test=> CREATE SEQUENCE seq;
>     CREATE SEQUENCE
>
>     test=> SELECT * FROM seq;
>     -[ RECORD 1 ]-+--------------------
>     sequence_name | seq
>     last_value    | 1
>     start_value   | 1
>     increment_by  | 1
>     max_value     | 9223372036854775807
>     min_value     | 1
>     cache_value   | 1
>     log_cnt       | 0
>     is_cycled     | f
>     is_called     | f
>
>> they are stored in the file, then it is a shame that pg_dump goes to
>> the effort of extracting that info if pg_upgrade is just going to
>> overwrite it anyway.
>
> Actually, pg_upgrade needs pg_dump to restore all those sequence values.

I did an experiment where I had pg_dump just output dummy values rather
than hitting the database.  Once pg_upgrade moves the relation files
over, the dummy values disappear and are set back to their originals.
So I think that pg_upgrade depends on pg_dump only in a trivial
way--the sequence values need to be present, but it doesn't matter what
they are.

Cheers,

Jeff
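For reference, a minimal sketch of how a schema like the one being timed
here could be generated, assuming Bruce's example of many tables that
each carry a SERIAL column (the database name, table names, and table
count below are illustrative, not taken from the attached test script):

    # generate N tables, each with a SERIAL column, and load them
    createdb testdb
    for i in $(seq 1 16000); do
        echo "CREATE TABLE t$i (id SERIAL);"
    done | psql -q testdb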
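And a sketch of three of the four pg_dump/server combinations mentioned
above, assuming hypothetical install paths and port numbers for the 9.2
and git-head builds (the fourth combination, 9.2's pg_dump against a git
server, is the one described above as not very relevant):

    # 9.2's pg_dump dumping a 9.2 server
    /opt/pg92/bin/pg_dump   --schema-only -p 5920 testdb > dump_92_from_92.sql
    # git head's pg_dump dumping a 9.2 server
    /opt/pghead/bin/pg_dump --schema-only -p 5920 testdb > dump_head_from_92.sql
    # git head's pg_dump dumping a git head server
    /opt/pghead/bin/pg_dump --schema-only -p 5930 testdb > dump_head_from_head.sql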