Thread: pg_upgrade --jobs
The pg_upgrade --jobs option is not passed as an argument when it calls pg_dump. I haven't found anything in the docs or forums mentioning a reason for not supporting it under certain circumstances, other than possibly for pre-9.2. The pg_upgrade docs page states that it allows multiple CPUs to be used for dump and reload of schemas. Some databases I'm upgrading have 500,000+ tables, and running with a single process is greatly increasing the upgrade time.

I am also using the --link option.
I have tried "--jobs 20", "--jobs=20", placing this option first and last, and many other variations.
I am upgrading 9.2.4 to 9.6.12 on CentOS 6.
Varying hardware, but all with 32+ CPU cores.

su - postgres -c "/usr/pgsql-9.6/bin/pg_upgrade --jobs=20 --link \
    --old-bindir=/usr/pgsql-9.2/bin/ \
    --new-bindir=/usr/pgsql-9.6/bin/ \
    --old-datadir=/var/lib/pgsql/9.2/data/ \
    --new-datadir=/var/lib/pgsql/9.6/data/"

I feel like there's a simple reason I've missed, but this seems pretty straightforward. A secondary plan would be to find instructions for doing the same as "pg_upgrade --link" manually so I can run "pg_dump --jobs 20".

Any assist is appreciated.

Thanks,
S. Cervesa
On 4/6/19 11:44 AM, senor wrote:
> The pg_upgrade --jobs option is not passed as an argument when it calls pg_dump.
> [snip]
> I feel like there's a simple reason I've missed, but this seems pretty straightforward.

https://www.postgresql.org/docs/9.6/app-pgdump.html

"--jobs=njobs

    Run the dump in parallel by dumping njobs tables simultaneously. This option reduces the time of the dump but it also increases the load on the database server. You can only use this option with the directory output format because this is the only output format where multiple processes can write their data at the same time."

> A secondary plan would be to find instructions for doing the same as "pg_upgrade --link" manually so I can run "pg_dump --jobs 20".

--
Adrian Klaver
adrian.klaver@aklaver.com
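(For illustration of what that restriction means in practice, a parallel pg_dump has to target a directory archive. The database names and output path below are made up.)

# A parallel dump requires the directory archive format; pg_restore can
# then also reload it in parallel.
pg_dump --format=directory --jobs=20 --file=/backups/mydb.dir mydb
pg_restore --jobs=20 --dbname=mydb_new /backups/mydb.dir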
Thank you for responding. I did see that note and should have included it as part of my question.

Since pg_upgrade is in control of how it is calling pg_dump, is there a reason pg_upgrade cannot use the directory output format when calling pg_dump? Is the schema-only operation incompatible?

________________________________________
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Saturday, April 6, 2019 1:52 PM
To: senor; pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

> [snip]
senor <frio_cervesa@hotmail.com> writes:
> Since pg_upgrade is in control of how it is calling pg_dump, is there a reason pg_upgrade cannot use the directory output format when calling pg_dump? Is the schema-only operation incompatible?

Well, there's no point in it.  pg_dump can only parallelize data dumping, and there's none to be done in the --schema-only case that pg_upgrade uses.

Also, since pg_upgrade *does* use parallelism across multiple pg_dump calls (if you've got multiple databases in the cluster), it'd be a bit problematic to have another layer of parallelism below that, if it did indeed do anything.  You don't want "--jobs=10" to suddenly turn into 100 sessions.

                        regards, tom lane
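(To make the cross-database parallelism concrete, the dump phase of pg_upgrade amounts to roughly the following simplified sketch. It is not the actual implementation: the real tool names the dump files by database OID, caps the concurrency at --jobs, and dumps globals separately with pg_dumpall --globals-only. The database-list query and file names here are only illustrative.)

# Simplified sketch: one schema-only, binary-upgrade, custom-format dump
# per database, run in parallel across databases (not across tables).
for db in $(psql -At -c "SELECT datname FROM pg_database WHERE datallowconn" postgres); do
    pg_dump --schema-only --binary-upgrade --format=custom \
            --file="pg_upgrade_dump_${db}.custom" "$db" &
done
wait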
Thanks Tom. I suppose "pg_dump can only parallelize data dumping" answers my original question as "expected behavior", but I would like to understand the reason better.

My knowledge of Postgres and other DBMSs is at casual admin level, with the occasional deep dive on specific errors or analysis. I'm not averse to getting into the code. Before my OP I searched for reasons that the schema-only option would prevent pg_dump from being able to run multiple jobs and didn't find anything that I understood to confirm either way. Is the limitation simply the state of development to date, or is there something about dumping the schemas that conflicts with parallelizing? I'm willing to do some studying if provided links to relevant articles.

The --link option to pg_upgrade would be so much more useful if it weren't still bound to serially dumping the schemas of half a million tables. As already mentioned, if there is an alternate process that mimics pg_upgrade but allows for parallelizing, I'm open to that.

Thanks all

________________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, April 6, 2019 3:02 PM
To: senor
Cc: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

> [snip]
senor <frio_cervesa@hotmail.com> writes:
> Is the limitation simply the state of development to date, or is there
> something about dumping the schemas that conflicts with parallelizing?

At minimum, it'd take a complete redesign of pg_dump's output format, and I'm not even very sure what such a redesign would look like.  All the schema information goes into a single file that has to be written serially.  Trying to make it be one file per table definition wouldn't really fix much: somewhere there has to be a "table of contents", plus where are you going to put the dependency info that shows what ordering is required for restore?

> The --link option to pg_upgrade would be so much more useful if it
> weren't still bound to serially dumping the schemas of half a million
> tables.

To be perfectly blunt, if you've got a database with half a million tables, You're Doing It Wrong.

                        regards, tom lane
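(That single, serially written "table of contents" is easy to see for yourself: even a directory-format archive keeps all schema information in one toc.dat file, with only table data split across parallel-writable files. The path and database name below are hypothetical.)

pg_dump --format=directory --jobs=4 --file=/tmp/mydb.dir mydb
ls /tmp/mydb.dir                         # toc.dat plus one data file per table
pg_restore --list /tmp/mydb.dir | head   # the TOC entries, in restore order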
On 4/6/19 6:50 PM, Tom Lane wrote:
> senor <frio_cervesa@hotmail.com> writes:
>> [snip]
>> The --link option to pg_upgrade would be so much more useful if it
>> weren't still bound to serially dumping the schemas of half a million
>> tables.
>
> To be perfectly blunt, if you've got a database with half a million
> tables, You're Doing It Wrong.

Heavy (really heavy) partitioning?
--
Angular momentum makes the world go 'round.
Thanks Tom for the explanation. I assumed it was my ignorance of how the schema was handled that was making this look like a problem that had already been solved and that I was missing something.

I fully expected the "You're Doing It Wrong" part. That is out of my control but not beyond my influence.

I suspect I know the answer to this but have to ask. Using a simplified example where there are 100K sets of 4 tables, each set representing the output of a single job, are there any shortcuts to upgrading that would circumvent exporting the entire schema? I'm sure a different DB design would be better, but that's not what I'm working with.

Thanks

________________________________________
From: Ron <ronljohnsonjr@gmail.com>
Sent: Saturday, April 6, 2019 4:57 PM
To: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

> [snip]
> are there any shortcuts to upgrading that would circumvent exporting the entire schema?
By "shortcuts," do you mean you want to minimize the time and energy you put into the upgrade, or that you want to minimize database downtime? If you mean downtime, I was able to upgrade a customer-facing database with ~350,000 tables from Postgres 9.0 to 9.6 last year with only 86 seconds of downtime, using Slony, but I had to make many custom modifications to Slony and test thoroughly beforehand, and it was not for the faint of heart, the pressed for time, or the inexperienced. There may be better ways (and if so, I would be curious to learn about them), but Slony was the tool with which I was most familiar at the time.
This method does, of course, require exporting the entire schema, but because our only constraint was to minimize customer downtime, and the database was online while the schema was being exported, we didn't care how long it took. Your constraints may be different.
For those reading: we do know that 350,000 tables is Doing It Wrong, and we're getting rid of them, but we decided being on an EOLed version of Postgres was worse and should be fixed first.
Sherrylyn
On 4/6/19 5:47 PM, senor wrote:
> [snip]
> I suspect I know the answer to this but have to ask. Using a simplified example where there are 100K sets of 4 tables, each set representing the output of a single job, are there any shortcuts to upgrading that would circumvent exporting the entire schema? I'm sure a different DB design would be better, but that's not what I'm working with.

An answer is going to depend on more information:

1) What is the time frame for moving from one version to another? Both the setup and the actual downtime.

2) There are 500,000+ tables, but what is the amount of data involved?

3) Are all the tables active?

4) How are the tables distributed across databases in the cluster and schemas in each database?

--
Adrian Klaver
adrian.klaver@aklaver.com
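(For questions 2 and 4, a quick inventory query run against each database in the cluster gives table counts and on-disk size per schema. This is only a sketch; the database name is a placeholder.)

psql -d mydb -c "
  SELECT n.nspname AS schema,
         count(*) AS tables,
         pg_size_pretty(sum(pg_total_relation_size(c.oid))::bigint) AS total_size
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relkind = 'r'
  GROUP BY n.nspname
  ORDER BY count(*) DESC;"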
Thank you Adrian. I'm not sure if I can provide as much as you'd need for a definite answer but I'll give you what I have.

The original scheduled downtime for one installation was 24 hours. By 21 hours it had not completed the schema-only pg_dump, so it was returned to operation.

The amount of data per table is widely varied. Some daily tables are 100-200GB, and thousands of report tables with stats are much smaller. I'm not connected to check now but I'd guess 1GB max. We chose to use the --link option partly because some servers do not have the disk space to copy. The time necessary to copy 1-2TB was also going to be an issue.

The vast majority of activity is on current-day inserts and stats reports of that data. All previous days and existing reports are read only.

As is all too common, the DB usage grew with no redesign, so it is a single database on a single machine with a single schema.

I get the impression there may be an option of getting the schema dump while in service but possibly not in this scenario. Plan B is to drop a lot of tables and deal with imports later.

I appreciate the help.

________________________________________
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Sunday, April 7, 2019 8:19 AM
To: senor; pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

> [snip]
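(If "Plan B" comes to that, one way to do it is to generate the DROP statements from the catalogs, review the script, and only then run it. The database name and the table-name pattern below are made-up placeholders.)

# Generate (but do not yet run) DROP statements for old report tables.
psql -At -d mydb -c "
  SELECT format('DROP TABLE %I.%I;', schemaname, tablename)
  FROM pg_tables
  WHERE schemaname = 'public'
    AND tablename LIKE 'report\_%'   -- placeholder pattern; adjust to the real naming scheme
  " > drop_old_reports.sql
# Review drop_old_reports.sql, then: psql -d mydb -f drop_old_reports.sql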
I just noticed I missed Sherrylyn's post.

I did some reading about Slony and believe it would be useful if I had the time to dig in. As pointed out, it's not an out-of-the-box solution. It is included on the TODO list though. For now I can only dream of the 86-second downtime.

Thanks

________________________________________
From: Sherrylyn Branchaw <sbranchaw@gmail.com>
Sent: Sunday, April 7, 2019 6:43 AM
To: senor
Cc: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

> [snip]
On 4/7/19 12:05 PM, senor wrote:
> Thank you Adrian. I'm not sure if I can provide as much as you'd need for a definite answer but I'll give you what I have.
>
> The original scheduled downtime for one installation was 24 hours. By 21 hours it had not completed the schema-only pg_dump, so it was returned to operation.

So this is more than one cluster?
I am assuming the below was repeated at different sites?

> The amount of data per table is widely varied. Some daily tables are 100-200GB, and thousands of report tables with stats are much smaller. I'm not connected to check now but I'd guess 1GB max. We chose to use the --link option partly because some servers do not have the disk space to copy. The time necessary to copy 1-2TB was also going to be an issue.
> The vast majority of activity is on current-day inserts and stats reports of that data. All previous days and existing reports are read only.
> As is all too common, the DB usage grew with no redesign, so it is a single database on a single machine with a single schema.
> I get the impression there may be an option of getting the schema dump while in service but possibly not in this scenario. Plan B is to drop a lot of tables and deal with imports later.

I take the above to mean that a lot of the tables are cruft, correct?

> I appreciate the help.

--
Adrian Klaver
adrian.klaver@aklaver.com
> The original scheduled downtime for one installation was 24 hours. By 21 hours it had not completed the schema-only pg_dump, so it was returned to operation.
To me, your best option is to create a slony cluster with the version you need to upgrade to.
When slony is in sync, simply make it the master and switch to it. It may take a while for
slony replication to be in sync, but when it is, there will be very little down time to switch
over.
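For anyone who hasn't used Slony before, the rough shape of that setup is a slonik script along the following lines. Everything here is a placeholder (cluster name, conninfo strings, the single example table), the hundreds of thousands of "set add table" lines would have to be generated programmatically, and the exact slonik syntax should be checked against the Slony-I documentation for your version; this is only a sketch of the idea.

slonik <<'_EOF_'
cluster name = upgrade_cluster;

-- how slonik reaches each node (placeholder conninfo strings)
node 1 admin conninfo = 'dbname=mydb host=old_server user=slony';
node 2 admin conninfo = 'dbname=mydb host=new_server user=slony';

init cluster (id = 1, comment = 'current 9.2 origin');

create set (id = 1, origin = 1, comment = 'all replicated tables');
set add table (set id = 1, origin = 1, id = 1,
               fully qualified name = 'public.some_table', comment = 'example table');
-- ...one "set add table" per table, generated from the catalogs...

store node (id = 2, comment = 'new 9.6 subscriber', event node = 1);
store path (server = 1, client = 2, conninfo = 'dbname=mydb host=old_server user=slony');
store path (server = 2, client = 1, conninfo = 'dbname=mydb host=new_server user=slony');

subscribe set (id = 1, provider = 1, receiver = 2, forward = no);

-- once node 2 is in sync, the switchover itself is roughly:
--   lock set (id = 1, origin = 1);
--   move set (id = 1, old origin = 1, new origin = 2);
_EOF_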
On Sun, Apr 7, 2019 at 3:36 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> [snip]
--
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!
> It may take a while for slony replication to be in sync, but when it is, there will be very little down time to switch over.
I agree in principle, which is why I chose Slony over pg_upgrade for my company's very similar situation, but my experience was that, out of the box, Slony was projected to take unacceptably long (months) to sync our 350,000 tables, and downtime was going to be many hours. In order to get those numbers down, I had to reduce the number of per-table steps Slony was executing, e.g. by eliminating unnecessary-for-us ones and by rewriting others to happen in bulk.
Here's something I didn't know Slony did when I didn't have 350,000 tables to worry about: add a table to replication, run through the existing list of replicated tables to make sure it's captured any changes that have happened in the meantime, add one more table to replication, run through the existing list of tables to make sure no changes have happened, and so on. The more tables you add, the longer it takes to add the next table. Here's another thing I didn't know it did: during the switchover, manage 4 triggers per table serially on primary and standby. 4 * 350000 * 2 = 2.8 million triggers. (I knew it managed 4 triggers per table, but not that it was serial or how long that would take when pg_trigger had almost 3 million relevant rows.)
I would love to help the OP out in a more hands-on way (I have upgrade-via-Slony consulting experience), as well as to open source the custom changes I came up with, but I'm debating whether I have the time to offer to take on another project right now. I'm also reluctant to summarize exactly what I did, because messing with pg_catalog directly is very delicate and likely to go wrong, and I don't recommend it to the inexperienced all, or to the experienced if they have alternatives.
> Plan B is to drop a lot of tables and deal with imports later.
If it were me, I would take a copy of my database, restore it to a sandbox environment, set up Slony, and get an estimate for the projected sync time. Let it run for a few hours to see how dramatically the sync time slows down as the number of tables grows. I.e. don't count the number of replicated tables after an hour and assume that's your hourly rate of syncing. If it looks like you can sync your entire database in a few days or less, then let it run and test out your failover time. If that's acceptable, you're good to go.
If sync time looks like it's going to be incredibly long on your schema + hardware + Postgres version, etc., then failover time probably would be too. In that case, temporarily dropping the tables you can drop may be preferable to the complexity of making changes to speed up Slony, if you don't have a seriously experienced DBA on hand.
Sherrylyn