Thread: 15 pg_upgrade with -j
Hello!
We are moving from 10 to 15 and are in testing now.
Our development database is about 1400G and takes 12 minutes to complete a pg_upgrade with the -k (hard links) option. This is on a CentOS 7 server with 80 cores.
Adding -j 40 to use half of those cores also finishes in 12 minutes, and ps / top / htop never show more than a single process in use at a time.
Bumping that to -j 80 to use them all also finishes in 12 minutes and still only a single process.
Running the suggested vacuum analyze after pg_upgrade completes takes about 19 minutes. Adding -j 40 takes that time down to around 5 minutes, jumps the server load up over 30 and htop shows 40 processes.
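(The analyze step is the vacuumdb run that pg_upgrade suggests on completion; with the parallel flag added, it looks roughly like this -- exact flags approximate:

time /usr/pgsql-15/bin/vacuumdb --all --analyze-in-stages -j 40
)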
If -j 40 helps there--why not with pg_upgrade?
The full commands we are using for pg_upgrade are pretty stock:
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
Our production database is closer to 1900G. If we're looking at a 30-minute pg_upgrade window we'll be okay, but if there is anything we can do to knock that time down, we will -- any suggestions would be greatly appreciated.
Jeff Ross
On 5/22/23 16:20, Jeff Ross wrote:
> Our development database is about 1400G and takes 12 minutes to complete
> a pg_upgrade with the -k (hard links) option. This is on a CentOS 7
> server with 80 cores.
>
> Adding -j 40 to use half of those cores also finishes in 12 minutes and
> ps / top / htop never show more than a single process at a time in use.
>
> Bumping that to -j 80 to use them all also finishes in 12 minutes and
> still only a single process.
>
> If -j 40 helps there--why not with pg_upgrade?

From the docs:

https://www.postgresql.org/docs/current/pgupgrade.html

"The --jobs option allows multiple CPU cores to be used for copying/linking of files and to dump and restore database schemas in parallel; a good place to start is the maximum of the number of CPU cores and tablespaces. This option can dramatically reduce the time to upgrade a multi-database server running on a multiprocessor machine."

So is the 1400G mostly in one database in the cluster?

--
Adrian Klaver
adrian.klaver@aklaver.com
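(Per-database sizes are easy to check with the standard catalog functions; an illustrative query:

psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"
)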
On 5/22/23 5:24 PM, Adrian Klaver wrote:
> The --jobs option allows multiple CPU cores to be used for
> copying/linking of files and to dump and restore database schemas in
> parallel; a good place to start is the maximum of the number of CPU
> cores and tablespaces.
>
> So is the 1400G mostly in one database in the cluster?

Yes, one big database with about 80 schemas and several other smaller databases, so -j should help, right?

Jeff
Jeff Ross <jross@openvistas.net> writes:
> On 5/22/23 5:24 PM, Adrian Klaver wrote:
>> So is the 1400G mostly in one database in the cluster?

> Yes, one big database with about 80 schemas and several other smaller
> databases so -j should help, right?

AFAICT from a quick look at the code, you won't get any meaningful parallelism unless you have several large DBs and/or several large tablespaces. It looks like the assumption was that issuing link() requests in parallel wouldn't help much but just swamp your disk if they're all on the same filesystem. Maybe that could use rethinking, not sure.

			regards, tom lane
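(To see how a given cluster splits across tablespaces, an illustrative check:

psql -c "SELECT spcname, pg_size_pretty(pg_tablespace_size(spcname)) FROM pg_tablespace;"
)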
On 5/22/23 16:29, Jeff Ross wrote:
> On 5/22/23 5:24 PM, Adrian Klaver wrote:
>> So is the 1400G mostly in one database in the cluster?
>
> Yes, one big database with about 80 schemas and several other smaller
> databases so -j should help, right?

As I understand it, no. The parallelism is between databases, not within a database. Further, 'database schemas' there refers to schema in the sense of the overall database object definitions, not the namespaces known as schemas within a database.

--
Adrian Klaver
adrian.klaver@aklaver.com
On 5/22/23 18:42, Tom Lane wrote:
> AFAICT from a quick look at the code, you won't get any meaningful
> parallelism unless you have several large DBs and/or several large
> tablespaces.

Hmm. I'm glad I'm reading this now.

> It looks like the assumption was that issuing link()
> requests in parallel wouldn't help much but just swamp your disk
> if they're all on the same filesystem.
> Maybe that could use rethinking, not sure.

It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is going to be on a different LUN from /var/lib/pgsql/9.6, just like /var/lib/pgsql/backups is on a different LUN.

--
Born in Arizona, moved to Babylonia.
On 5/22/23 5:42 PM, Tom Lane wrote:
> AFAICT from a quick look at the code, you won't get any meaningful
> parallelism unless you have several large DBs and/or several large
> tablespaces. It looks like the assumption was that issuing link()
> requests in parallel wouldn't help much but just swamp your disk
> if they're all on the same filesystem. Maybe that could use
> rethinking, not sure.

Thanks, Tom. These are all smokingly fast SSDs, so it would be interesting to see how well they'd hold up under that load.

Jeff
On 5/22/23 5:43 PM, Adrian Klaver wrote:
> The --jobs option allows multiple CPU cores to be used for
> copying/linking of files and to dump and restore database schemas in
> parallel; a good place to start is the maximum of the number of CPU
> cores and tablespaces.
>
>> Yes, one big database with about 80 schemas and several other smaller
>> databases so -j should help, right?
>
> As I understand it, no. The parallelism is between databases, not
> within a database.

Thanks, Adrian. That "restore database schemas in parallel" phrase seems really easy to read the way we did, expecting it to work with one database and multiple schemas.

Maybe it should be changed to "restore multiple databases in parallel" instead?

Jeff
On 2023-05-22 21:10:48 -0500, Ron wrote:
> On 5/22/23 18:42, Tom Lane wrote:
>> It looks like the assumption was that issuing link()
>> requests in parallel wouldn't help much but just swamp your disk
>> if they're all on the same filesystem.
>> Maybe that could use rethinking, not sure.
>
> It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
> going to be on a different LUN from /var/lib/pgsql/9.6

You can't hardlink between different file systems. Even if you could assign single directories to specific LUNs (does any file system allow this?), this would at best spread the updates across two LUNs (the inodes would presumably stay on the source LUN and the target directory would be on the target LUN).

hp

--
Peter J. Holzer | hjp@hjp.at | http://www.hjp.at/
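(Easy to demonstrate; the paths here are hypothetical, and the exact message is GNU coreutils':

$ ln /mnt/lun1/somefile /mnt/lun2/somefile
ln: failed to create hard link '/mnt/lun2/somefile' => '/mnt/lun1/somefile': Invalid cross-device link
)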
On 5/23/23 12:19, Peter J. Holzer wrote:
> On 2023-05-22 21:10:48 -0500, Ron wrote:
>> It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is
>> going to be on a different LUN from /var/lib/pgsql/9.6
>
> You can't hardlink between different file systems.

We'd never hardlink. It eliminates the ability to return to the old system if something goes wrong.

--
Born in Arizona, moved to Babylonia.
## Ron (ronljohnsonjr@gmail.com):
> We'd never hardlink. It eliminates the ability to return to the old
> system if something goes wrong.

That's why you get yourself a recent XFS and use clone mode (it still sticks you to the same filesystem, but gets you up and running much faster).

Regards,
Christoph

--
Spare Space
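(Clone mode is pg_upgrade's --clone option. On a reflink-capable filesystem such as XFS created with reflink=1, the earlier command would become something like:

time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up --clone

Like -k, it requires old and new data directories on the same filesystem, but the old cluster is left untouched.)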
On 2023-05-23 13:17:24 -0500, Ron wrote:
> On 5/23/23 12:19, Peter J. Holzer wrote:
>> On 2023-05-22 21:10:48 -0500, Ron wrote:
>>> On 5/22/23 18:42, Tom Lane wrote:
>>>> It looks like the assumption was that issuing link()
>>>> requests in parallel wouldn't help much but just swamp your disk
>>>> if they're all on the same filesystem.
>>
>> You can't hardlink between different file systems.
>
> We'd never hardlink.

But that was what Jeff and Tom were talking about. If you are changing the subject you should at least make it explicit.

hp

--
Peter J. Holzer | hjp@hjp.at | http://www.hjp.at/
On 5/23/23 13:58, Christoph Moench-Tegeder wrote:
> ## Ron (ronljohnsonjr@gmail.com):
>> We'd never hardlink. It eliminates the ability to return to the old
>> system if something goes wrong.
>
> That's why you get yourself a recent XFS and use clone mode (it still
> sticks you to the same filesystem, but gets you up and running much
> faster).

Sadly, our 9.6 systems (the ones we really need to get off of) are RHEL 6.10 and ext4.

--
Born in Arizona, moved to Babylonia.
On 2023-May-23, Ron wrote:
> We'd never hardlink. It eliminates the ability to return to the old
> system if something goes wrong.

If you'd never hardlink, then you should run your test without the -k option. Otherwise, the timings are meaningless.

--
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
"La vida es para el que se aventura" ("Life is for the one who ventures")
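(In other words, time the default copy mode: the same command with -k dropped, e.g.:

time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up
)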