Thread: pg_upgrade hangs
Hi all,

My QA folks are doing an upgrade test, migrating a database from 9.2.2 to 9.3.3. By default on CentOS, the databases live at:

/var/lib/pgsql/9.2/data & /var/lib/pgsql/9.3/data

But, they've decided to move these to /data/postgres-9.2/data and /data/postgres-9.3/data and use symlinks back to the original locations, so they don't have to modify the /etc/init.d scripts. Now, pg_upgrade just hangs regardless of whether I use the actual physical locations or the symlinked ones. I haven't yet tried attaching "strace" to the process, but is there any reason to believe that the symlinks might be causing the problem? Or do I need to look at something else?
--
Jay
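One way to rule the symlinks in or out before running pg_upgrade is to confirm that each path resolves to an initialized cluster of the expected major version. The following is a minimal sketch, not part of the original message: the helper name describe_data_dir is hypothetical, and it relies only on the PG_VERSION file that initdb writes at the top of a data directory.

```python
import os


def describe_data_dir(path):
    """Return (resolved_path, version) for a PostgreSQL data directory.

    version is the content of the PG_VERSION file, or None when the file
    cannot be read, which suggests the path is not an initialized cluster.
    """
    resolved = os.path.realpath(path)  # follows every symlink in the path
    version_file = os.path.join(resolved, "PG_VERSION")
    try:
        with open(version_file) as fh:
            version = fh.read().strip()
    except OSError:
        version = None
    return resolved, version


if __name__ == "__main__":
    # Paths taken from the message above.
    for d in ("/var/lib/pgsql/9.2/data", "/var/lib/pgsql/9.3/data"):
        resolved, version = describe_data_dir(d)
        print(f"{d} -> {resolved} (PG_VERSION: {version})")
```

If both the symlinked and the physical path report the same resolved directory and the expected version, the symlinks themselves are less likely to be the cause, and attention can move to the running servers or to pg_upgrade's own logs.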
On Wed, Aug 13, 2014 at 06:20:38PM -0400, John Scalia wrote:
> Hi all,
>
> My QA folks are doing an upgrade test, migrating a database from 9.2.2
> to 9.3.3. By default on CentOS, the databases live at:
>
> /var/lib/pgsql/9.2/data & /var/lib/pgsql/9.3/data
>
> But, they've decided to move these to /data/postgres-9.2/data and
> /data/postgres-9.3/data and use symlinks back to the original
> locations, so they don't have to modify the /etc/init.d scripts.
> Now, pg_upgrade just hangs regardless of whether I use the actual
> physical locations or the symlinked ones. I haven't yet tried
> attaching "strace" to the process, but is there any reason to
> believe that the symlinks might be causing the problem? Or do I need
> to look at something else?

I have no idea. I have never heard of that, but I suggest you figure out why the hang is happening independent of pg_upgrade.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
Well, it's now gotten much worse, and I really do not understand how this is failing. I started off this morning trying to run pg_upgrade with strace in the hope that I could find something hanging up, but pg_upgrade is now failing and exiting very early. Almost immediately after reporting "Checking cluster versions ok", it throws a "SQL command failed" error where it tried to create the temporary table 'info_rels'. The error is that type "info_rels" already exists.

My first thought was that a previous attempt had partially succeeded and left some garbage in the new 9.3 instance. So, I deleted everything in the 9.3 data directory and did a new initdb there, but the problem continues to recur. I have since checked the old 9.2 database, including the template1 portion, but I don't see an info_rels table in any of these. (I checked using "select * from pg_tables;") I'd really like to know why it's failing here. Where is it finding this info_rels table and why can't I find it? Is it deleting it after it spits out this error? And is this table something that gets repeatedly created and dropped during a pg_upgrade?

Very confusing,
Jay

On 8/13/2014 10:48 PM, Bruce Momjian wrote:
> I have no idea. I have never heard of that, but I suggest you figure
> out why the hang is happening independent of pg_upgrade.
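Since the reported error names a type rather than a table, and pg_tables lists only tables, a broader check is to look the name up in pg_type, where every table and view also gets a row-type entry. The sketch below is an illustration under assumptions, not something from the thread: the helper psql_command, the default port, and the database list are hypothetical, and it only builds the psql command lines rather than running them.

```python
import shlex

# Tables and views each own a row type in pg_type, so a name collision
# reported as 'type "info_rels" already exists' can be searched for here
# even when pg_tables shows nothing.
FIND_NAME_SQL = (
    "SELECT n.nspname, t.typname, t.typtype "
    "FROM pg_catalog.pg_type t "
    "JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace "
    "WHERE t.typname = 'info_rels';"
)


def psql_command(database, port=5432):
    """Build the psql argv that runs FIND_NAME_SQL against one database."""
    return ["psql", "-X", "-At", "-p", str(port), "-d", database, "-c", FIND_NAME_SQL]


if __name__ == "__main__":
    # Hypothetical database list; substitute the databases shown by `psql -l`.
    for db in ("template1", "postgres"):
        print(" ".join(shlex.quote(arg) for arg in psql_command(db)))
```

Running the printed commands against both the old and the new cluster would show in which schema, if any, the name is already taken.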
On Thu, Aug 14, 2014 at 09:30:13AM -0400, John Scalia wrote:
> Well, it's now gotten much worse, and I really do not understand how
> this is failing. I started off this morning trying to run pg_upgrade
> with strace in the hope that I could find something hanging up, but
> pg_upgrade is now failing and exiting very early. Almost immediately
> after reporting "Checking cluster versions ok", it throws a SQL
> command failed where it tried to create the temporary table
> 'info_rels'. The error is that type "info_rels" already exists.
>
> My first thought was that a previous attempt had partially succeeded
> and left some garbage in the new 9.3 instance. So, I deleted
> everything in the 9.3 data directory and did a new initdb there, but
> the problem continues to reoccur. I have since checked the old 9.2
> database, including the template1 portion, but I don't see an
> info_rels table in any of these. (I checked using "select * from
> pg_tables;") I'd really like to know why it's failing here. Where is
> it finding this info_rels table and why can't I find it? Is it
> deleting it after it spits out this error? And is this table
> something that gets repeatedly created and dropped during a
> pg_upgrade?

Well, you managed to show me nothing, so I can't really help you further. info_rels is an internal C structure so I can't guess what is failing. I suggest you show the error, and read the message and then look in the log file it recommends, if any.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
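To act on the suggestion of reading the log file, the tail of each pg_upgrade log can be collected in one pass and pasted into a reply. This is a rough sketch, assuming the logs are named pg_upgrade_*.log and sit in the directory pg_upgrade was run from; that naming and location are assumptions to verify against the file the error message actually points to, and the helper name tail_upgrade_logs is hypothetical.

```python
import glob
import os


def tail_upgrade_logs(directory=".", lines=20):
    """Return {log_path: last `lines` lines} for pg_upgrade logs in directory."""
    result = {}
    # Assumed naming pattern; adjust to the file named in the error message.
    pattern = os.path.join(directory, "pg_upgrade_*.log")
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as fh:
            result[path] = fh.read().splitlines()[-lines:]
    return result


if __name__ == "__main__":
    for path, tail in tail_upgrade_logs().items():
        print(f"==> {path} <==")
        print("\n".join(tail))
```

Posting that output together with the exact error text gives the list something concrete to work from.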
Well, I'd pull the log for you, but the team has taken the VM offline and is trying to replace it. When they get the new one built, I'll see if any of this starts to repeat.

On Aug 14, 2014, at 12:57 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Well, you managed to show me nothing, so I can't really help you
> further. info_rels is an internal C structure so I can't guess what is
> failing. I suggest you show the error, and read the message and then
> look in the log file it recommends, if any.