Thread: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
giomac@gmail.com
Date:
The following bug has been logged on the website:

Bug reference: 7815
Logged by: George Machitidze
Email address: giomac@gmail.com
PostgreSQL version: 9.2.2
Operating system: Fedora 18 Linux
Description:

https://bugzilla.redhat.com/show_bug.cgi?id=896161

Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgresql-setup fails with the invalid message "There seems to be a postmaster servicing the old cluster". Looks like pg_upgrade is checking the pid file too early, without waiting for the master process to exit:

open("/var/lib/pgsql/data-old/postmaster.pid", O_RDONLY) = 5
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Bruce Momjian
Date:
On Fri, Jan 18, 2013 at 10:19:48PM +0000, giomac@gmail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference: 7815
> Logged by: George Machitidze
> Email address: giomac@gmail.com
> PostgreSQL version: 9.2.2
> Operating system: Fedora 18 Linux
> Description:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=896161
> Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgresql-setup fails with the invalid message "There seems to be a postmaster servicing the old cluster". Looks like pg_upgrade is checking the pid file too early, without waiting for the master process to exit:
>
> open("/var/lib/pgsql/data-old/postmaster.pid", O_RDONLY) = 5

How are you shutting down the postmaster? Are you using pg_ctl -w stop? If not, you have to wait for the server to actually shut down before starting pg_upgrade. pg_upgrade is not going to do that waiting.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
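Editorial sketch: the advice above is to wait for the old server to finish shutting down before running pg_upgrade, which `pg_ctl -w stop` does for you. For a wrapper script that stops the server some other way, the waiting could look roughly like the following. The function name, the timeout, and the polling interval are assumptions for illustration, not anything taken from pg_ctl or from this thread.

```python
import os
import time


def wait_for_shutdown(pid_file, timeout=60.0, interval=0.5):
    """Poll until pid_file disappears.

    Returns True if the file was gone before the timeout expired,
    False if it was still present when the timeout ran out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not os.path.exists(pid_file):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Path taken from the bug report; adjust for the local layout.
    path = "/var/lib/pgsql/data-old/postmaster.pid"
    if wait_for_shutdown(path, timeout=5.0):
        print("no pid file; the old server appears to be down")
    else:
        print("pid file still present; do not start pg_upgrade yet")
```

A missing pid file only shows that no lock file is present. It says nothing about whether the last shutdown was clean, which is a separate question discussed later in the thread.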
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Fri, Jan 18, 2013 at 10:19:48PM +0000, giomac@gmail.com wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=896161
>> Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgresql-setup fails with the invalid message "There seems to be a postmaster servicing the old cluster". Looks like pg_upgrade is checking the pid file too early, without waiting for the master process to exit:
>>
>> open("/var/lib/pgsql/data-old/postmaster.pid", O_RDONLY) = 5

> How are you shutting down the postmaster? Are you using pg_ctl -w stop? If not, you have to wait for the server to actually shut down before starting pg_upgrade. pg_upgrade is not going to do that waiting.

The backstory on this is at the cited Red Hat bug ... apparently the OP decided I was clueless and he needed to consult some real authorities.

The existing pg_control clearly says that the cluster was shut down, so it's not clear why there's still a postmaster.pid file there. There's some debugging to be done yet about how that got to be that way. (AFAICS the RPM upgrade process ought to shut down the old postmaster before installing a new one; but somehow that went wrong, or else a doppelganger postmaster.pid rose from the dead. Anyway, that's not a matter for this list because it involves Red Hat upgrade processes, not anything supplied by the community.)

In the meantime, I was wondering a bit why pg_upgrade looks at the postmaster.pid file at all. Generally we recommend that startup scripts *not* look at the lock file but just try to start a postmaster, and leave it to the postmaster to decide if there's a valid lockfile present. Is it really appropriate for pg_upgrade to do this differently? I think the complained-of case would have gone through cleanly if that error check weren't there, or in any case the postmaster would have done a better job of checking for a conflicting postmaster.

regards, tom lane
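Editorial sketch: the message above relies on what pg_control says about the cluster state. One way to look at that from a script is to run `pg_controldata` on the data directory and read its "Database cluster state" line. The helper below only parses text that has already been captured; the function names are made up for illustration, and treating only the exact value "shut down" as clean is a simplifying assumption.

```python
import re

# Matches the state line printed by pg_controldata, e.g.
#   Database cluster state:               shut down
_STATE_LINE = re.compile(r"^Database cluster state:[ \t]*(.+?)[ \t]*$",
                         re.MULTILINE)


def cluster_state(pg_controldata_output):
    """Return the cluster state string, or None if the line is missing."""
    match = _STATE_LINE.search(pg_controldata_output)
    return match.group(1) if match else None


def was_cleanly_shut_down(pg_controldata_output):
    """True only for the exact state 'shut down' (a simplifying assumption)."""
    return cluster_state(pg_controldata_output) == "shut down"


if __name__ == "__main__":
    # Illustrative input, not real output from the reporter's system.
    sample = "Database cluster state:               shut down\n"
    print(cluster_state(sample), was_cleanly_shut_down(sample))
```

In the reported case this check and the pid-file check disagree: pg_control says the cluster was shut down, yet a postmaster.pid file is present.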
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Bruce Momjian
Date:
On Sat, Jan 19, 2013 at 12:02:31AM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Jan 18, 2013 at 10:19:48PM +0000, giomac@gmail.com wrote:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=896161
> >> Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgresql-setup fails with the invalid message "There seems to be a postmaster servicing the old cluster". Looks like pg_upgrade is checking the pid file too early, without waiting for the master process to exit:
> >>
> >> open("/var/lib/pgsql/data-old/postmaster.pid", O_RDONLY) = 5
>
> > How are you shutting down the postmaster? Are you using pg_ctl -w stop? If not, you have to wait for the server to actually shut down before starting pg_upgrade. pg_upgrade is not going to do that waiting.
>
> The backstory on this is at the cited Red Hat bug ... apparently the OP decided I was clueless and he needed to consult some real authorities.

Yes, it was clear there was some backstory in reading that thread.

> The existing pg_control clearly says that the cluster was shut down, so it's not clear why there's still a postmaster.pid file there. There's some debugging to be done yet about how that got to be that way. (AFAICS the RPM upgrade process ought to shut down the old postmaster before installing a new one; but somehow that went wrong, or else a doppelganger postmaster.pid rose from the dead. Anyway, that's not a matter for this list because it involves Red Hat upgrade processes, not anything supplied by the community.)
>
> In the meantime, I was wondering a bit why pg_upgrade looks at the postmaster.pid file at all. Generally we recommend that startup scripts *not* look at the lock file but just try to start a postmaster, and leave it to the postmaster to decide if there's a valid lockfile present. Is it really appropriate for pg_upgrade to do this differently? I think the complained-of case would have gone through cleanly if that error check weren't there, or in any case the postmaster would have done a better job of checking for a conflicting postmaster.

The reason we check for postmaster.pid is so we can give the user a clue about which postmaster is running. We want to make sure everything is super-clean before we do anything.

What we could do is first try to start each cluster, and fail if the start fails, but the start could fail for all sorts of reasons, so it doesn't really seem like a win.

Also, we don't want to start on a non-clean shutdown, so the missing pid file tells us it was clean.
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Sat, Jan 19, 2013 at 12:02:31AM -0500, Tom Lane wrote:
>> In the meantime, I was wondering a bit why pg_upgrade looks at the postmaster.pid file at all.

> The reason we check for postmaster.pid is so we can give the user a clue about which postmaster is running.

[ scratches head... ] I failed to detect any such clue in the error message it prints. Had you printed the PID from the file, or even better looked to see if that process was actually still alive, this argument would be reasonable. But pg_upgrade does neither of those, whereas if it had started a postmaster the postmaster would have done both of those things.

> Also, we don't want to start on a non-clean shutdown, so the missing pid file tells us it was clean.

I agree that super paranoia is not unreasonable in pg_upgrade. But it would be useful to print something similar to what the backend prints, about checking whether PID N is still there and manually removing the lock file if not. Or (ahem) you could let the existing backend-side logic do that for you, rather than reimplementing that logic badly.

Meanwhile I still have to figure out how come the postmaster.pid file is still there in the OP's case ...

regards, tom lane
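Editorial sketch: the two checks suggested above (print the PID from the lock file, and see whether that process still exists) can be illustrated as follows. This is not pg_upgrade's or the postmaster's code. The function names and message wording are invented, it assumes a POSIX system, and it assumes the PID is on the first line of postmaster.pid.

```python
import os


def read_pid(pid_file):
    """Return the PID from the first line of pid_file, or None if unusable."""
    try:
        with open(pid_file) as f:
            pid = int(f.readline().strip())
    except (OSError, ValueError):
        return None
    # Non-positive values are rejected here because os.kill() gives them
    # special meanings (process groups); this is a simplification.
    return pid if pid > 0 else None


def process_alive(pid):
    """Signal 0 tests for existence without affecting the target process."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def describe_lock_file(pid_file):
    """Build a message that names the PID and says whether it is alive."""
    pid = read_pid(pid_file)
    if pid is None:
        return "no usable PID in %s" % pid_file
    if process_alive(pid):
        return "PID %d from %s is still running" % (pid, pid_file)
    return "PID %d from %s is not running; the lock file looks stale" % (
        pid, pid_file)


if __name__ == "__main__":
    print(describe_lock_file("/var/lib/pgsql/data-old/postmaster.pid"))
```

A live PID does not prove the process is a postmaster, since PIDs are reused, so the message is phrased as a hint rather than a verdict.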
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Bruce Momjian
Date:
On Sat, Jan 19, 2013 at 12:47:03AM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Sat, Jan 19, 2013 at 12:02:31AM -0500, Tom Lane wrote:
> >> In the meantime, I was wondering a bit why pg_upgrade looks at the postmaster.pid file at all.
>
> > The reason we check for postmaster.pid is so we can give the user a clue about which postmaster is running.
>
> [ scratches head... ] I failed to detect any such clue in the error message it prints. Had you printed the PID from the file, or even better looked to see if that process was actually still alive, this argument would be reasonable. But pg_upgrade does neither of those, whereas if it had started a postmaster the postmaster would have done both of those things.
>
> > Also, we don't want to start on a non-clean shutdown, so the missing pid file tells us it was clean.
>
> I agree that super paranoia is not unreasonable in pg_upgrade. But it would be useful to print something similar to what the backend prints, about checking whether PID N is still there and manually removing the lock file if not. Or (ahem) you could let the existing backend-side logic do that for you, rather than reimplementing that logic badly.

The current output is:

    There seems to be a postmaster servicing the old cluster.
    Please shutdown that postmaster and try again.

You are right that it is inaccurate. I should reword that to say the server is running or was not properly shut down:

    There seems to be a postmaster servicing the old cluster, or it was
    not properly shut down. Please cleanly shutdown that postmaster and
    try again.

Why is a clean shutdown important? If the server crashed, we would have committed transactions in the WAL files which are not transferred to the new server, and would be lost.

I am hesitant to even start such an old server because pg_upgrade never modifies the old server. Even starting it in that case would be modifying it.

The other problem is that if the server start fails, how do we know if the failure was due to a running postmaster? I could later check the postmaster.pid file, but the start might have failed before reaching the code that removes that file. The server-still-running case is a common cause of failure, so I wanted something that was very clear, rather than a generic can't-start-the-server message.

I am open to ideas.
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Why is a clean shutdown important? If the server crashed, we would have committed transactions in the WAL files which are not transferred to the new server, and would be lost.

> I am hesitant to even start such an old server because pg_upgrade never modifies the old server. Even starting it in that case would be modifying it.

I'm not really following this logic. If the old cluster was in a crashed state, why would we not expect that starting a postmaster would be the best (only) way to repair the damage and make everything good again? Isn't that exactly what the user would have to do anyway? What other action would you expect him to take instead?

(But, at least with the type of packaging I'm using in Fedora, he would first have to go through a package downgrade/reinstallation process, because the packaging provides no simple scripted way of manually starting the old postgres executable, only the new one. Moreover, what pg_upgrade is printing provides no help in figuring out whether that's the next step.)

I do sympathize with taking a paranoid attitude here, but I'm failing to see what advantage there is in not attempting to start the old postmaster. In the *only* case that pg_upgrade is successfully protecting against with this logic, namely there's-an-active-postmaster-already, the postmaster is equally able to protect itself. In other cases it would be more helpful not less to let the postmaster analyze the situation.

> The other problem is that if the server start fails, how do we know if the failure was due to a running postmaster?

Because we read the postmaster's log file, or at least tell the user to do so. That report would be unambiguous, unlike pg_upgrade's.

regards, tom lane
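Editorial sketch: the suggestion above is to learn why a start attempt failed from the postmaster's own log. A script could scan the captured log text for the lock-file complaint, as below. The patterns are based on the server's usual English messages and are an assumption: the wording can differ between versions and locales, and the function name is invented.

```python
import re

# Assumed message fragments; verify them against the server version in use.
LOCK_CONFLICT = re.compile(r'lock file "postmaster\.pid" already exists')
HINT_PID = re.compile(r"Is another \w+ \(PID (\d+)\) running in data directory")


def diagnose_start_failure(log_text):
    """Classify a failed start from the text of the server log."""
    if not LOCK_CONFLICT.search(log_text):
        return "start failed for another reason; see the server log"
    hint = HINT_PID.search(log_text)
    if hint:
        return "another postmaster (PID %s) appears to be running" % hint.group(1)
    return "a lock file conflict was reported"


if __name__ == "__main__":
    # Illustrative log text, not taken from the reporter's system.
    sample = (
        'FATAL:  lock file "postmaster.pid" already exists\n'
        'HINT:  Is another postmaster (PID 4242) running in data directory '
        '"/var/lib/pgsql/data-old"?\n'
    )
    print(diagnose_start_failure(sample))
```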
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Bruce Momjian
Date:
On Sat, Jan 19, 2013 at 10:45:15PM +0400, George Machitidze wrote:
> Hi Bruce, Tom
>
> > The backstory on this is at the cited Red Hat bug ... apparently the OP decided I was clueless and he needed to consult some real authorities.
>
> Oh come on, I'm very sure you both are good guys and know what you are doing; none of us is an ignorant bastard :)
>
> I decided to open a case here too for a simple reason: maybe someone has had the same issue, or knows how pg_upgrade works (in detail) better than me, because I am clueless.
>
> This is a test DB and I can erase it, but I'm very sure there's something wrong in the upgrade process, and that is what I want solved.
>
> Now, we can open a bottle of whiskey and go back to the problem:
>
> 1. I didn't run the postmaster before/during pg_upgrade; it was never invoked manually in this process.
> 2. There is no pid file AFTER the application is stopped, but it looks like it's there while pg_upgrade is running. strace showed that, and there is no need to run FAM to verify it.
>
> I don't know how pg_upgrade works. It looks like it's trying to start the postmaster, which runs, postmaster.pid is created, and then the postmaster fails to stop or needs some more time before pg_upgrade checks its pid. That's what I see.
>
> So, is pg_upgrade starting the postmaster? If yes, then when (at which step), and why is the pid file check done? That's all we want to know, right?

The pid check is done before pg_upgrade starts or stops any postmaster, to make sure both servers are down before it starts. Tom wants that testing improved.
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
Bruce Momjian
Date:
On Sat, Jan 19, 2013 at 11:27:28AM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Why is a clean shutdown important? If the server crashed, we would have committed transactions in the WAL files which are not transferred to the new server, and would be lost.
>
> > I am hesitant to even start such an old server because pg_upgrade never modifies the old server. Even starting it in that case would be modifying it.
>
> I'm not really following this logic. If the old cluster was in a crashed state, why would we not expect that starting a postmaster would be the best (only) way to repair the damage and make everything good again? Isn't that exactly what the user would have to do anyway? What other action would you expect him to take instead?
>
> (But, at least with the type of packaging I'm using in Fedora, he would first have to go through a package downgrade/reinstallation process, because the packaging provides no simple scripted way of manually starting the old postgres executable, only the new one. Moreover, what pg_upgrade is printing provides no help in figuring out whether that's the next step.)
>
> I do sympathize with taking a paranoid attitude here, but I'm failing to see what advantage there is in not attempting to start the old postmaster. In the *only* case that pg_upgrade is successfully protecting against with this logic, namely there's-an-active-postmaster-already, the postmaster is equally able to protect itself. In other cases it would be more helpful not less to let the postmaster analyze the situation.
>
> > The other problem is that if the server start fails, how do we know if the failure was due to a running postmaster?
>
> Because we read the postmaster's log file, or at least tell the user to do so. That report would be unambiguous, unlike pg_upgrade's.

Attached is a WIP patch to give you an idea of how I am going to solve this problem. This comment says it all:

!   /*
!    * If we have a postmaster.pid file, try to start the server. If
!    * it starts, the pid file was stale, so stop the server. If it
!    * doesn't start, assume the server is running.
!    */
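Editorial sketch: the decision described in the patch comment can be written out as below. This is not the patch itself, which is not reproduced in this archive. `start_server` and `stop_server` are hypothetical callables supplied by the caller (for example, thin wrappers around pg_ctl), and the returned strings are invented labels.

```python
import os


def check_cluster_is_down(data_dir, start_server, stop_server):
    """Apply the logic from the patch comment to one data directory.

    start_server(data_dir) must return True if the server came up.
    stop_server(data_dir) must shut it down again.
    Both are hypothetical hooks, not real pg_upgrade functions.
    """
    pid_file = os.path.join(data_dir, "postmaster.pid")
    if not os.path.exists(pid_file):
        return "down"  # no lock file, nothing to test
    if start_server(data_dir):
        # The server started, so the pid file was stale. Stop it again so
        # the cluster is left cleanly shut down.
        stop_server(data_dir)
        return "stale pid file"
    # The start failed: assume another postmaster holds the directory.
    return "running"


if __name__ == "__main__":
    # Dry run with stand-in hooks that pretend the start always fails.
    result = check_cluster_is_down("/var/lib/pgsql/data-old",
                                   lambda d: False, lambda d: None)
    print(result)
```

As raised earlier in the thread, a failed start can have causes other than a running postmaster, so "running" here is an assumption that the server log would need to confirm.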
Re: BUG #7815: Upgrading PostgreSQL from 9.1 to 9.2 with pg_upgrade/postgreql-setup fails - invalid status retrieve
From
George Machitidze
Date:
Hi Bruce, Tom

> The backstory on this is at the cited Red Hat bug ... apparently the OP decided I was clueless and he needed to consult some real authorities.

Oh come on, I'm very sure you both are good guys and know what you are doing; none of us is an ignorant bastard :)

I decided to open a case here too for a simple reason: maybe someone has had the same issue, or knows how pg_upgrade works (in detail) better than me, because I am clueless.

This is a test DB and I can erase it, but I'm very sure there's something wrong in the upgrade process, and that is what I want solved.

Now, we can open a bottle of whiskey and go back to the problem:

1. I didn't run the postmaster before/during pg_upgrade; it was never invoked manually in this process.
2. There is no pid file AFTER the application is stopped, but it looks like it's there while pg_upgrade is running. strace showed that, and there is no need to run FAM to verify it.

I don't know how pg_upgrade works. It looks like it's trying to start the postmaster, which runs, postmaster.pid is created, and then the postmaster fails to stop or needs some more time before pg_upgrade checks its pid. That's what I see.

So, is pg_upgrade starting the postmaster? If yes, then when (at which step), and why is the pid file check done? That's all we want to know, right?

Best regards,
George Machitidze