Thread: intermittent issue with windows 7 service manager not able to correctly determine or control postgresql 9.4
I've got several machines running Windows 7 which have PostgreSQL 9.4 installed as a service and configured to start automatically on boot. I am monitoring these services with Zabbix, and several times a week I get a notification that the postgresql-x64-9.4 service has stopped.
When I log in to the machine, the service does appear to be stopped.
However, when I check the database, I can query it OK:
C:\Program Files\PostgreSQL\9.4>bin\psql.exe -U postgres -c "SELECT count(*) from media;" association
Password for user postgres:
count
---------
1167846
(1 row)
If I try to start the service from the service manager, I see the following error in the logs;
2016-04-30 05:03:13 BST FATAL: lock file "postmaster.pid" already exists
2016-04-30 05:03:13 BST HINT: Is another postmaster (PID 2556) running in data directory "C:/Program Files/PostgreSQL/9.4/data"?
The pg_ctl tool seems to correctly query the state of the service and return the correct PID;
C:\Program Files\PostgreSQL\9.4>bin\pg_ctl.exe -D "C:\Program Files\PostgreSQL\9.4\data" status
pg_ctl: server is running (PID: 2556)
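The PID reported by pg_ctl can be cross-checked against the lock file named in the error above. The following is only a minimal sketch, assuming that the first line of postmaster.pid holds the postmaster PID and that pg_ctl prints its status in the form shown above; the helper names read_postmaster_pid and parse_pg_ctl_status are hypothetical and not part of PostgreSQL.

```python
import re
from pathlib import Path
from typing import Optional


def read_postmaster_pid(data_dir: str) -> Optional[int]:
    """Return the PID stored on the first line of postmaster.pid.

    Returns None if the lock file is missing, empty, or malformed.
    (Hypothetical helper; assumes the PID is on line 1 of the file.)
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        lines = pid_file.read_text().splitlines()
    except FileNotFoundError:
        return None
    if not lines:
        return None
    first_line = lines[0].strip()
    return int(first_line) if first_line.isdigit() else None


def parse_pg_ctl_status(output: str) -> Optional[int]:
    """Extract the PID from 'pg_ctl: server is running (PID: N)'.

    Returns None for any other output.
    (Hypothetical helper; only the message format seen in this thread is handled.)
    """
    match = re.search(r"server is running \(PID:\s*(\d+)\)", output)
    return int(match.group(1)) if match else None
```

If both functions return the same number, the lock file and pg_ctl agree that a postmaster is alive, which would point at the service manager's view being the stale one rather than the database itself.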
The other thing that seems to happen is that pgAdmin III loses the ability to control the service: all the start/stop options are greyed out.
The only option to get the control back is to kill the processes in the task manager or reboot the machine.
Any suggestions on what might be causing this?
Thanks,
Tom
I have the same problem routinely on Windows 10.
The postgresql-x64-9.5 service shows up in Task Manager as Stopped, but is actually running just fine.
BTW pg_ctl does nothing – silently. The only way to restart the server is to kill off a process or two.
Regards
David M Bennett FACS
Andl - A New Database Language - andl.org
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Hodder
Sent: Sunday, 1 May 2016 12:36 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] intermittent issue with windows 7 service manager not able to correctly determine or control postgresql 9.4
Hi All,
I've got several machines running Windows 7 which have PostgreSQL 9.4 installed as a service and configured to start automatically on boot. I am monitoring these services with Zabbix, and several times a week I get a notification that the postgresql-x64-9.4 service has stopped.
When I log in to the machine, the service does appear to be stopped.
However, when I check the database, I can query it OK:
C:\Program Files\PostgreSQL\9.4>bin\psql.exe -U postgres -c "SELECT count(*) from media;" association
Password for user postgres:
count
---------
1167846
(1 row)
If I try to start the service from the service manager, I see the following error in the logs;
2016-04-30 05:03:13 BST FATAL: lock file "postmaster.pid" already exists
2016-04-30 05:03:13 BST HINT: Is another postmaster (PID 2556) running in data directory "C:/Program Files/PostgreSQL/9.4/data"?
The pg_ctl tool seems to correctly query the state of the service and return the correct PID;
C:\Program Files\PostgreSQL\9.4>bin\pg_ctl.exe -D "C:\Program Files\PostgreSQL\9.4\data" status
pg_ctl: server is running (PID: 2556)
The other thing that seems to happen is that pgAdmin III loses the ability to control the service: all the start/stop options are greyed out.
The only option to get the control back is to kill the processes in the task manager or reboot the machine.
Any suggestions on what might be causing this?
Thanks,
Tom
Re: intermittent issue with windows 7 service manager not able to correctly determine or control postgresql 9.4
Disclaimer: My comments here are generic to Windows services. I don't run PostgreSQL on Windows and I have no idea how it is implemented.
On Sun, 1 May 2016 03:35:44 +0100, Tom Hodder <tom@limepepper.co.uk> wrote:
>I've got several machines running windows 7 which have postgresql 9.4
>installed as a service, and configured to start automatically on boot. I am
>monitoring these services with zabbix and several times a week I get a
>notification that the postgresql-x64-9.4 service has stopped.
>
>When I login to the machine, the service does appear to be stopped;
>
>However when I check the database, I can query it ok;
Windows services have a time limit to respond to commands or status inquiries. The service manager periodically queries status of all running services - if they don't respond quickly enough, the manager thinks they are hosed. That may or may not be true.
But IME unresponsive services rarely appear "stopped" - usually they show as "started" in the service manager, or, if you run SC from the command line, their state is shown as "running".
>If I try to start the service from the service manager, I see the following
>error in the logs;
>
>2016-04-30 05:03:13 BST FATAL: lock file "postmaster.pid" already exists
>2016-04-30 05:03:13 BST HINT: Is another postmaster (PID 2556)
>running in data directory "C:/Program Files/PostgreSQL/9.4/data"?
>
>The pg_ctl tool seems to correctly query the state of the service and
>return the correct PID;
>
>C:\Program Files\PostgreSQL\9.4>bin\pg_ctl.exe -D "C:\Program
>Files\PostgreSQL\9.4\data" status
>pg_ctl: server is running (PID: 2556)
Which suggests the service either is not responding to the manager's status inquiries, or is responding too late.
>The other thing that seems to happen is the pgadmin3 tool seems to
>have lost the ability to control the service as all the options for
>start/stop are greyed out;
>[image: Inline images 2]
This is likely because the service manager believes the service is unresponsive. The programming API communicates with the manager.
>The only option to get the control back is to kill the processes in
>the task manager or reboot the machine.
You could try "sc stop <service>" from the command line. The SC tool is separate from the shell "net" command and it sometimes will work when "net stop <service>" does not.
You also could try using recovery options in the service manager to automatically restart the service. But if the service is showing as "stopped" when it really is running, this is unlikely to work.
>Any suggestions on what might be causing this?
Services are tricky to get right: there are a number of rules the control interface has to obey that are at odds with doing real work.
A single threaded service must periodically send "busy" status to the manager during lengthy processing. Failure to do that in a timely manner will cause problems.
A multi-threaded service that separates processing from control must be able to suspend or halt the processing when directed and send "busy" status if it can't.
There is a way to launch arbitrary programs as services so they can run at startup and in the background, but programs that weren't written explicitly to BE services don't obey the service manager and their displayed status usually is bogus (provided by the launcher).
George
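For monitoring purposes, the disagreement described in this thread (service manager says stopped, pg_ctl says running) could be flagged explicitly instead of being reported as a plain outage. What follows is a rough sketch, not a tested Zabbix check: it assumes the text of "sc query <service>" and "pg_ctl status" has already been captured elsewhere, that sc prints a line of the form "STATE : 1  STOPPED" as on typical English-language Windows installs, and the function names parse_sc_state and diagnose are made up for illustration.

```python
import re
from typing import Optional


def parse_sc_state(sc_output: str) -> Optional[str]:
    """Return the state name (e.g. 'STOPPED', 'RUNNING') from `sc query` text.

    Assumes a line such as 'STATE              : 1  STOPPED'.
    Returns None when no such line is found.
    """
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_output, re.MULTILINE)
    return match.group(1) if match else None


def diagnose(sc_output: str, pg_ctl_output: str) -> str:
    """Compare the service manager's view with pg_ctl's view.

    Returns one of: 'ok', 'stopped', 'pending', 'mismatch', 'unknown'.
    """
    state = parse_sc_state(sc_output)
    # pg_ctl's running message as quoted earlier in this thread.
    postmaster_running = "server is running" in pg_ctl_output

    if state is None:
        return "unknown"
    if state.endswith("_PENDING"):
        return "pending"
    if state == "RUNNING" and postmaster_running:
        return "ok"
    if state == "STOPPED" and not postmaster_running:
        return "stopped"
    # The two views disagree, e.g. the manager says STOPPED
    # while the postmaster still answers.
    return "mismatch"
```

A "mismatch" result corresponds to the situation Tom reports; alerting on it separately from "stopped" would at least distinguish a database that is down from one the service manager has lost track of.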