Hang on NOTIFY - Mailing list pgsql-bugs
From | Mark Simonetti |
---|---|
Subject | Hang on NOTIFY |
Date | |
Msg-id | 55C49756.70505@opalsoftware.co.uk Whole thread Raw |
Responses |
Re: Hang on NOTIFY
|
List | pgsql-bugs |
The system I am developing makes extensive use of the async NOTIFY/LISTEN system. I am currently experiencing a problem on 2 production servers: Server 1: Virtual Windows Server 2008 R2 (VMWare) PostgreSQL 9.3.5 Server 2: Virtual Windows Server 2008 R2 (VMWare) PostgreSQL 9.4.2 After the system has been running for a period of time, sometimes a few days sometimes a few weeks, any calls to NOTIFY will hang. After in depth investigation it appears to happen when a listening backend has been connected for some time (days). Any other backend trying to inform that backend will hang on "CallNamedPipe" in pgkill (kill.c). Here is a stack trace from the hung SENDING backend, main thread : - ntdll.dll!_NtFsControlFile@40() + 0x15 bytes ntdll.dll!_NtFsControlFile@40() + 0x15 bytes kernel32.dll!_CallNamedPipeW@28() + 0xf4 bytes postgres.exe!pgkill(int pid, int sig) Line 43 + 0x2b bytes C postgres.exe!SendProcSignal(int pid, ProcSignalReason reason, int backendId) Line 198 + 0x10 bytes C postgres.exe!SignalBackends() Line 1497 + 0xe bytes C > postgres.exe!ProcessCompletedNotifies() Line 1092 C postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 3947 C postgres.exe!BackendRun(Port * port) Line 4011 + 0x21 bytes C postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4515 + 0x8 bytes C postgres.exe!main(int argc, char * * argv) Line 203 + 0x7 bytes C postgres.exe!__tmainCRTStartup() Line 555 + 0x17 bytes C kernel32.dll!@BaseThreadInitThunk@12() + 0x12 bytes ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes Here is a stack trace from the signalling thread (I know its irrelevent as this is for incomming signals) : - ntdll.dll!_NtFsControlFile@40() + 0x15 bytes ntdll.dll!_NtFsControlFile@40() + 0x15 bytes > postgres.exe!pg_signal_thread(void * param) Line 279 + 0x9 bytes C Now for the RECIPIENT backend : - ntdll.dll!_ZwWaitForMultipleObjects@20() + 0x15 bytes ntdll.dll!_ZwWaitForMultipleObjects@20() + 0x15 bytes KERNELBASE.dll!_WaitForMultipleObjectsEx@20() + 0x36 bytes kernel32.dll!_WaitForMultipleObjectsExImplementation@20() + 0x8e bytes > postgres.exe!pgwin32_waitforsinglesocket(unsigned int s, int what, int timeout) Line 216 + 0x14 bytes C postgres.exe!pgwin32_recv(unsigned int s, char * buf, int len, int f) Line 352 + 0xa bytes C postgres.exe!secure_read(Port * port, void * ptr, unsigned int len) Line 304 + 0x12 bytes C postgres.exe!pq_getbyte() Line 895 + 0x67 bytes C postgres.exe!SocketBackend(StringInfoData * inBuf) Line 344 + 0x5 bytes C postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 3968 + 0x1c bytes C postgres.exe!BackendRun(Port * port) Line 4011 + 0x21 bytes C postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4515 + 0x8 bytes C postgres.exe!main(int argc, char * * argv) Line 203 + 0x7 bytes C postgres.exe!__tmainCRTStartup() Line 555 + 0x17 bytes C kernel32.dll!@BaseThreadInitThunk@12() + 0x12 bytes ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes This is the usual place for it to wait, so this seems okay. ntdll.dll!_NtFsControlFile@40() + 0x15 bytes ntdll.dll!_NtFsControlFile@40() + 0x15 bytes > postgres.exe!pg_signal_thread(void * param) Line 279 + 0x9 bytes C Also looks fine. This seems like a possible Windows bug, as the call to CallNamedPipe has a timeout of 1000 milliseconds, but it is clearly not timing out. It only seems to exit if I exit the backend it is trying to signal. NOTE: it is trying to send to many backends, but on all the stuck backends I checked, they all were stuck sending to the same recipient. Closing that particular recipient DOES free everything up and signals start flowing again. I've searched around and cannot find a similar bug report. Is it possibly something I'm doing wrong? Thanks, Mark. --
pgsql-bugs by date: