Re: Backbranch releases and Win32 locking - Mailing list pgsql-hackers
From | Teodor Sigaev |
---|---|
Subject | Re: Backbranch releases and Win32 locking |
Date | |
Msg-id | 452A3AF1.7090005@sigaev.ru |
In response to | Re: Backbranch releases (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Backbranch releases and Win32 locking
|
List | pgsql-hackers |
Analyzing the locked state: the lock occurs when a backend wants to send data to the stat collector. So the state is: the backend waits for an FD_WRITE event, and the stat collector waits for FD_READ.

I suspect the following sequence of events in the backend:

0. Let us work with only one socket, and the socket is associated with the statically defined event object in pgwin32_waitforsinglesocket.
1. pgwin32_send: WSASend fails with WSAEWOULDBLOCK (or its equivalent).
2. Socket s becomes writable, and Windows signals the event defined statically in pgwin32_waitforsinglesocket.
3. pgwin32_waitforsinglesocket(): ResetEvent resets the event.
4. pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx waits indefinitely...

If I'm right, ResetEvent needs to be moved after WaitForMultipleObjectsEx. But the comment in pgwin32_select() says that we should send something before testing the socket for FD_WRITE. pgwin32_send calls WSASend before pgwin32_waitforsinglesocket(), but there is also a call of pgwin32_waitforsinglesocket in libpq/be-secure.c. So the attached patch adds a call of WSASend with an empty (zero-length) buffer.

It's a pity, but the locking problem occurs only on an SMP box and takes several hours to reproduce. So we are testing now. What are your opinions?
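To make the suspected ordering problem in steps 2 to 4 concrete, here is a minimal, portable sketch. It is not the Win32 API and not the code in socket.c: the event is modeled as a plain flag, and every name in it (model_event, model_set, model_reset, model_wait_would_return, reset_then_wait, wait_then_reset) is hypothetical. It only illustrates why resetting before waiting can discard a signal that has already arrived, assuming the event behaves like a simple signaled/unsignaled flag.

```c
#include <stdbool.h>

/* Hypothetical model of an event object: a single signaled flag. */
typedef struct
{
    bool signaled;
} model_event;

static void model_set(model_event *e)   { e->signaled = true; }
static void model_reset(model_event *e) { e->signaled = false; }

/* True if a wait would return at once, false if it would block. */
static bool model_wait_would_return(const model_event *e)
{
    return e->signaled;
}

/* Suspected ordering: the signal arrives, then reset, then wait. */
static bool reset_then_wait(void)
{
    model_event ev = { false };

    model_set(&ev);     /* step 2: socket became writable, event signaled */
    model_reset(&ev);   /* step 3: the reset discards that signal */
    return model_wait_would_return(&ev);    /* step 4: would block */
}

/* Proposed ordering: wait first, reset only after the wait returns. */
static bool wait_then_reset(void)
{
    model_event ev = { false };
    bool        woke;

    model_set(&ev);     /* the same signal arrives */
    woke = model_wait_would_return(&ev);    /* the wait sees it */
    model_reset(&ev);   /* reset afterwards, ready for the next wait */
    return woke;
}
```

In this model reset_then_wait() reports that the wait would block, while wait_then_reset() reports that it would return. The model says nothing about whether Windows signals FD_WRITE again later, which is the issue the comment in pgwin32_select() is about.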
PS Backtraces

backend:
ntdll.dll!KiFastSystemCallRet
postgres.exe!pgwin32_waitforsinglesocket+0x197
postgres.exe!pgwin32_send+0xaf
postgres.exe!pgstat_report_waiting+0x1bd
postgres.exe!pgstat_report_tabstat+0xda
postgres.exe!PostgresMain+0x1040
postgres.exe!ClosePostmasterPorts+0x1bce
postgres.exe!SubPostmasterMain+0x1be
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

logger:
ntdll.dll!KiFastSystemCallRet
kernel32.dll!WaitForSingleObject+0x12
postgres.exe!pg_usleep+0x54
postgres.exe!SysLoggerMain+0x422
postgres.exe!SubPostmasterMain+0x370
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

bgwriter:
ntdll.dll!KiFastSystemCallRet
kernel32.dll!WaitForSingleObject+0x12
postgres.exe!pg_usleep+0x54
postgres.exe!BackgroundWriterMain+0x63a
postgres.exe!BootstrapMain+0x61f
postgres.exe!SubPostmasterMain+0x22c
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

stat collector:
ntdll.dll!KiFastSystemCallRet
postgres.exe!pgwin32_select+0x4f3
postgres.exe!PgstatCollectorMain+0x32f
postgres.exe!SubPostmasterMain+0x32a
postgres.exe!main+0x22b
postgres.exe+0x1237
postgres.exe+0x1288
kernel32.dll!RegisterWaitForInputIdle+0x49

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

*** ./src/backend/port/win32/socket.c.orig  Mon Oct  9 10:39:53 2006
--- ./src/backend/port/win32/socket.c       Mon Oct  9 15:44:24 2006
***************
*** 132,137 ****
--- 132,159 ----
  
      current_socket = s;
  
+     /*
+      * See comments about FD_WRITE and WSASelectEvent
+      * in pgwin32_select()
+      */
+     if ( (what & FD_WRITE) != 0 ) {
+         char    c;
+         WSABUF  buf;
+         DWORD   sent;
+ 
+         buf.buf = &c;
+         buf.len = 0;
+         r = WSASend(s, &buf, 1, &sent, 0, NULL, NULL);
+ 
+         if (r == 0)     /* Completed - means things are fine! */
+             return 1;
+         else if ( WSAGetLastError() != WSAEWOULDBLOCK )
+         {
+             TranslateSocketError();
+             return 0;
+         }
+     }
+ 
      if (WSAEventSelect(s, waitevent, what) == SOCKET_ERROR)
      {
          TranslateSocketError();