Thread: Backbranch releases
Agreed we need to push out the back branch releases. Let me know what I can do to help. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
> Agreed we need to push out the back branch releases. Let me know what I
> can do to help.

Making up the release notes is the only large bit of work ... do you
want to do that?

FYI to the rest of you: we're planning back-branch releases before 8.2
final, since 8.1 in particular is well overdue for an update. Current
proposal is to wrap tarballs on Thursday 10/12 for public release Monday
10/16. If you've got any back-branch patches sitting around, send 'em
in now ...

regards, tom lane
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Agreed we need to push out the back branch releases. Let me know what I
> > can do to help.
>
> Making up the release notes is the only large bit of work ... do you
> want to do that?

Sure, and stamping. How far back do you want to go?

> FYI to the rest of you: we're planning back-branch releases before 8.2
> final, since 8.1 in particular is well overdue for an update. Current
> proposal is to wrap tarballs on Thursday 10/12 for public release Monday
> 10/16. If you've got any back-branch patches sitting around, send 'em
> in now ...

OK.

--
Bruce Momjian  bruce@momjian.us
Bruce Momjian <bruce@momjian.us> writes:
> Sure, and stamping. How far back do you want to go?

We might as well go back to 7.3 --- I saw Teodor back-patched some of
his contrib/ltree fixes that far.

regards, tom lane
On Fri, 6 Oct 2006, Tom Lane wrote:

> Bruce Momjian <bruce@momjian.us> writes:
>> Agreed we need to push out the back branch releases. Let me know what I
>> can do to help.
>
> Making up the release notes is the only large bit of work ... do you
> want to do that?
>
> FYI to the rest of you: we're planning back-branch releases before 8.2
> final, since 8.1 in particular is well overdue for an update. Current
> proposal is to wrap tarballs on Thursday 10/12 for public release Monday
> 10/16. If you've got any back-branch patches sitting around, send 'em
> in now ...

We probably have one patch for the 8.1 stable branch that seems to have
helped with locking on an SMP Windows setup. I'm currently testing it and
it looks good.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
> We probably have one patch for the 8.1 stable branch that seems to have
> helped with locking on an SMP Windows setup. I'm currently testing it and
> it looks good.

Cool, what's the patch?

regards, tom lane
On Fri, 6 Oct 2006, Tom Lane wrote:

> Oleg Bartunov <oleg@sai.msu.su> writes:
>> We probably have one patch for the 8.1 stable branch that seems to have
>> helped with locking on an SMP Windows setup. I'm currently testing it and
>> it looks good.
>
> Cool, what's the patch?

Unfortunately, after several hours of testing I just hit the same lockup.
I'll continue testing with Teodor and Magnus. It's a real-life scenario,
so I think it should be fixed for 8.1 stable. The problem also exists
in 8.2.

Regards,
Oleg Bartunov  oleg@sai.msu.su
Analyzing the locked state: the lock occurs when a backend wants to send
data to the stat collector. The state is: the backend waits for an FD_WRITE
event, while the stat collector waits for FD_READ.

I suspect the following sequence of events in the backend:

0. Assume we work with only one socket, and that the socket is associated
   with the statically defined event object in pgwin32_waitforsinglesocket.
1. pgwin32_send: WSASend fails with WSAEWOULDBLOCK (or its equivalent).
2. Socket s becomes writable, and Windows signals the event defined
   statically in pgwin32_waitforsinglesocket.
3. pgwin32_waitforsinglesocket(): ResetEvent resets the event.
4. pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx waits
   indefinitely...

If I'm right, ResetEvent needs to be moved after WaitForMultipleObjectsEx.
But the comment in pgwin32_select() says that we should send something
before testing the socket for FD_WRITE. pgwin32_send calls WSASend before
pgwin32_waitforsinglesocket(), but there is also a call of
pgwin32_waitforsinglesocket in libpq/be-secure.c. So the attached patch
adds a call of WSASend with an empty (zero-length) buffer.

Unfortunately, the locking problem occurs only on an SMP box and takes
several hours to reproduce, so we are still testing.

Opinions?
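To make the suspected ordering problem in steps 2 to 4 concrete, here is a minimal, portable C sketch. It is only a toy model under stated assumptions, not the PostgreSQL code: it does not call Winsock at all, the event object is reduced to a single flag, and every name in it (ToyEvent, toy_set, toy_reset, toy_wait_would_return, reset_then_wait, wait_then_reset) is hypothetical. It shows only why resetting an already-signaled event before waiting would lose the notification, and why resetting after the wait would not, if the hypothesis about the ordering is right.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a manual-reset event object: just a flag. */
typedef struct
{
    bool signaled;
} ToyEvent;

/* Models the OS signaling the event (socket became writable). */
static void toy_set(ToyEvent *e)
{
    e->signaled = true;
}

/* Models a reset call: clears the signaled state unconditionally. */
static void toy_reset(ToyEvent *e)
{
    e->signaled = false;
}

/*
 * Models a wait with no timeout: returns true if the wait would return
 * immediately, false if it would block (forever, in this toy model,
 * because nothing signals the event again).
 */
static bool toy_wait_would_return(const ToyEvent *e)
{
    return e->signaled;
}

/* Suspected order: signal arrives, then reset, then wait. */
bool reset_then_wait(void)
{
    ToyEvent ev = { false };

    toy_set(&ev);       /* step 2: event signaled */
    toy_reset(&ev);     /* step 3: reset discards the signal */
    return toy_wait_would_return(&ev);  /* step 4: false, would block */
}

/* Proposed order: wait first, reset only after the wait has returned. */
bool wait_then_reset(void)
{
    ToyEvent ev = { false };
    bool     woke;

    toy_set(&ev);                       /* event signaled */
    woke = toy_wait_would_return(&ev);  /* the wait sees the signal */
    toy_reset(&ev);                     /* clear for the next round */
    assert(!ev.signaled);
    return woke;
}
```

In this model reset_then_wait() reports that the wait would block, while wait_then_reset() reports that it would wake up. Whether the real event object behaves this way in the real code path is exactly the open question in this thread.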
PS. Backtraces

backend:
    ntdll.dll!KiFastSystemCallRet
    postgres.exe!pgwin32_waitforsinglesocket+0x197
    postgres.exe!pgwin32_send+0xaf
    postgres.exe!pgstat_report_waiting+0x1bd
    postgres.exe!pgstat_report_tabstat+0xda
    postgres.exe!PostgresMain+0x1040
    postgres.exe!ClosePostmasterPorts+0x1bce
    postgres.exe!SubPostmasterMain+0x1be
    postgres.exe!main+0x22b
    postgres.exe+0x1237
    postgres.exe+0x1288
    kernel32.dll!RegisterWaitForInputIdle+0x49

logger:
    ntdll.dll!KiFastSystemCallRet
    kernel32.dll!WaitForSingleObject+0x12
    postgres.exe!pg_usleep+0x54
    postgres.exe!SysLoggerMain+0x422
    postgres.exe!SubPostmasterMain+0x370
    postgres.exe!main+0x22b
    postgres.exe+0x1237
    postgres.exe+0x1288
    kernel32.dll!RegisterWaitForInputIdle+0x49

bgwriter:
    ntdll.dll!KiFastSystemCallRet
    kernel32.dll!WaitForSingleObject+0x12
    postgres.exe!pg_usleep+0x54
    postgres.exe!BackgroundWriterMain+0x63a
    postgres.exe!BootstrapMain+0x61f
    postgres.exe!SubPostmasterMain+0x22c
    postgres.exe!main+0x22b
    postgres.exe+0x1237
    postgres.exe+0x1288
    kernel32.dll!RegisterWaitForInputIdle+0x49

stat collector:
    ntdll.dll!KiFastSystemCallRet
    postgres.exe!pgwin32_select+0x4f3
    postgres.exe!PgstatCollectorMain+0x32f
    postgres.exe!SubPostmasterMain+0x32a
    postgres.exe!main+0x22b
    postgres.exe+0x1237
    postgres.exe+0x1288
    kernel32.dll!RegisterWaitForInputIdle+0x49

--
Teodor Sigaev  E-mail: teodor@sigaev.ru
               WWW: http://www.sigaev.ru/

*** ./src/backend/port/win32/socket.c.orig  Mon Oct  9 10:39:53 2006
--- ./src/backend/port/win32/socket.c       Mon Oct  9 15:44:24 2006
***************
*** 132,137 ****
--- 132,159 ----
  
      current_socket = s;
  
+     /*
+      * See comments about FD_WRITE and WSASelectEvent
+      * in pgwin32_select()
+      */
+     if ( (what & FD_WRITE) != 0 ) {
+         char    c;
+         WSABUF  buf;
+         DWORD   sent;
+ 
+         buf.buf = &c;
+         buf.len = 0;
+         r = WSASend(s, &buf, 1, &sent, 0, NULL, NULL);
+ 
+         if (r == 0) /* Completed - means things are fine! */
+             return 1;
+         else if ( WSAGetLastError() != WSAEWOULDBLOCK )
+         {
+             TranslateSocketError();
+             return 0;
+         }
+     }
+ 
      if (WSAEventSelect(s, waitevent, what) == SOCKET_ERROR)
      {
          TranslateSocketError();
> Analyzing the locked state: the lock occurs when a backend wants to send
> data to the stat collector. The state is: the backend waits for an
> FD_WRITE event, while the stat collector waits for FD_READ.
>
> I suspect the following sequence of events in the backend:
> 0. Assume we work with only one socket, and that the socket is associated
>    with the statically defined event object in
>    pgwin32_waitforsinglesocket.
> 1. pgwin32_send: WSASend fails with WSAEWOULDBLOCK (or its equivalent).
> 2. Socket s becomes writable, and Windows signals the event defined
>    statically in pgwin32_waitforsinglesocket.
> 3. pgwin32_waitforsinglesocket(): ResetEvent resets the event.
> 4. pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx waits
>    indefinitely...
>
> If I'm right, ResetEvent needs to be moved after
> WaitForMultipleObjectsEx. But the comment in pgwin32_select() says that
> we should send something before testing the socket for FD_WRITE.
> pgwin32_send calls WSASend before pgwin32_waitforsinglesocket(),
> but there is also a call of pgwin32_waitforsinglesocket in
> libpq/be-secure.c. So the attached patch adds a call of WSASend with an
> empty (zero-length) buffer.

Hmm. Not entirely sure. These are all in the SSL codepath. Are you using
SSL on the machine? Does the problem go away if you don't? (I was
thinking SSL always attempts to write data first, but then fails, at
which point this code is fine. You only need to attempt a send if you
didn't try that before.)

The normal way is that pgwin32_waitforsinglesocket is called from
pgwin32_send(), which will always have made the attempt to send data
first.

> Unfortunately, the locking problem occurs only on an SMP box and takes
> several hours to reproduce, so we are still testing.

Yikes, that's definitely not nice :-)

//Magnus
> Hmm. Not entirely sure. These are all in the SSL codepath. Are you using
> SSL on the machine? Does the problem go away if you don't? (I was

No, we don't use SSL.

> The normal way is that pgwin32_waitforsinglesocket is called from
> pgwin32_send(), which will always have made the attempt to send data
> first.

My question is: can ResetEvent reset the signaled state of the associated
event object? Look, in any case pgwin32_waitforsinglesocket() resets the
event before WSAEventSelect(). pgwin32_send() calls WSASend and, if it
fails, calls pgwin32_waitforsinglesocket().

>> Unfortunately, the locking problem occurs only on an SMP box and takes
>> several hours to reproduce, so we are still testing.
>
> Yikes, that's definitely not nice :-)
>
> //Magnus

--
Teodor Sigaev  E-mail: teodor@sigaev.ru
:(( Patch doesn't work. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Sure, and stamping. How far back do you want to go?
>
> We might as well go back to 7.3 --- I saw Teodor back-patched some of
> his contrib/ltree fixes that far.

Back branches are ready for release.

--
Bruce Momjian  bruce@momjian.us