Re: pgsql: Add tests for '-f' option in dropdb utility. - Mailing list pgsql-committers
From | Amit Kapila |
---|---|
Subject | Re: pgsql: Add tests for '-f' option in dropdb utility. |
Date | |
Msg-id | CAA4eK1+GNyjaPK77y+euh5eAgM75pncG1JYZhxYZF+SgS6NpjA@mail.gmail.com Whole thread Raw |
In response to | Re: pgsql: Add tests for '-f' option in dropdb utility. (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: pgsql: Add tests for '-f' option in dropdb utility.
Re: pgsql: Add tests for '-f' option in dropdb utility. |
List | pgsql-committers |
On Fri, Nov 29, 2019 at 7:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Nov 29, 2019 at 2:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > > * On all the failing machines, it's very clear from the postmaster > > log that the backend knows why it's being terminated: > > > > 2019-11-28 13:47:56.320 UTC [212024:4] 051_dropdb_force.pl FATAL: terminating connection due to administrator command > > > > Yeah, I can confirm this behavior on the Windows machines. I have > also seen that we already expose this behavior in more than one way > and all behave similar to this. If I use pg_terminate_backend(<pid>) > or pg_ctl kill TERM <pid>, the behavior is exactly the same > (terminated backend doesn't get the message (FATAL: terminating > connection due to administrator command), but it is present in > postmaster log. > > > So the question seems to be why libpq isn't reporting that > > message before it detects connection-closed. > > > > This triggered a vague memory, and after a bit of archives-digging, > > I found this thread from a few months back: > > > > https://www.postgresql.org/message-id/flat/CA%2BhUKGJowQypXSKsjws9A%2BnEQDD0-mExHZqFXtJ09N209rCO5A%40mail.gmail.com#0629f079bc59ecdaa0d6ac9f8f2c18ac > > > > in which it's alleged that Windows' TCP stack is flat-out > > broken and will drop not-yet-delivered data when the server > > closes the connection. > > > > If that's true, it's pretty nasty. Windows is about the > > last platform where I'd want us to have behavior like this, > > because we *will* get bug reports about it from novices. > > > > If there's no other workaround, I'm tempted to propose > > > > #ifdef WIN32 > > pg_sleep(1 second); > > #endif > > > > or something close to that, before we close the socket. > > > > I can experiment with this or if something else occurred to me. > The above experiment suggested by you works and the test passes after this in the local Windows setup. One more thing, I think we don't need to do above for graceful exits and during socket_close we have access to exit_status_code which can be used in this context. I have attached the patch for this. I think something on these lines is even suggested by MSDN [1], but I am not sure. I have also tried calling closesocket() explicitly in our function socket_close which has changed the error message to "could not receive data from server: Software caused connection abort (0x00002745/10053)". I am not sure if it is any better. Another thing I have tried to use is SO_LINGER option of socket but that didn't work out. > > Or we could revert the whole feature. > > > > Yeah, that is also one possibility, but I think given we already have > this behavior in existing features, it is better to either come up > with some solution or maybe mention in docs that in such cases users > need to check postmaster log to know the actual reason. > > I think we can further explore this, but for now, we might want to (a) > revert this test, or (b) change the expected output to match. > I think if we decide to go with the above solution for Windows, then we can keep the current test as it is or probably remove one test related to <pid> test, otherwise, it is better to go with (a) for now and once we decide any solution for this we can again commit these tests. [1] - https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-closesocket See below note in the above link: Using the closesocket or shutdown functions with SD_SEND or SD_BOTH results in a RELEASE signal being sent out on the control channel. Due to ATM's use of separate signal and data channels, it is possible that a RELEASE signal could reach the remote end before the last of the data reaches its destination, resulting in a loss of that data. One possible solutions is programming a sufficient delay between the last data sent and the closesocket or shutdown function calls for an ATM socket. Half close is not supported by ATM. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
pgsql-committers by date: