Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect - Mailing list pgsql-committers
From | Fujii Masao |
---|---|
Subject | Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect |
Date | |
Msg-id | eeec0120-9ca1-bd20-a00b-5fc5f2c862a1@oss.nttdata.com |
In response to | Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Responses | Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect |
List | pgsql-committers |
On 2020/10/08 0:48, Fujii Masao wrote:
>
>
> On 2020/10/07 22:25, Fujii Masao wrote:
>>
>>
>> On 2020/10/07 12:54, Fujii Masao wrote:
>>>
>>>
>>> On 2020/10/07 11:13, Michael Paquier wrote:
>>>> Hi Fujii-san,
>>>>
>>>> On Tue, Oct 06, 2020 at 01:52:55AM +0000, Fujii Masao wrote:
>>>>> postgres_fdw: reestablish new connection if cached one is detected as broken.
>>>>>
>>>>> In postgres_fdw, once remote connections are established, they are cached
>>>>> and re-used for subsequent queries and transactions. There can be some
>>>>> cases where those cached connections are unavailable, for example,
>>>>> by the restart of remote server. In these cases, previously an error was
>>>>> reported and the query accessing to remote server failed if new remote
>>>>> transaction failed to start because the cached connection was broken.
>>>>>
>>>>> This commit improves postgres_fdw so that new connection is remade
>>>>> if broken connection is detected when starting new remote transaction.
>>>>> This is useful to avoid unnecessary failure of queries when connection is
>>>>> broken but can be reestablished.
>>>>
>>>> lorikeet is telling that the test introduced by this commit is
>>>> unstable:
>>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2020-10-06%2008%3A28%3A36
>>>
>>> Thanks for letting me know this!
>>>
>>>>
>>>> Some details:
>>>> BEGIN;
>>>> SELECT 1 FROM ft1 LIMIT 1;
>>>> - ?column?
>>>> -----------
>>>> - 1
>>>> -(1 row)
>>>> -
>>>> +ERROR: could not receive data from server: Software caused connection abort
>>>> +CONTEXT: remote SQL command: START TRANSACTION ISOLATION LEVEL REPEATABLE READ
>>>
>>> This error means that a new connection was successfully reestablished
>>> after the cached connection was terminated, and then the above connection
>>> error occurred when issuing the "START TRANSACTION" command on that
>>> new connection. There seem to be no suspicious relevant log messages in the
>>> logfile, so I'm not yet sure why this error happened.
>>>
>>> Per the previous discussion at [1], lorikeet sometimes seems to cause
>>> connection-related failures in the regression test. So the cause of the error
>>> that we faced today may also be lorikeet itself.
>>
>> Since it's not good to keep the buildfarm member red, I will revert
>> the commit unless I come up with something after further
>> investigation.
>>
>> My current guess is just that PQstatus(conn) doesn't indicate
>> CONNECTION_BAD when the above error occurs, which
>> prevents a new connection from being reestablished because of
>> the following check.
>>
>> +     if (PQstatus(entry->conn) != CONNECTION_BAD ||
>> +         entry->xact_depth > 0 ||
>> +         retry_conn)
>> +         PG_RE_THROW();
>
> The error message under discussion is reported when recv() fails with
> errno=ECONNABORTED. As far as I read the code, pqReadData() marks
> the connection as CONNECTION_BAD when errno=ECONNRESET,
> but not when errno=ECONNABORTED. So since PQstatus(entry->conn)
> doesn't indicate CONNECTION_BAD in the ECONNABORTED case,
> the above check evaluates to true, the error is re-thrown, and
> a new connection is not reestablished.
>
> Therefore, the easy fix is to make libpq mark the connection as
> CONNECTION_BAD even in the ECONNABORTED case, like we do for ECONNRESET.

Patch attached.

This patch also changes errcode_for_socket_access() so that it uses
ERRCODE_CONNECTION_FAILURE rather than ERRCODE_INTERNAL_ERROR as the
SQL error code in the ECONNABORTED case, like ECONNRESET. Is this sane?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
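For illustration only, the interaction described above between the errno seen by recv() and the retry check can be modeled with a small standalone sketch. This is not the attached patch and not libpq or postgres_fdw code: every name prefixed with `sketch_` or `Sketch` is a hypothetical stand-in, and the `treat_aborted_as_bad` flag exists only to contrast the behavior as described before the fix with the proposed behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for libpq's connection status; only two states are modeled. */
typedef enum { SKETCH_CONNECTION_OK, SKETCH_CONNECTION_BAD } SketchConnStatus;

/* Hypothetical stand-in for a cached postgres_fdw connection entry. */
typedef struct {
    SketchConnStatus status;
    int xact_depth;
} SketchConnEntry;

/*
 * Model of what happens after recv() fails. ECONNRESET always marks the
 * connection bad. ECONNABORTED does so only when treat_aborted_as_bad is
 * true, which models the proposed change (assumption: simplified model).
 */
void sketch_note_recv_failure(SketchConnEntry *entry, int recv_errno,
                              bool treat_aborted_as_bad)
{
    assert(entry != NULL);
    if (recv_errno == ECONNRESET ||
        (treat_aborted_as_bad && recv_errno == ECONNABORTED))
        entry->status = SKETCH_CONNECTION_BAD;
}

/*
 * Mirrors the quoted condition: the error is re-thrown unless the connection
 * is marked bad, no remote transaction is open, and no retry was made yet.
 */
bool sketch_must_rethrow(const SketchConnEntry *entry, bool retry_conn)
{
    assert(entry != NULL);
    return entry->status != SKETCH_CONNECTION_BAD ||
           entry->xact_depth > 0 ||
           retry_conn;
}
```

In this model, an ECONNABORTED failure with `treat_aborted_as_bad` set to false leaves the status unchanged, so `sketch_must_rethrow()` returns true and no reconnection is attempted. With the flag set to true, the same failure marks the entry bad, and the first attempt outside a transaction is allowed to retry.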
Attachment