Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away |
Date | |
Msg-id | 4bd199f7-9cb8-e15e-cd5a-ec56397f6271@oss.nttdata.com Whole thread Raw |
In response to | Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
|
List | pgsql-hackers |
On 2020/09/21 12:44, Bharath Rupireddy wrote: > On Thu, Sep 17, 2020 at 10:20 PM Fujii Masao <masao.fujii@oss.nttdata.com <mailto:masao.fujii@oss.nttdata.com>> wrote: > > > > In 1st way, you may also need to call ReleaseExternalFD() when new connection fails > > to be made, as connect_pg_server() does. Also we need to check that > > non-superuser has used password to make new connection, > > as connect_pg_server() does? I'm concerned about the case where > > pg_hba.conf is changed accidentally so that no password is necessary > > at the remote server and the existing connection is terminated. In this case, > > if we connect to the local server as non-superuser, connection to > > the remote server should fail because the remote server doesn't > > require password. But with your patch, we can successfully reconnect > > to the remote server. > > > > Therefore I like 2nd option. Also maybe disconnect_ps_server() needs to > > be called before that. I'm not sure how much useful 1st option is. > > > > Thanks. Above points look sensible. +1 for the 2nd option i.e. instead of PQreset(entry->conn);, let's try to disconnect_pg_server()and connect_pg_server(). > > > > > What if 2nd attempt happens with have_prep_stmt=true? I'm not sure > > if this case really happens, though. But if that can, it's strange to start > > new connection with have_prep_stmt=true even when the caller of > > GetConnection() doesn't intend to create any prepared statements. > > > > I think it's safer to do 2nd attempt in the same way as 1st one. Maybe > > we can simplify the code by making them into common code block > > or function. > > > > I don't think the have_prep_stmt will be set by the time we make 2nd attempt because entry->have_prep_stmt |= will_prep_stmt;gets hit only after we are successful in either 1st attempt or 2nd attempt. I think we don't need to set alltransient state. No other transient state except changing_xact_state changes from 1st attempt to 2nd attempt(see below),so let's set only entry->changing_xact_state to false before 2nd attempt. > > 1st attempt: > (gdb) p *entry > $3 = {key = 16389, conn = 0x55a896199990, xact_depth = 0, have_prep_stmt = false, > have_error = false, changing_xact_state = false, invalidated = false, > server_hashvalue = 3905865521, mapping_hashvalue = 2617776010} > > 2nd attempt i.e. in retry block: > (gdb) p *entry > $4 = {key = 16389, conn = 0x55a896199990, xact_depth = 0, have_prep_stmt = false, > have_error = false, changing_xact_state = true, invalidated = false, > server_hashvalue = 3905865521, mapping_hashvalue = 2617776010} > > >> > > > If an error occurs in the first attempt, we return from > > > pgfdw_get_result()'s if (!PQconsumeInput(conn)) to the catch block we > > > added and pgfdw_report_error() will never get called. And the below > > > part of the code is reached only in scenarios as mentioned in the > > > comments. Removing this might have problems if we receive errors other > > > than CONNECTION_BAD or for subtxns. We could clear if any result and > > > just rethrow the error upstream. I think no problem having this code > > > here. > > > > But the orignal code works without this? > > Or you mean that the original code has the bug? > > > > There's no bug in the original code. Sorry, I was wrong in saying pgfdw_report_error() will never get called with thispatch. Indeed it gets called even when 1's attempt connection is failed. Since we added an extra try-catch block, wewill not be throwing the error to the user, instead we make a 2nd attempt from the catch block. > > I'm okay to remove below part of the code > > > >> + PGresult *res = NULL; > > >> + res = PQgetResult(entry->conn); > > >> + PQclear(res); > > >> Are these really necessary? I was just thinking that's not because > > >> pgfdw_get_result() and pgfdw_report_error() seem to do that > > >> already in do_sql_command(). > > Please let me know if okay with the above agreed points, I will work on the new patch. Yes, please work on the patch! Thanks! I may revisit the above points later, though ;) Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
pgsql-hackers by date: