RE: Libpq support to connect to standby server as priority - Mailing list pgsql-hackers
From | Smith, Peter |
---|---|
Subject | RE: Libpq support to connect to standby server as priority |
Date | |
Msg-id | f6daf26354054924998670cf03770352@G06AUEXCASMHB04.g06.fujitsu.local Whole thread Raw |
In response to | Re: Libpq support to connect to standby server as priority (Greg Nancarrow <gregn4422@gmail.com>) |
List | pgsql-hackers |
Hi Greg, I have spent some time reading this discussion thread, and doing a code review of the latest (v17-0001) patch. Below are my review comments; some are trivial, others not so much. ==== GENERAL COMMENT 1 ("any") "any" should be included as valid option for target_server_type. IIUC target_server_type was added mostly to have better alignment with JDBC options. Both Vladimir [1] and Dave [2] already said that JDBC does have an "any" option. [1] - https://www.postgresql.org/message-id/CAB%3DJe-FwOVE%3D8gR1UDDZRnWZR65fRG40e8zW_U_6mnUqbce68g%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CADK3HHJ9316ji7L-97cJBY%3Dwp4E3ddPMn8XdkNz6j8d9u0OhmQ%40mail.gmail.com Furthermore, the fe-connect.c function makeEmptyPGConn sets default: + conn->requested_server_type = SERVER_TYPE_ANY; This means the default type of target_server_type is "any". Since this is default, it should also be possible to assign the same value to explicitly. (Parts of the v17 patch affected by this are itemised below) ==== GENERAL COMMENT 2 (Removal of pg_is_in_recovery) Around 22/3/2019 Hari added a lot of pg_is_in_recovery code in his patch 0006 [1] [1] - https://www.postgresql.org/message-id/CAJrrPGd4YeA%2BN%3DxC%2B1XPVoGzMCATJZY4irVQEJ6i0aPqorUi7g%40mail.gmail.com Much later IIUC the latest v17 patch has taken onboard the recommendation from Alvaro and removed all that code: "I would discard the whole thing about checking "SELECT pg_is_in_recovery()"" [2] [2] - https://www.postgresql.org/message-id/20191227130828.GA21647%40alvherre.pgsql However, it seems that not ALL parts of the original code got cleanly removed in v17. There are a number of references to CONNECTION_CHECK_RECOVERY and pg_is_in_recovery still lurking. (Parts of the v17 patch affected by this are itemised below) ==== COMMENT libpq.sgml (para blocks) + <para> The v17 patch for target_session_attrs and target_server_type help is currently using <para> blocks for each of the possibleoption values. This format is inconsistent document style with other variables in this SGML. Other places are using sub-lists for option values. e.g. look at "six modes" of sslmode. ==== COMMENT libpq.sgml (cut/paste parameter description) I don't think that target_server_type help should be just a cut/paste duplicate of target_session_attrs. It is confusingbecause it leaves the reader doubting the purpose of having such a duplication. Suggest to simplify the target_server_type help like as follows: -- target_server_type The purpose of this parameter is to reflect the similar PGJDBC targetServerType. The supported options are same as target_session_attrs. This parameter overrides any connection type specified by target_session_attrs. -- ==== COMMENT libpq.sgml (pg_is_in_recovery) (As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery) + <para> + If this parameter is set to <literal>standby</literal>, only a connection in which + the server is in recovery mode is considered acceptable. If the server is prior to version 14, + the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful + connection; if it returns <literal>t</literal>, it means the server is in recovery mode. + </para> Suggest change to: -- If this parameter is set to <literal>standby</literal>, only a connection in which the server is in recovery mode is consideredacceptable. The recovery mode state is determined by the value of the in_recovery configuration parameter thatis reported by the server upon successful connection. Otherwise, if the server is prior to version 14, only a connectionin which read-write transactions are not accepted by default is considered acceptable. To determine whether theserver supports read-write transactions, the query SHOW transaction_read_only will be sent upon any successful connection;if it returns on, it means the server doesn't support read-write transactions. -- ==== COMMENT libpq.sgml (Oxford comma) + <varname>integer_datetimes</varname>, + <varname>standard_conforming_strings</varname> and + <varname>in_recovery</varname>. Previously there was an Oxford comma (e.g. before the "and"). Now there isn't. The v17 patch should not alter the previous listing style. ==== COMMENT protocol.sgml (Oxford comma) + <varname>integer_datetimes</varname>, + <varname>standard_conforming_strings</varname> and + <varname>in_recovery</varname>. Previously there was an Oxford comma (e.g. before the "and"). Now there isn't. The v17 patch should not alter the previous listing style. ==== QUESTION standby.c - SendRecoveryExitSignal +/* + * SendRecoveryExitSignal + * Signal backends that the server has exited recovery mode. + */ +void +SendRecoveryExitSignal(void) +{ + SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT); +} I wonder if this function is really necessary? IIUC the SendRecoveryExitSignal is only called from one place (xlog.c). Why not just call SendSignalToAllBackends directly from there and remove this extra layer? ==== COMMENT postgres.c (signal comment) + /* signal that work needs to be done */ + recoveryExitInterruptPending = true; Suggest change comment to say: /* flag that work needs to be done */ ==== COMMENT fe-connect.c (sizeof) - "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */ + "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */ According to the SGML "prefer-secondary" is also an acceptable value for target_session_attrs, so the display field widthshould be 17 /* sizeof("prefer-secondary") */ not 15. ==== COMMENT fe-connect.c (CONNECTION_CHECK_RECOVERY) @@ -2310,6 +2461,7 @@ PQconnectPoll(PGconn *conn) case CONNECTION_CHECK_WRITABLE: case CONNECTION_CONSUME: case CONNECTION_GSS_STARTUP: + case CONNECTION_CHECK_RECOVERY: break; (As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery) Probably this CONNECTION_CHECK_RECOVERY case should be removed. ==== COMMENT fe-connect.c - function validateAndRecordTargetServerType As noted in GENERAL COMMENT 1, I suggest "any" needs to be included in this function as a valid option. ==== COMMENT fe-connect.c (target_session_attrs validation) @@ -1396,8 +1425,9 @@ connectOptions2(PGconn *conn) */ if (conn->target_session_attrs) { - if (strcmp(conn->target_session_attrs, "any") != 0 - && strcmp(conn->target_session_attrs, "read-write") != 0) + if (strcmp(conn->target_session_attrs, "any") == 0) + conn->requested_server_type = SERVER_TYPE_ANY; + else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type)) I suggest introducing a 2nd function for target_session_attrs (e.g. validateAndRecordTargetSessionAttrs). Even though these parameters are functionally the same today, in future they may not be [1]. [1] - https://www.postgresql.org/message-id/20200109152539.GA29017%40alvherre.pgsql Regardless, the special "any" handling can be removed from here because (from GENERAL COMMENT 1) the validateAndRecordTargetServerTypeshould now accept "any". ==== COMMENT fe-connect.c (message typo) Found an existing typo, unrelated to the v17 patch. "target_settion_attrs", --> "target_session_attrs", ==== COMMENT fe-connect.c (libpq_gettext) + printfPQExpBuffer(&conn->errorMessage, + libpq_gettext("invalid target_server_type value: \"%s\"\n"), + conn->target_server_type); The parameter name "target_server_type" should be separated from the format string as "%s", the same as is done by the libpq_gettextof the preceding code. ==== COMMENT fe-connect.c (indentation) + goto error_return; + } } + else conn->whichhost++; Bad indentation of the else's statement. ==== COMMENT fe-connect.c (if/else complexity) + else if ((conn->in_recovery && + conn->requested_server_type == SERVER_TYPE_PRIMARY) || + (!conn->in_recovery && + (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY || + conn->requested_server_type == SERVER_TYPE_STANDBY))) + { TBH I was unable to read this code without first drawing up a matrix of combinations to deduce what was going on. It should not be so inscrutable. Suggestion1: Consider putting a large comment at the top of this CONNECTION_CHECK_TARGET to give the overview what this code is tryingto acheive. e.g. something like this: --- Mode |in_recovery |version < 7.4 |version < 14 |version >= 14 ---------------+------------+-------------------+---------------------------+------------- ANY |NA |OK |OK |OK PRIMARY |true |OK |SHOW transaction_read_only |keep_going PRIMARY |false |OK |SHOW transaction_read_only |OK PREFER_STANDBY |true |keep_going (or -2) |SHOW transaction_read_only |OK PREFER_STANDBY |false |keep_going (or -2) |SHOW transaction_read_only |keep_going (or -2) STANDBY |true |keep_going |SHOW transaction_read_only |OK STANDBY |false |keep_going |SHOW transaction_read_only |keep_going --- Suggestion2: Consider to separate out the requested_server_type cases instead of trying to hand everything in the same else block. Thecode may be a bit longer, but by aligning it more closely with the SGML documentation it can be made easier to understand. e.g. something like this: --- if (conn->requested_server_type == SERVER_TYPE_PRIMARY) { /* If not-in-recovery, reject, else OK. */ if (conn->in_recovery) { rejectCheckedRecoveryConnection(conn); goto keep_going; } goto consume_checked_target_connection; } if (conn->requested_server_type == SERVER_TYPE_STANDBY) { /* Only a connection in recovery mode is acceptable. */ if (!conn->in_recovery) { rejectCheckedRecoveryConnection(conn); goto keep_going; } goto consume_checked_target_connection; } if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY) { /* A connection in recovery mode is preferred. */ if (conn->in_recovery) goto consume_checked_target_connection; /* * The following scenario is possible only for the * prefer-standby type for the next pass of the list * of connections, as it couldn't find any servers that * are in recovery. */ if (conn->which_rw_host == -2) goto consume_checked_target_connection; /* reject function below remembers this r/w host index in case it is needed later */ rejectCheckedRecoveryConnection(conn); goto keep_going; } --- ==== COMMENT fe-connect.c (case CONNECTION_CHECK_RECOVERY) (As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery) v17 patch has removed the previous call to pg_is_in_recovery. IIUC this means that there is currently no way for the remaining CONNECTION_CHECK_RECOVERY case to even be executed. If I am correct, then a significant slab of code (~100 lines) can be deleted. See case CONNECTION_CHECK_RECOVERY (lines ~ 4007 thru 4110) ==== COMMENT fe-connect.c - function freePGConn (missing free?) There is code to free(conn->target_session_attrs), but there is no code to free target_server_type. Appears to be accidental omission. ==== COMMENT fe-exec.c (altered comment) - * Special hacks: remember client_encoding and + * Special hacks: remember client_encoding, and A comma was added. Suggest avoid altering comments not directly related to the v17 patch logic. ==== COMMENT libpq-fe.h (CONNECTION_CHECK_RECOVERY) + CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */ + CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */ (As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery) Probably this CONNECTION_CHECK_RECOVERY case should be removed. ==== Kind Regards, Peter Smith --- Fujitsu Australia
pgsql-hackers by date: