Thread: Improve the connection failure error messages
Hi Hackers, Various sections of the code utilize the walrcv_connect() function, employed by various processes such as walreceiver, logical replication apply worker, etc., to establish connections with other hosts. Presently, in case of connection failures, the error message lacks information about the specific process attempting to connect and encountering the failure. The provided patch enhances error messages for such connection failures, offering more details on the processes that failed to establish a connection. -- Thanks, Nisha
Attachment
Thanks for the patch! Here are a couple of review comments for it. ====== src/backend/commands/subscriptioncmds.c 1. @@ -742,7 +742,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt, if (!wrconn) ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), - errmsg("could not connect to the publisher: %s", err))); + errmsg("\"%s\" could not connect to the publisher: %s", stmt->subname, err))); In practice, these commands give errors like: test_sub=# create subscription sub1 connection 'dbname=bogus' publication pub1; ERROR: could not connect to the publisher: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: database "bogus" does not exist and logs like: 2024-01-12 12:45:05.177 AEDT [13108] ERROR: could not connect to the publisher: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: database "bogus" does not exist 2024-01-12 12:45:05.177 AEDT [13108] STATEMENT: create subscription sub1 connection 'dbname=bogus' publication pub1; Since the subscription name is already obvious from the statement that caused the error I'm not sure it benefits much to add this to the error message (but maybe it is useful if the error message was somehow read in isolation from the statement). Anyway, I felt at least it should include the word "subscription" for better consistency with the other messages in this patch: SUGGESTION subscription \"%s\" could not connect to the publisher: %s ====== 2. + appname = cluster_name[0] ? cluster_name : "walreceiver"; + /* Establish the connection to the primary for XLOG streaming */ - wrconn = walrcv_connect(conninfo, false, false, - cluster_name[0] ? cluster_name : "walreceiver", - &err); + wrconn = walrcv_connect(conninfo, false, false, appname, &err); if (!wrconn) ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), - errmsg("could not connect to the primary server: %s", err))); + errmsg("%s could not connect to the primary server: %s", appname, err))); I think your new %s should be quoted according to the guidelines at [1]. ====== src/test/regress/expected/subscription.out 3. Apparently, there is no existing regression test case for the ALTER "could not connect" message because if there was, it would have failed. Maybe a test should be added? ====== [1] https://www.postgresql.org/docs/current/error-style-guide.html Kind Regards, Peter Smith. Fujitsu Australia
Thanks for the review. Attached v2 patch with suggested changes. Please find my response inline. On Fri, Jan 12, 2024 at 8:20 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Thanks for the patch! Here are a couple of review comments for it. > > ====== > src/backend/commands/subscriptioncmds.c > > 1. > @@ -742,7 +742,7 @@ CreateSubscription(ParseState *pstate, > CreateSubscriptionStmt *stmt, > if (!wrconn) > ereport(ERROR, > (errcode(ERRCODE_CONNECTION_FAILURE), > - errmsg("could not connect to the publisher: %s", err))); > + errmsg("\"%s\" could not connect to the publisher: %s", stmt->subname, err))); > > In practice, these commands give errors like: > > test_sub=# create subscription sub1 connection 'dbname=bogus' publication pub1; > ERROR: could not connect to the publisher: connection to server on > socket "/tmp/.s.PGSQL.5432" failed: FATAL: database "bogus" does not > exist > > and logs like: > > 2024-01-12 12:45:05.177 AEDT [13108] ERROR: could not connect to the > publisher: connection to server on socket "/tmp/.s.PGSQL.5432" failed: > FATAL: database "bogus" does not exist > 2024-01-12 12:45:05.177 AEDT [13108] STATEMENT: create subscription > sub1 connection 'dbname=bogus' publication pub1; > > Since the subscription name is already obvious from the statement that > caused the error I'm not sure it benefits much to add this to the > error message (but maybe it is useful if the error message was somehow > read in isolation from the statement). > > Anyway, I felt at least it should include the word "subscription" for > better consistency with the other messages in this patch: > > SUGGESTION > subscription \"%s\" could not connect to the publisher: %s Done. > ====== > > 2. > + appname = cluster_name[0] ? cluster_name : "walreceiver"; > + > /* Establish the connection to the primary for XLOG streaming */ > - wrconn = walrcv_connect(conninfo, false, false, > - cluster_name[0] ? cluster_name : "walreceiver", > - &err); > + wrconn = walrcv_connect(conninfo, false, false, appname, &err); > if (!wrconn) > ereport(ERROR, > (errcode(ERRCODE_CONNECTION_FAILURE), > - errmsg("could not connect to the primary server: %s", err))); > + errmsg("%s could not connect to the primary server: %s", appname, err))); > > I think your new %s should be quoted according to the guidelines at [1]. Done. > ====== > src/test/regress/expected/subscription.out > > 3. > Apparently, there is no existing regression test case for the ALTER > "could not connect" message because if there was, it would have > failed. Maybe a test should be added? > The ALTER SUBSCRIPTION command does not error out on the user interface if updated with a bad connection string and the connection failure error can only be seen in the respective log file. Due to this behavior, it is not possible to add a test to show the error message as it is done for CREATE SUBSCRIPTION. Let me know if you think there is another way to add this test. -- Thanks, Nisha
Attachment
Hi, Thanks for the patch. > Due to this behavior, it is not possible to add a test to show the > error message as it is done for CREATE SUBSCRIPTION. > Let me know if you think there is another way to add this test. I believe it can be done with TAP tests, see for instance: contrib/auto_explain/t/001_auto_explain.pl However I wouldn't insist on including the test in scope of this particular patch. Such a test doesn't currently exist, it can be added as a separate patch, and whether this is actually a useful test (all the tests consume time after all...) is somewhat debatable. Personally I agree that it would be nice to have though. This being said... > The ALTER SUBSCRIPTION command does not error out on the user > interface if updated with a bad connection string and the connection > failure error can only be seen in the respective log file. I wonder if we should fix this. Again, not necessarily in scope of this work, but if this is not a difficult task, again, it would be nice to have. Other than that v2 looks good. -- Best regards, Aleksander Alekseev
On Sat, Jan 13, 2024 at 12:36 AM Aleksander Alekseev <aleksander@timescale.com> wrote: > > Hi, > > Thanks for the patch. > > > Due to this behavior, it is not possible to add a test to show the > > error message as it is done for CREATE SUBSCRIPTION. > > Let me know if you think there is another way to add this test. > > I believe it can be done with TAP tests, see for instance: > > contrib/auto_explain/t/001_auto_explain.pl > > However I wouldn't insist on including the test in scope of this > particular patch. Such a test doesn't currently exist, it can be added > as a separate patch, and whether this is actually a useful test (all > the tests consume time after all...) is somewhat debatable. Personally > I agree that it would be nice to have though. > > This being said... > > > The ALTER SUBSCRIPTION command does not error out on the user > > interface if updated with a bad connection string and the connection > > failure error can only be seen in the respective log file. > > I wonder if we should fix this. Again, not necessarily in scope of > this work, but if this is not a difficult task, again, it would be > nice to have. > > Other than that v2 looks good. > OK. I see now that any ALTER of the subscription's connection, even to some value that fails, will restart a new worker (like ALTER of any other subscription parameters). For a bad connection, it will continue to relaunch-worker/ERROR over and over. test_sub=# \r2024-01-17 09:34:28.665 AEDT [11274] LOG: logical replication apply worker for subscription "sub4" has started 2024-01-17 09:34:28.666 AEDT [11274] ERROR: could not connect to the publisher: invalid port number: "-1" 2024-01-17 09:34:28.667 AEDT [928] LOG: background worker "logical replication apply worker" (PID 11274) exited with exit code 1 dRs su2024-01-17 09:34:33.669 AEDT [11391] LOG: logical replication apply worker for subscription "sub4" has started 2024-01-17 09:34:33.669 AEDT [11391] ERROR: could not connect to the publisher: invalid port number: "-1" 2024-01-17 09:34:33.670 AEDT [928] LOG: background worker "logical replication apply worker" (PID 11391) exited with exit code 1 b4 ... I don't really have any opinion if that behaviour is good or bad, but anyway, it is deliberate, and IMO it is outside the scope of your patch, so v2 patch LGTM. ~~ BTW, while experimenting with the bad connection ALTER I also tried setting 'disable_on_error' like below: ALTER SUBSCRIPTION sub4 SET (disable_on_error); ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1'; ...but here the subscription did not become DISABLED as I expected it would do on the next connection error iteration. It remains enabled and just continues to loop relaunch/ERROR indefinitely same as before. That looks like it may be a bug. Thoughts? ====== Kind Regards, Peter Smith. Fujitsu Australia
Thanks for reviewing, please find my response inline. On Wed, Jan 17, 2024 at 4:56 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Sat, Jan 13, 2024 at 12:36 AM Aleksander Alekseev > <aleksander@timescale.com> wrote: > > > > Hi, > > > > Thanks for the patch. > > > > > Due to this behavior, it is not possible to add a test to show the > > > error message as it is done for CREATE SUBSCRIPTION. > > > Let me know if you think there is another way to add this test. > > > > I believe it can be done with TAP tests, see for instance: > > > > contrib/auto_explain/t/001_auto_explain.pl > > > > However I wouldn't insist on including the test in scope of this > > particular patch. Such a test doesn't currently exist, it can be added > > as a separate patch, and whether this is actually a useful test (all > > the tests consume time after all...) is somewhat debatable. Personally > > I agree that it would be nice to have though. > > > > This being said... > > > > > The ALTER SUBSCRIPTION command does not error out on the user > > > interface if updated with a bad connection string and the connection > > > failure error can only be seen in the respective log file. > > > > I wonder if we should fix this. Again, not necessarily in scope of > > this work, but if this is not a difficult task, again, it would be > > nice to have. > > > > Other than that v2 looks good. > > > > OK. I see now that any ALTER of the subscription's connection, even > to some value that fails, will restart a new worker (like ALTER of any > other subscription parameters). For a bad connection, it will continue > to relaunch-worker/ERROR over and over. > > test_sub=# \r2024-01-17 09:34:28.665 AEDT [11274] LOG: logical > replication apply worker for subscription "sub4" has started > 2024-01-17 09:34:28.666 AEDT [11274] ERROR: could not connect to the > publisher: invalid port number: "-1" > 2024-01-17 09:34:28.667 AEDT [928] LOG: background worker "logical > replication apply worker" (PID 11274) exited with exit code 1 > dRs su2024-01-17 09:34:33.669 AEDT [11391] LOG: logical replication > apply worker for subscription "sub4" has started > 2024-01-17 09:34:33.669 AEDT [11391] ERROR: could not connect to the > publisher: invalid port number: "-1" > 2024-01-17 09:34:33.670 AEDT [928] LOG: background worker "logical > replication apply worker" (PID 11391) exited with exit code 1 > b4 > ... > > I don't really have any opinion if that behaviour is good or bad, but > anyway, it is deliberate, and IMO it is outside the scope of your > patch, so v2 patch LGTM. Upon code review, the ALTER SUBSCRIPTION updates the connection string after checking for parse and a few obvious errors and does not attempt to establish a connection. It is the apply worker running for the respective subscription that will try to connect and fail in case of a bad connection string. To me, it seems an intentional design behavior and I agree that deciding to change or maintain this behavior is out of this patch's scope. -- Thanks, Nisha
> > ~~ > > BTW, while experimenting with the bad connection ALTER I also tried > setting 'disable_on_error' like below: > > ALTER SUBSCRIPTION sub4 SET (disable_on_error); > ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1'; > > ...but here the subscription did not become DISABLED as I expected it > would do on the next connection error iteration. It remains enabled > and just continues to loop relaunch/ERROR indefinitely same as before. > > That looks like it may be a bug. Thoughts? > Ideally, if the already running apply worker in "LogicalRepApplyLoop()" has any exception/error it will be handled and the subscription will be disabled if 'disable_on_error' is set - start_apply(XLogRecPtr origin_startpos) { PG_TRY(); { LogicalRepApplyLoop(origin_startpos); } PG_CATCH(); { if (MySubscription->disableonerr) DisableSubscriptionAndExit(); ... What is happening in this case is that the control reaches the function - run_apply_worker() -> start_apply() -> LogicalRepApplyLoop -> maybe_reread_subscription() ... /* * Exit if any parameter that affects the remote connection was changed. * The launcher will start a new worker but note that the parallel apply * worker won't restart if the streaming option's value is changed from * 'parallel' to any other value or the server decides not to stream the * in-progress transaction. */ if (strcmp(newsub->conninfo, MySubscription->conninfo) != 0 || ... and it sees a change in the parameter and calls apply_worker_exit(). This will exit the current process, without throwing an exception to the caller and the postmaster will try to restart the apply worker. The new apply worker, before reaching the start_apply() [where we handle exception], will hit the code to establish the connection to the publisher - ApplyWorkerMain() -> run_apply_worker() - ... LogRepWorkerWalRcvConn = walrcv_connect(MySubscription->conninfo, true /* replication */ , true, must_use_password, MySubscription->name, &err); if (LogRepWorkerWalRcvConn == NULL) ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), errmsg("could not connect to the publisher: %s", err))); ... and due to the bad connection string in the subscription, it will error out. [28680] ERROR: could not connect to the publisher: invalid port number: "-1" [3196] LOG: background worker "logical replication apply worker" (PID 28680) exited with exit code 1 Now, the postmaster keeps trying to restart the apply worker and it will keep failing until the connection string is corrected or the subscription is disabled manually. I think this is a bug that needs to be handled in run_apply_worker() when disable_on_error is set. IMO, this bug-fix discussion deserves a separate thread. Thoughts?
On Wed, Jan 17, 2024 at 7:15 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > ~~ > > > > BTW, while experimenting with the bad connection ALTER I also tried > > setting 'disable_on_error' like below: > > > > ALTER SUBSCRIPTION sub4 SET (disable_on_error); > > ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1'; > > > > ...but here the subscription did not become DISABLED as I expected it > > would do on the next connection error iteration. It remains enabled > > and just continues to loop relaunch/ERROR indefinitely same as before. > > > > That looks like it may be a bug. Thoughts? > > > Ideally, if the already running apply worker in > "LogicalRepApplyLoop()" has any exception/error it will be handled and > the subscription will be disabled if 'disable_on_error' is set - > > start_apply(XLogRecPtr origin_startpos) > { > PG_TRY(); > { > LogicalRepApplyLoop(origin_startpos); > } > PG_CATCH(); > { > if (MySubscription->disableonerr) > DisableSubscriptionAndExit(); > ... > > What is happening in this case is that the control reaches the function - > run_apply_worker() -> start_apply() -> LogicalRepApplyLoop -> > maybe_reread_subscription() > ... > /* > * Exit if any parameter that affects the remote connection was changed. > * The launcher will start a new worker but note that the parallel apply > * worker won't restart if the streaming option's value is changed from > * 'parallel' to any other value or the server decides not to stream the > * in-progress transaction. > */ > if (strcmp(newsub->conninfo, MySubscription->conninfo) != 0 || > ... > > and it sees a change in the parameter and calls apply_worker_exit(). > This will exit the current process, without throwing an exception to > the caller and the postmaster will try to restart the apply worker. > The new apply worker, before reaching the start_apply() [where we > handle exception], will hit the code to establish the connection to > the publisher - > > ApplyWorkerMain() -> run_apply_worker() - > ... > LogRepWorkerWalRcvConn = walrcv_connect(MySubscription->conninfo, > true /* replication */ , > true, > must_use_password, > MySubscription->name, &err); > > if (LogRepWorkerWalRcvConn == NULL) > ereport(ERROR, > (errcode(ERRCODE_CONNECTION_FAILURE), > errmsg("could not connect to the publisher: %s", err))); > ... > and due to the bad connection string in the subscription, it will error out. > [28680] ERROR: could not connect to the publisher: invalid port number: "-1" > [3196] LOG: background worker "logical replication apply worker" (PID > 28680) exited with exit code 1 > > Now, the postmaster keeps trying to restart the apply worker and it > will keep failing until the connection string is corrected or the > subscription is disabled manually. > > I think this is a bug that needs to be handled in run_apply_worker() > when disable_on_error is set. > IMO, this bug-fix discussion deserves a separate thread. Thoughts? Hi Nisha, Thanks for your analysis -- it is the same as my understanding. As suggested, I have created a new thread for any further discussion related to this 'disable_on_error' topic [1]. ====== [1] https://www.postgresql.org/message-id/flat/CAHut%2BPuEsekA3e7ThwzWr%2BUs4x%3DLzkF7DSrED1UsZTUqNrhCUQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Fri, Jan 12, 2024 at 7:06 PM Aleksander Alekseev <aleksander@timescale.com> wrote: > > Hi, > > Thanks for the patch. > > > Due to this behavior, it is not possible to add a test to show the > > error message as it is done for CREATE SUBSCRIPTION. > > Let me know if you think there is another way to add this test. > > I believe it can be done with TAP tests, see for instance: > > contrib/auto_explain/t/001_auto_explain.pl > > However I wouldn't insist on including the test in scope of this > particular patch. Such a test doesn't currently exist, it can be added > as a separate patch, and whether this is actually a useful test (all > the tests consume time after all...) is somewhat debatable. Personally > I agree that it would be nice to have though. Thank you for providing this information. Yes, I can write a TAP test to check the log for the same error message. I'll attempt it and perform a time analysis. I'm unsure where to appropriately add this test. Any suggestions? Following your suggestion, I won't include the test in the scope of this patch. Instead, I'll start a new thread once I have sufficient information required. -- Thanks, Nisha
AFAIK some recent commits patches (e,g [1] for the "slot sync" development) have created some more cases of "could not connect..." messages. So, you might need to enhance your patch to deal with any new ones in the latest HEAD. ====== [1] https://github.com/postgres/postgres/commit/776621a5e4796fa214b6b29a7ca134f6c138572a Kind Regards, Peter Smith. Fujitsu Australia
AFAIK some recent commits patches (e,g [1] for the "slot sync"
development) have created some more cases of "could not connect..."
messages. So, you might need to enhance your patch to deal with any
new ones in the latest HEAD.
======
[1] https://github.com/postgres/postgres/commit/776621a5e4796fa214b6b29a7ca134f6c138572a
Thank you for the update.
The v3 patch has the changes needed as per the latest HEAD.
--
Thanks,
Nisha
Attachment
On Wed, Jan 31, 2024 at 9:58 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > >> AFAIK some recent commits patches (e,g [1] for the "slot sync" >> development) have created some more cases of "could not connect..." >> messages. So, you might need to enhance your patch to deal with any >> new ones in the latest HEAD. >> >> ====== >> [1] https://github.com/postgres/postgres/commit/776621a5e4796fa214b6b29a7ca134f6c138572a >> > Thank you for the update. > The v3 patch has the changes needed as per the latest HEAD. > Hi, just going by visual inspection of the v2/v3 patch diffs, the latest v3 LGTM. ====== Kind Regards, Peter Smith. Fujitsu Australia
FYI -- some more code has been pushed since this patch was last updated. AFAICT perhaps you'll want to update this patch again for the following new connection messages on HEAD: - slotfuncs.c [1] - slotsync.c [2] ---------- [1] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/slotfuncs.c#L989 [2] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/logical/slotsync.c#L1258 Kind Regards, Peter Smith. Fujitsu Australia
On Wed, Mar 13, 2024 at 11:16 AM Peter Smith <smithpb2250@gmail.com> wrote: > > FYI -- some more code has been pushed since this patch was last > updated. AFAICT perhaps you'll want to update this patch again for the > following new connection messages on HEAD: > > - slotfuncs.c [1] > - slotsync.c [2] > > ---------- > [1] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/slotfuncs.c#L989 > [2] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/logical/slotsync.c#L1258 > Thanks for the update. Here is the v4 patch with changes required in slotfuncs.c and slotsync.c files. -- Thanks, Nisha
Attachment
Hi, just by visual inspection of the v3/v4 patch diffs, the latest v4 LGTM. ====== Kind Regards, Peter Smith. Fujitsu Australia
> On 22 Mar 2024, at 11:42, Nisha Moond <nisha.moond412@gmail.com> wrote: > Here is the v4 patch with changes required in slotfuncs.c and slotsync.c files. - errmsg("could not connect to the primary server: %s", err)); + errmsg("\"%s\" could not connect to the primary server: %s", app_name.data, err)); Messages like this should perhaps have translator comments to indicate what the leading "%s" will contain? -- Daniel Gustafsson
On Fri, Apr 26, 2024 at 1:10 PM Daniel Gustafsson <daniel@yesql.se> wrote: > > > On 22 Mar 2024, at 11:42, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > Here is the v4 patch with changes required in slotfuncs.c and slotsync.c files. > > - errmsg("could not connect to the primary server: %s", err)); > + errmsg("\"%s\" could not connect to the primary server: %s", app_name.data, err)); > > Messages like this should perhaps have translator comments to indicate what the > leading "%s" will contain? Attached v5 patch with the translator comments as suggested. -- Thanks, Nisha
Attachment
Hi, just by visual inspection of the v4/v5 patch diffs, the latest v5 LGTM. ====== Kind Regards, Peter Smith. Fujitsu Australia
FYI - I created a CF entry [1] for this because AFAIK the patch is just waiting for a committer to check if it is OK to be pushed, but maybe nobody has noticed it. ====== [1] https://commitfest.postgresql.org/48/5075/ Kind Regards, Peter Smith. Fujitsu Australia
> On 26 Jun 2024, at 05:46, Peter Smith <smithpb2250@gmail.com> wrote: > > FYI - I created a CF entry [1] for this because AFAIK the patch is > just waiting for a committer to check if it is OK to be pushed, but > maybe nobody has noticed it. Thanks, always good to have a CF entry to track it with. It’s not forgotten though, just waiting for the tree to be branched.Right now focus is on stabilizing what will become v17. ./daniel
Nisha Moond <nisha.moond412@gmail.com> writes: > Attached v5 patch with the translator comments as suggested. I looked at this, and I agree with the goal, but I find just about all of the translator comments unnecessary. The ones that are useful are useful only because the message is violating one of our message style guidelines [1]: When citing the name of an object, state what kind of object it is. Rationale: Otherwise no one will know what “foo.bar.baz” refers to. So, for example, where you have + + /* + * translator: first %s is the subscription name, second %s is the + * error + */ + errmsg("subscription \"%s\" could not connect to the publisher: %s", stmt->subname, err))); I don't find that that translator comment is adding anything. But there are a couple of places like + /* + * translator: first %s is the slotsync worker name, second %s is the + * error + */ + errmsg("\"%s\" could not connect to the primary server: %s", app_name.data, err)); I think that the right cure for the ambiguity here is not to add a translator comment, but to label the name properly, perhaps like errmsg("synchronization worker \"%s\" could not connect to the primary server: %s", app_name.data, err)); regards, tom lane [1] https://www.postgresql.org/docs/devel/error-style-guide.html#ERROR-STYLE-GUIDE-OBJECT-TYPE
On Tue, Jul 9, 2024 at 1:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Nisha Moond <nisha.moond412@gmail.com> writes: > > Attached v5 patch with the translator comments as suggested. > > I looked at this, and I agree with the goal, but I find just about all > of the translator comments unnecessary. The ones that are useful are > useful only because the message is violating one of our message style > guidelines [1]: > > When citing the name of an object, state what kind of object it is. > > Rationale: Otherwise no one will know what “foo.bar.baz” refers to. > > So, for example, where you have > > + > + /* > + * translator: first %s is the subscription name, second %s is the > + * error > + */ > + errmsg("subscription \"%s\" could not connect to the publisher: %s", stmt->subname, err))); > > I don't find that that translator comment is adding anything. > But there are a couple of places like > > + /* > + * translator: first %s is the slotsync worker name, second %s is the > + * error > + */ > + errmsg("\"%s\" could not connect to the primary server: %s", app_name.data, err)); > > I think that the right cure for the ambiguity here is not to add a > translator comment, but to label the name properly, perhaps like > > errmsg("synchronization worker \"%s\" could not connect to the primary server: %s", > app_name.data, err)); > > > regards, tom lane > > [1] https://www.postgresql.org/docs/devel/error-style-guide.html#ERROR-STYLE-GUIDE-OBJECT-TYPE Thank you for the review. Attached the patch v6 with suggested improvements. - Removed unnecessary translator comments. - Added appropriate identifier names where missing. -- Thanks, Nisha
Attachment
Nisha Moond <nisha.moond412@gmail.com> writes: > Attached the patch v6 with suggested improvements. > - Removed unnecessary translator comments. > - Added appropriate identifier names where missing. I think this is generally OK, although one could argue that it's violating our message style guideline that primary error messages should be short [1]. The text itself isn't that bad, but once you tack on a libpq connection failure message it's hard to claim that the result "fits on one line". Another way we could address this that'd reduce that problem is to leave the primary messages as-is and add an errdetail or errcontext message to show what's throwing the error. However, I'm not convinced that's better. The main argument against it is that detail/context lines can get lost, eg if you're running the server in terse logging mode. On balance I think it's OK, so I pushed it. I did take out a couple of uses of "logical replication" that seemed unnecessary and were making the length problem worse. regards, tom lane [1] https://www.postgresql.org/docs/devel/error-style-guide.html#ERROR-STYLE-GUIDE-WHAT-GOES-WHERE