Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors - Mailing list pgsql-hackers
From:           Marina Polyakova
Subject:        Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Msg-id:         a6299efa310e1d106144f761259f4fd6@postgrespro.ru
In response to: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses:      Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
List:           pgsql-hackers
On 09-07-2018 16:05, Fabien COELHO wrote:
> Hello Marina,

Hello, Fabien!

> Here is a review for the last part of your v9 version.

Thank you very much for this!

> Patch does not "git apply" (maybe not anymore):
>   error: patch failed: doc/src/sgml/ref/pgbench.sgml:513
>   error: doc/src/sgml/ref/pgbench.sgml: patch does not apply

Sorry, I'll send a new version soon.

> However I could get it to apply with the "patch" command.
>
> Then patch compiles, global & pgbench "make check" are ok.

:-)

> Feature
> =======
>
> The patch adds the ability to restart transactions (i.e. the full
> script) on some errors, which is a good thing as it allows to
> exercise postgres performance in more realistic scenarios.
>
> * -d/--debug: I'm not in favor of requiring a mandatory text argument
> on this option. It is not practical, the user has to remember it, and
> it is a change. I'm sceptical of the overall debug handling changes.
> Maybe we could have multiple -d which lead to higher debug levels,
> but I'm not sure that it can be made to work for this case and still
> be compatible with the previous behavior. Maybe you need a specific
> option for your purpose, eg "--debug-retry"?

As you wrote in [1], adding an additional option is also a bad idea:

> I'm sceptical of the "--debug-fails" options. ISTM that --debug is
> already there and should just be reused.

Maybe it's better to use an optional argument for compatibility
(--debug[=fails] or --debug[=NUM])? But if we use the numbers, I can
currently see only 2 levels, and there's no guarantee that they will
not change..
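Something like this minimal, self-contained sketch, using getopt_long()
which pgbench already uses (the level names DEBUG_ALL/DEBUG_FAILS are
made up for illustration):

    #include <getopt.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical debug levels, not the patch's actual names. */
    enum { DEBUG_NONE, DEBUG_FAILS, DEBUG_ALL } debug_level = DEBUG_NONE;

    static const struct option long_options[] = {
        {"debug", optional_argument, NULL, 'd'},
        {NULL, 0, NULL, 0}
    };

    int
    main(int argc, char **argv)
    {
        int c;

        while ((c = getopt_long(argc, argv, "d", long_options, NULL)) != -1)
        {
            if (c == 'd')
            {
                /* With optional_argument the value must be attached, i.e.
                 * "--debug=fails"; a bare "--debug" leaves optarg NULL,
                 * which preserves the old behavior. */
                if (optarg == NULL)
                    debug_level = DEBUG_ALL;
                else if (strcmp(optarg, "fails") == 0)
                    debug_level = DEBUG_FAILS;
                else
                {
                    fprintf(stderr, "invalid --debug argument: \"%s\"\n",
                            optarg);
                    exit(1);
                }
            }
        }
        printf("debug level: %d\n", (int) debug_level);
        return 0;
    }

One caveat of optional_argument is that "--debug fails" (with a space)
is not recognized, only "--debug=fails" is, so old invocations of a
bare --debug keep working unchanged.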
> Code
> ====
>
> * The implementation is less complex than the previous submission,
> which is a good thing. I'm not sure that all the remaining complexity
> is still fully needed.
>
> * I'm reserved about the whole ereport thing, see comments in other
> messages.

Thank you, I'll try to implement the error reporting in the way you
suggested.

> Levels ELEVEL_LOG_CLIENT_{FAIL,ABORTED} & LOG_MAIN look unclear to
> me. In particular, the "CLIENT" part is not very useful. If the
> distinction makes sense, I would have kept "LOG" for the initial one
> and added other ones for ABORT and PGBENCH, maybe.

Ok!

> * There are no comments about "retries" in the StatData, CState and
> Command structures.
>
> * Also, for StatData, I would like to understand the logic between
> cnt, skipped, retries, retried, errors, ... so clear information
> about the expected invariant, if any, would be welcome. One has to go
> into the code to understand how these fields relate to one another.
>
> <...>
>
> * How "errors" differs from "ecnt" is unclear to me.

Thank you, I'll fix this.

> * commandFailed: I think that it should be kept much simpler. In
> particular, having errors on errors does not help much: on
> ELEVEL_FATAL, it ignores the actual reported error and generates
> another error of the same level, so that the initial issue is hidden.
> Even if these are can't-happen cases, hiding the origin if it occurs
> looks unhelpful. Just print it directly, and maybe abort if you think
> that it is a can't-happen case.

Oh, thanks, my mistake(

> * copyRandomState: just use sizeof(RandomState) instead of making
> assumptions about the contents of the struct. Also, this function
> looks pretty useless, why not just do a plain assignment?
>
> * copyVariables: lacks comments to explain that the destination is
> cleaned up and so on. The cleanup phase could probably be in a
> distinct function, so that the code would be clearer. Maybe the
> function variable names are too long.

Thank you, I'll fix this.

> if (current_source->svalue)
>
> in the context of a guard for a strdup, maybe:
>
> if (current_source->svalue != NULL)

I'm sorry, I'll fix this.

> * I do not understand the comments on the CState enum: "First,
> remember the failure in CSTATE_FAILURE. Then process other commands
> of the failed transaction if any" Why would other commands be
> processed at all if the transaction is aborted? For me any error must
> lead to the rollback and possible retry of the transaction. This
> comment needs to be clarified. It should also say that on FAILURE, it
> will go either to RETRY or ABORTED. See below my comments about
> doCustom.
>
> It is unclear to me why there could be several failures within a
> transaction, as I would have thought that it would be aborted on the
> first one.
>
> * I do not understand the purpose of first_failure. The comment
> should explain why it would need to be remembered. From my point of
> view, I'm not fully convinced that it should.
>
> <...>
>
> * executeCondition: this hides client automaton state changes which
> were clearly visible beforehand in the switch, and the different
> handling of if & elif is also hidden.
>
> I'm against this unnecessary restructuring and hiding of such
> information; all state changes should be clearly seen in the state
> switch so that it is easier to understand and follow.
>
> I do not see why touching the conditional stack on internal errors
> (evaluateExpr failure) brings anything, the whole transaction will be
> aborted anyway.
>
> * doCustom changes.
>
> On CSTATE_START_COMMAND, it considers whether to retry at the end.
> For me, this cannot happen: if some command failed, then it should
> have skipped directly to the RETRY state, so that you cannot get to
> the end of the script with an error. Maybe you could assert that the
> state of the previous command is NO_FAILURE, though.
>
> On CSTATE_FAILURE, the next command is possibly started. Although
> there is some consistency with the previous point, I think that it
> totally breaks the state automaton, where now a command can start
> while the whole transaction is in a failing state anyway. There was
> no point in starting it in the first place.
>
> So, for me, the FAILURE state should record/count the failure, then
> skip to RETRY if a retry is decided, else proceed to ABORT. Nothing
> else. It is much clearer that way.
>
> Then RETRY should reinstate the global state and proceed to start the
> *first* command again.
>
> <...>
>
> It is unclear to me why backslash command errors are turned to
> FAILURE instead of ABORTED: there is no way they are going to be
> retried, so maybe they should/could skip directly to ABORTED?
>
> Function executeCondition is a bad idea, as stated above.

So do you propose to execute the command "ROLLBACK" without
calculating its latency etc. if we are in a failed transaction, and to
clear the conditional stack after each failure?

Also just to be clear: do you want to have the state CSTATE_ABORTED
for client abortion and another state for interrupting the current
transaction?
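If I read your proposal correctly, the FAILURE and RETRY cases in
doCustom()'s switch would reduce to something like this sketch (not
the current patch code; canRetry() and the retry_random_state field
are made-up names for illustration):

    case CSTATE_FAILURE:
        /* Record/count the failure, then retry or abort -- nothing else. */
        st->stats.errors++;                 /* hypothetical counter */
        st->state = canRetry(st) ? CSTATE_RETRY : CSTATE_ABORTED;
        break;

    case CSTATE_RETRY:
        /* Reinstate the state saved at the start of the transaction ... */
        st->retries++;
        st->random_state = st->retry_random_state;  /* made-up field */
        /* ... and restart from the *first* command of the script. */
        st->command = 0;
        st->state = CSTATE_START_COMMAND;
        break;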
> The current RETRY state does memory allocations to generate a message
> with buffer allocation and so on. This looks like a costly and
> useless operation. If the user required "retries", then this is
> normal behavior, the retries are counted and will be printed out in
> the final report, and there is no point in printing out every single
> one of them. Maybe you want that debugging, but then costly
> operations should be guarded.

I think we need these debugging messages because, for example, if you
use the option --latency-limit, we will never know in advance whether
the serialization/deadlock failure will be retried or not. They also
help to understand which limit of retries was violated or how close we
were to these limits during the execution of a specific transaction.
But I agree with you that they are costly and can be skipped if the
failure type is never retried. Maybe it is better to split them into
multiple error function calls?..

> * reporting
>
> The number of transactions above the latency limit report can be
> simplified. Remove the if and just use one printf with a %s for the
> optional comment. I'm not sure this optional comment is useful there.

Oh, thanks, my mistake(

> Before the patch, ISTM that all lines relied on one printf. You have
> changed to a style where a collection of printfs is used to compose a
> line. I'd suggest to keep to the previous one-printf-prints-one-line
> style, where possible.

Ok!

> You have added 20-column alignment prints. This looks like too much
> and generates much too large lines. Probably 10 (billion) would be
> enough.

I have already asked you about this in [2]:

> The variables for the numbers of failures and retries are of type
> int64 since the variable for the total number of transactions has the
> same type. That's why such a large alignment (as I understand it now,
> 20 characters are enough). Do you prefer floating alignments,
> depending on the maximum number of failures/retries for any command
> in any script?

> Some people try to parse the output, so it should be deterministic.
> I'd add the needed columns always if appropriate (i.e. under retry),
> even if none occurred.

Ok!
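For reference, a narrower fixed-width variant could look like this
sketch (it assumes PostgreSQL's INT64_MODIFIER macro from c.h; the
stats field names are illustrative):

    /* Always print the error/retry columns under --max-tries, even when
     * zero, so the output stays deterministic for parsers; a width of
     * 10 covers counts up to ~9.9 billion. */
    printf(" %10" INT64_MODIFIER "d %10" INT64_MODIFIER "d",
           stats->errors, stats->retries);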
> * processXactStats: An else is replaced by detailed stats, with the
> initial "no detailed stats" comment kept. The function is called both
> in the then & else branches. The structure does not make sense
> anymore. I'm not sure this change was needed.
>
> * getLatencyUsed: declared "double" so "return 0.0".
>
> * typo: ruin -> run; probably others, I did not check for them in
> detail.

Oh, thanks, my mistakes(

> TAP Tests
> =========
>
> On my laptop, tests last 5.5 seconds before the patch, and about 13
> seconds after. This is much too large. Pgbench TAP tests do not
> deserve to take over twice as much time as before just on this patch.
>
> One reason which explains this large time is that there is a new
> script with a newly created instance. I'd suggest to append tests to
> the existing 2 scripts, depending on whether they need a running
> instance or not.

Ok! All new tests that do not need a running instance are already
added to the file 002_pgbench_no_server.pl.

> Secondly, I think that the design of the tests is too heavy. For such
> a feature, ISTM enough to check that it works, i.e. one test for
> deadlocks (trigger one or a few deadlocks), idem for serializable,
> maybe idem for other errors if any.
>
> <...>
>
> The latency limit to 900 ms try is a bad idea because it takes a lot
> of time. I did such tests before and they were removed by Tom Lane
> because of determinism and time issues. I would comment this test out
> for now.

Ok! If it doesn't bother you - can you tell me more about the causes
of these determinism issues?.. Tests for some other failures that
cannot be retried are already added to 001_pgbench_with_server.pl.

> The challenge is to do that reliably and efficiently, i.e. so that
> the test does not rely on chance and is still quite efficient.
>
> The trick you use is to run an interactive psql in parallel to
> pgbench so as to play with concurrent locks. That is interesting, but
> deserves more comments and explanation, eg before the test functions.
>
> Maybe this could be achieved within pgbench by using some wait stuff
> in PL/pgSQL so that concurrent clients can wait for one another based
> on data in an unlogged table updated by a CALL within "embedded"
> transactions? Not sure.
>
> <...>
>
> Anyway, TAP tests should be much lighter (in total time), and if
> possible much simpler.

I'll try, thank you..

> Otherwise, maybe a (simple) pgbench-side thread barrier could help,
> but this would require more thinking.

Tests must pass if we use --disable-thread-safety..

> Documentation
> =============
>
> Not looked at in much detail for now. Just a few comments:
>
> Having the "most important settings" on lines 1-6 and 8 (i.e.
> skipping 7) looks silly. The important ones should simply be the
> first ones, and the 8th is not that important, or it is in 7th
> position.

Ok!

> I do not understand why there is so much text about the in-failed-
> sql-transaction stuff, while we are mainly interested in
> serialization & deadlock errors, and this only falls in some "other"
> category. There seem to be more details about other errors than about
> deadlock & serialization errors.
>
> The reporting should focus on what is of interest, either all errors,
> or some detailed split of these errors.
>
> <...>
>
> * "errors_in_failed_tx" is some subcounter of "errors", for a special
> case. Why it is there fails me [I finally understood, and I think it
> should be removed, see end of review]. If we wanted to distinguish,
> then we should distinguish homogeneously: maybe just count the
> different error types, eg have things like "deadlock_errors",
> "serializable_errors", "other_errors", "internal_pgbench_errors",
> which would be orthogonal to one another, and "errors" could be
> recomputed from these.

Thank you, I agree with you. Unfortunately each new error type adds 1
or 2 new columns of maximum width 20 to the per-statement report (to
report errors and possibly retries of this type in this statement),
and we already have 2 new columns for all errors and retries. So I'm
not sure that we need to add anything other than statistics about all
the errors and all the retries in general.
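To make sure I understand, the orthogonal counters you describe would
look roughly like this (a hypothetical sketch, not what the patch
currently has):

    /* Orthogonal per-type error counters: "errors" is not stored but
     * recomputed as their sum, so no subcounter such as
     * errors_in_failed_tx can overlap another. */
    typedef struct StatData
    {
        int64       cnt;                  /* completed transactions */
        int64       skipped;              /* skipped under --latency-limit */
        int64       retries;
        int64       deadlock_errors;
        int64       serialization_errors;
        int64       other_errors;         /* incl. internal pgbench errors */
    } StatData;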
> The documentation should state clearly what are the counted errors,
> and then what are their effects on the reported stats. The "Errors
> and Serialization/Deadlock Retries" section is a good start in that
> direction, but it does not talk about pgbench internal errors (eg
> "cos(true)"). I think it should be more explicit about errors.

Thank you, I'll try to improve it.

> Option --max-tries default value should be spelled out in the doc.

If you mean that it is set to 1 if neither of the options --max-tries
or --latency-limit is explicitly used, I'll fix this.

> "Client's run is aborted", do you mean "Pgbench run is aborted"?

No, other clients continue their run as usual.

> * FailureStatus states are not homogeneously named. I'd suggest to
> use *_FAILURE for all cases. The miscellaneous case should probably
> be the last. I do not understand the distinction between
> ANOTHER_FAILURE & IN_FAILED_SQL_TRANSACTION. Why should it be needed?
> [again, see end of review]
>
> <...>
>
> "If a failed transaction block does not terminate in the current
> script": this just looks like a very bad idea, and explains my
> general ranting above about this error condition. ISTM that the only
> reasonable option is that a pgbench script should be enforced as a
> transaction, or a set of transactions, but cannot be a "piece" of a
> transaction, i.e. a pgbench script with "BEGIN;" but without a
> corresponding "COMMIT" is a user error and warrants an abort, so that
> there is no need to manage these "in aborted transaction" errors
> everywhere and report about them and document them extensively.
>
> This means adding a check when a script is finished or started that
> PQtransactionStatus(const PGconn *conn) == PQTRANS_IDLE, and aborting
> with a fatal error if not. Then we can forget about this "in tx
> errors" counting, reporting and so on, and just have to document the
> restriction.

Ok!
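The check itself should be small; roughly this sketch, where
PQtransactionStatus() and PQTRANS_IDLE are the real libpq API and
st->con, st->id and CSTATE_ABORTED come from pgbench's CState:

    /* When a script ends, the connection must be out of any transaction
     * block; otherwise the script is a user error and the client aborts. */
    if (PQtransactionStatus(st->con) != PQTRANS_IDLE)
    {
        fprintf(stderr,
                "client %d aborted: end of script reached inside a transaction block\n",
                st->id);
        st->state = CSTATE_ABORTED;     /* fatal user script error */
    }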
[1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1801031720270.20034%40lancre
[2] https://www.postgresql.org/message-id/e4c5e8cefa4a8e88f1273b0f1ee29e56@postgrespro.ru

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company