Re: review: pgbench - aggregation of info written into log - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: review: pgbench - aggregation of info written into log |
Date | |
Msg-id | 50A03610.8050801@fuzzy.cz Whole thread Raw |
In response to | review: pgbench - aggregation of info written into log (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: review: pgbench - aggregation of info written into log
|
List | pgsql-hackers |
Hi, attached is a v4 of the patch. There are not many changes, mostly some simple tidying up, except for handling the Windows. After a bit more investigation, it seems to me that we can't really get the same behavior as in other systems - basically the timestamp is unavailable so we can't log the interval start. So it seems to me the best we can do is to disable this option on Windows (and this is done in the patch). So when you try to use "--aggregate-interval" on Win it will complain it's an unknown option. Now that I think of it, there might be a better solution that would not mimic the Linux/Unix behaviour exactly - we do have elapsed time since the start of the benchmark, so maybe we should use this instead of the timestamp? I mean on systems with reasonable gettimeofday we'd get 1345828501 5601 1542744 483552416 61 2573 1345828503 7884 1979812 565806736 60 1479 1345828505 7208 1979422 567277552 59 1391 1345828507 7685 1980268 569784714 60 1398 1345828509 7073 1979779 573489941 236 1411 but on Windows we'd get 0 5601 1542744 483552416 61 2573 2 7884 1979812 565806736 60 1479 4 7208 1979422 567277552 59 1391 6 7685 1980268 569784714 60 1398 8 7073 1979779 573489941 236 1411 i.e. the first column is "seconds since start of the test". Hmmmm, and maybe we could call 'gettimeofday' at the beginning, to get the timestamp of the test start (AFAIK the issue on Windows is the resolution, but it should be enough). Then we could add it up with the elapsed time and we'd get the same output as on Linux. And maybe this could be done in regular pgbench execution too? But I really need help from someone who knows Windows and can test this. Comments regarding Pavel's reviews are below: On 2.10.2012 20:08, Pavel Stehule wrote: > Hello > > I did review of pgbench patch > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php > > * this patch is cleanly patched > > * I had a problem with doc > > make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml' > openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . > -c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl > -t sgml -i output-html -V html-index postgres.sgml > openjade:pgbench.sgml:767:8:E: document type does not allow element > "TITLE" here; missing one of "ABSTRACT", "AUTHORBLURB", "MSGSET", > "CALLOUTLIST", "ITEMIZEDLIST", "ORDEREDLIST", "SEGMENTEDLIST", > "VARIABLELIST", "CAUTION", "IMPORTANT", "NOTE", "TIP", "WARNING", > "FORMALPARA", "BLOCKQUOTE", "EQUATION", "EXAMPLE", "FIGURE", "TABLE", > "PROCEDURE", "SIDEBAR", "QANDASET", "REFSECT3" start-tag > make[1]: *** [HTML.index] Error 1 > make[1]: *** Deleting file `HTML.index' > make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml Hmmm, I've never got the docs to build at all, all I do get is a heap of some SGML/DocBook related issues. Can you investigate a bit more where's the issue? I don't remember messing with the docs in a way that might cause this ... mostly copy'n'paste edits. > * pgbench is compiled without warnings > > * there is a few issues in source code > > + > + /* should we aggregate the results or not? */ > + if (use_log_agg) > + { > + > + /* are we still in the same interval? if yes, accumulate the > + * values (print them otherwise) */ > + if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now)) > + { > + Errr, so what are the issues here? > > * a real time range for aggregation is dynamic (pgbench is not > realtime application), so I am not sure if following constraint has > sense > > + if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) { > + fprintf(stderr, "duration (%d) must be a multiple of aggregation > interval (%d)\n", duration, use_log_agg); > + exit(1); > + } I'm not sure what might be the issue here either. If the test duration is set (using "-T" option), then this kicks in and says that the duration should be a multiple of aggregation interval. Seems like a sensible assumption to me. Or do you think this is check should be removed? > * it doesn't flush last aggregated data (it is mentioned by Tomas: > "Also, I've realized the last interval may not be logged at all - I'll > take a look into this in the next version of the patch."). And it can > be significant for higher values of "use_log_agg" Yes, and I'm still not sure how to fix this in practice. But I do have this on my TODO. > > * a name of variable "use_log_agg" is not good - maybe "log_agg_interval" ? I've changed it to agg_interval. regards Tomas
Attachment
pgsql-hackers by date: