Re: Progress reporting for pg_verify_checksums - Mailing list pgsql-hackers
From | Michael Banck |
---|---|
Subject | Re: Progress reporting for pg_verify_checksums |
Date | |
Msg-id | 20180919211103.GB23519@nighthawk.caipicrew.dd-dns.de Whole thread Raw |
In response to | Re: Progress reporting for pg_verify_checksums (Fabien COELHO <coelho@cri.ensmp.fr>) |
Responses |
Re: Progress reporting for pg_verify_checksums
|
List | pgsql-hackers |
Hi, thanks for the review! On Wed, Sep 19, 2018 at 05:17:05PM +0200, Fabien COELHO wrote: > >>This optionally prints the progress of pg_verify_checksums via read > >>kilobytes to the terminal with the new command line argument -P. > >> > >>If -P was forgotten and pg_verify_checksums operates on a large cluster, > >>the caller can send SIGUSR1 to pg_verify_checksums to turn progress > >>status reporting on during runtime. > > > >Version 2 of the patch is attached. This is rebased to master as of > >422952ee and changes the signal handling to be a toggle as suggested by > >Alvaro. > > Patch applies cleanly and compiles. > > About tests: "make check" is okay, but ITSM that the command is not started > once, ever, in any test:-( It is unclear whether any test triggers checksums > anyway... Yeah, this was discussed on another thread and some basic tap tests for pg_verify_checksums were proposed (also by me), but this hasn't been further addressed. Personally, I think this would be a good thing to add to v11 even. In any case, this particular feature might not be very easy to tap test, but I am open to suggestions, of course. > The time() granularity to the second makes the result awkward on small > tests: > > 8/1554552 kB (0%, 8 kB/s) > 1040856/1554552 kB (66%, 1040856 kB/s) > 1554552/1554552 kB (100%, 1554552 kB/s) > > I'd suggest to reuse the "portability/instr_time.h" stuff which allows a > much better precision. > > The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying > 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are > about storage which is usually measured in powers of 1,000, so I'd suggest > to use thousands. > > Another reserve I have is that on any hardware it is likely that there will > be a lot of kilos, so maybe megas (MB) should be used instead. The display is exactly the same as in pg_basebackup (except the bandwith is shown as well), so right now I think it is more useful to be consistent here. So if we change that, pg_basebackup should be changed as well I think. Maybe the code could be factored out to some common file in the future. > I'm wondering whether the bandwidth display should be local (from the last > display) or global (from the start of the command), but for the last forced > one. This is an open question. > Maybe it would be nice to show elapsed time and expected completion time > based on the current speed. Maybe; this could be added to the factored out common code I mentioned above. > I would be in favor or having this turned on by default, and silenced with > some option. I'm not sure other people would agree, though, so it is an open > question as well. If this runs in a script, it would be pretty annoying, so everybody would have to add --no-progress so I am not convinced. Also, both pg_basebackup and pgbench don't show progress by default. > About the code: > > + if (show_progress) > + show_progress = false; > + else > + show_progress = true; > > Why not a simple: > > /* invert current state */ > show_progress = !show_progress; Fair enough. > I do not see much advantage in using intermediate string buffers for values: > > + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT, > + total_size / 1024); > + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT, > + current_size / 1024); > + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT, > + (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started))); > + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)", > + currentstr, totalstr, total_percent, currspeedstr); > > ISTM that just one simple fprintf would be fine: > > fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...", > formulas for each slot...); That code is adopted from pg_basebackup.c and the comment there says: | * Separate step to keep platform-dependent format code out of | * translatable strings. And we only test for INT64_FORMAT | * availability in snprintf, not fprintf. So that sounds legit to me. > ISTM that this line overwriting trick deserves a comment: > > + if (isatty(fileno(stderr))) > + fprintf(stderr, "\r"); > + else > + fprintf(stderr, "\n"); I agree a comment should be in there. But that code is also taken verbatim from pg_basebackup.c (but this time, there's no comment there, either). How about this: | * If we are reporting to a terminal, send a carriage return so that | * we stay on the same line. If not, send a newline. > And it runs a little amok with "-v". Do you suggest we should make those mutually exlusive? I agree that in general, -P is not very useful if -v is on, but if you have a really big table, it should still be, no? > + memset(&act, '\0', sizeof(act)); > > pg uses 0 instead of '\0' everywhere else. Ok. > + /* Make sure we just report at least once a second */ > + if ((now == last_progress_update) && !force) return; > > The comments seems contradictory, I understand the code makes sure that it > is "at most" once a second. I guess you're right as the identical code in pg_basebackup.c has a comment saying "Max once per second". > Pgbench uses "-P XXX" to strigger a progress > report every XXX second. I'm unsure whether it would be useful to allow the > user to change the 1 second display interval. I think pgbench writes a new line for each report? In that case, having it every second for longer runs might be annoying and I can see the point in having in configurable. In the case of pg_basebackup/pg_verify_checksums, I guess 1 second is fine. > I'm not sure why you removed one unrelated line: > > #include "storage/checksum.h" > #include "storage/checksum_impl.h" > > - > static int64 files = 0; > static int64 blocks = 0; > static int64 badblocks = 0; Merge error on my side I guess, will fix. > There is a problem in the scan_file code: The added sizeonly parameter is > not used. It should be removed. Right, this might have been a leftover from an earlier version of the code, I'll check back with Bernd to make sure that was not a porting/merge error on my side. I've attached V3 of the patch, addressing some of your comments. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.banck@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment
pgsql-hackers by date: