Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [PROPOSAL] VACUUM Progress Checker. |
Date | |
Msg-id | CA+Tgmoa2nFBVGoy2w4CDwyYedSdV2+MUWAKVn92z5K6QPdYPhg@mail.gmail.com Whole thread Raw |
In response to | Re: [PROPOSAL] VACUUM Progress Checker. (Vinayak Pokale <vinpokale@gmail.com>) |
Responses |
Re: [PROPOSAL] VACUUM Progress Checker.
|
List | pgsql-hackers |
On Tue, Jan 26, 2016 at 11:37 PM, Vinayak Pokale <vinpokale@gmail.com> wrote: > Hi, > > Please find attached updated patch with an updated interface. Well, this isn't right. You've got this sort of thing: + scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]); + /* Report progress to the statistics collector */ + pgstat_report_progress_update_message(0, progress_message); + pgstat_report_progress_update_counter(1, scanned_heap_pages); + pgstat_report_progress_update_counter(3, scanned_index_pages); + pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans + 1); The point of having pgstat_report_progress_update_counter() is so that you can efficiently update a single counter without having to update everything, when only one counter has changed. But here you are calling this function a whole bunch of times in a row, which completely misses the point - if you are updating all the counters, it's more efficient to use an interface that does them all at once instead of one at a time. But there's a second problem here, too, which is that I think you've got this code in the wrong place. The second point of the pgstat_report_progress_update_counter interface is that this should be cheap enough that we can do it every time the counter changes. That's not what you are doing here. You're updating the counters at various points in the code and just pushing new values for all of them regardless of which ones have changed. I think you should find a way that you can update the value immediately at the exact moment it changes. If that seems like too much of a performance hit we can talk about it, but I think the value of this feature will be greatly weakened if users can't count on it to be fully and continuously up to the moment. When something gets stuck, you want to know where it's stuck, not approximately kinda where it's stuck. + if(!scan_all) + scanned_heap_pages = scanned_heap_pages + next_not_all_visible_block; I don't want to be too much of a stickler for details here, but it seems to me that this is an outright lie. The number of scanned pages does not go up when we decide to skip some pages, because scanning and skipping aren't the same thing. If we're only going to report one number here, it needs to be called something like "current heap page", and then you can just report blkno at the top of each iteration of lazy_scan_heap's main loop. If you want to report the numbers of scanned and skipped pages separately that'd be OK too, but you can't call it the number of scanned pages and then actually report a value that is not that. + /* + * Reporting vacuum progress to statistics collector + */ This patch doesn't report anything to the statistics collector, nor should it. Instead of making the SQL-visible function pg_stat_get_vacuum_progress(), I think it should be something more generic like pg_stat_get_command_progress(). Maybe VACUUM will be the only command that reports into that feature for right now, but I'd hope for us to change that pretty soon after we get the first patch committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: