Re: progress report for ANALYZE - Mailing list pgsql-hackers
From | Tatsuro Yamada |
---|---|
Subject | Re: progress report for ANALYZE |
Date | |
Msg-id | 0876b4fe-26fb-ca32-f179-c696fa3ddfec@nttcom.co.jp Whole thread Raw |
In response to | Re: progress report for ANALYZE (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: progress report for ANALYZE
|
List | pgsql-hackers |
Hi Horiguchi-san! On 2019/07/11 19:56, Kyotaro Horiguchi wrote: > Hello. > > At Tue, 9 Jul 2019 17:38:44 +0900, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote in <244cb241-168b-d6a9-c45f-a80c34cdc6ad@nttcom.co.jp> >> Hi Alvaro, Anthony, Julien and Robert, >> >> On 2019/07/09 3:47, Julien Rouhaud wrote: >>> On Mon, Jul 8, 2019 at 8:44 PM Robert Haas <robertmhaas@gmail.com> >>> wrote: >>>> >>>> On Mon, Jul 8, 2019 at 2:18 PM Alvaro Herrera >>>> <alvherre@2ndquadrant.com> wrote: >>>>> Yeah, I got the impression that that was determined to be the >>>>> desirable >>>>> behavior, so I made it do that, but I'm not really happy about it >>>>> either. We're not too late to change the CREATE INDEX behavior, but >>>>> let's discuss what is it that we want. >>>> >>>> I don't think I intended to make any such determination -- which >>>> commit do you think established this as the canonical behavior? >>>> >>>> I propose that once a field is set, we should leave it set until the >>>> end. >>> +1 >>> Note that this patch is already behaving like that if the table only >>> contains dead rows. > > +1 from me. > >> 3785|13599|postgres|16384|f|16384|analyzing complete|0|0 <-- Added and >> fixed. :) >> --------------------------------------------------------- > > I have some comments. > > + const int index[] = { > + PROGRESS_ANALYZE_PHASE, > + PROGRESS_ANALYZE_TOTAL_BLOCKS, > + PROGRESS_ANALYZE_BLOCKS_DONE > + }; > + const int64 val[] = { > + PROGRESS_ANALYZE_PHASE_ANALYSIS, > + 0, 0 > > Do you oppose to leaving the total/done blocks alone here:p? Thanks for your comments! I created a new patch based on your comments because Alvaro allowed me to work on revising the patch. :) Ah, I revised it to remove "0, 0". > + BlockNumber nblocks; > + double blksdone = 0; > > Why is it a double even though blksdone is of the same domain > with nblocks? And finally: > > + pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE, > + ++blksdone); > > It is converted into int64. Fixed. But is it suitable to use BlockNumber instead int64? > + WHEN 2 THEN 'analyzing sample' > + WHEN 3 THEN 'analyzing sample (extended stats)' > > I think we should avoid parenthesized phrases as far as > not-necessary. That makes the column unnecessarily long. The > phase is internally called as "compute stats" so *I* would prefer > something like the followings: > > + WHEN 2 THEN 'computing stats' > + WHEN 3 THEN 'computing extended stats' > > > > + WHEN 4 THEN 'analyzing complete' > > And if you are intending by this that (as mentioned in the > documentation) "working to complete this analyze", this would be: > > + WHEN 4 THEN 'completing analyze' > + WHEN 4 THEN 'finalizing analyze' I have no strong opinion, so I changed the phase-names based on your suggestions like following: WHEN 2 THEN 'computing stats' WHEN 3 THEN 'computing extended stats' WHEN 4 THEN 'finalizing analyze' However, I'd like to get any comments from hackers to get a consensus about the names. > + <entry>Process ID of backend.</entry> > > of "the" backend. ? And period is not attached on all > descriptions consisting of a single sentence. > > + <entry>OID of the database to which this backend is connected.</entry> > + <entry>Name of the database to which this backend is connected.</entry> > > "database to which .. is connected" is phrased as "database this > backend is connected to" in pg_stat_acttivity. (Just for consistency) I checked the sentences on other views of progress monitor (VACUUM, CREATE INDEX and CLUSTER), and they are same sentence. Therefore, I'd like to keep it as is. :) > + <entry>Whether the current scan includes legacy inheritance children.</entry> > > This apparently excludes partition tables but actually it is > included. > > "Whether scanning through child tables" ? > > I'm not sure "child tables" is established as the word to mean > both "partition tables and inheritance children".. Hmm... I'm also not sure but I fixed as you suggested. > + The table being scanned (differs from <literal>relid</literal> > + only when processing partitions or inheritance children). > > Is <literal> needed? And I think the parentheses are not needed. > > OID of the table currently being scanned. Can differ from relid > when analyzing tables that have child tables. How about: OID of the table currently being scanned. It might be different from relid when analyzing tables that have child tables. > + Total number of heap blocks to scan in the current table. > > Number of heap blocks on scanning_table to scan? > > It might be better that this description describes that this > and the next column is meaningful only while the phase > "scanning table". How about: Total number of heap blocks in the scanning_table. > + The command is currently scanning the table. > > "sample(s)" comes somewhat abruptly in the next item. Something > like "The command is currently scanning the table > <structfield>scanning_table</structfield> to obtain samples" > might be better. Fixed. > + <command>ANALYZE</command> is currently extracting statistical data > + from the sample obtained. > > Something like "The command is computing stats from the samples > obtained in the previous phase" might be better. Fixed. Please find attached patch. :) Here is a test result of the patch. ========================================================== # select * from pg_stat_progress_analyze ; \watch 0.0001; 9067|13599|postgres|16387|f|16387|scanning table|443|14 9067|13599|postgres|16387|f|16387|scanning table|443|44 9067|13599|postgres|16387|f|16387|scanning table|443|76 9067|13599|postgres|16387|f|16387|scanning table|443|100 ... 9067|13599|postgres|16387|f|16387|scanning table|443|443 9067|13599|postgres|16387|f|16387|scanning table|443|443 9067|13599|postgres|16387|f|16387|scanning table|443|443 9067|13599|postgres|16387|f|16387|computing stats|443|443 9067|13599|postgres|16387|f|16387|computing stats|443|443 9067|13599|postgres|16387|f|16387|computing stats|443|443 9067|13599|postgres|16387|f|16387|finalizing analyze|443|443 ========================================================== Thanks, Tatsuro Yamada
Attachment
pgsql-hackers by date: