Re: Vacuum statistics - Mailing list pgsql-hackers

From Alena Rybakina
Subject Re: Vacuum statistics
Date
Msg-id 18169b68-5b10-40fd-9657-be04f2bd0161@postgrespro.ru
In response to Re: Vacuum statistics  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On 02.06.2025 19:25, Alexander Korotkov wrote:
> On Tue, May 13, 2025 at 12:49 PM Alena Rybakina
> <a.rybakina@postgrespro.ru> wrote:
>> On 12.05.2025 08:30, Amit Kapila wrote:
>>> On Fri, May 9, 2025 at 5:34 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>>>> I did a rebase and finished the part with storing statistics separately from the relation statistics - now it is
>>>> possible to disable the collection of statistics for relations using GUCs, and
>>>> this allows us to solve the problem with the memory consumed.
>>>>
>>> I think this patch is trying to collect data similar to what we do for
>>> pg_stat_statements for SQL statements. So, can't we follow a similar
>>> idea such that these additional statistics will be collected once some
>>> external module like pg_stat_statements is enabled? That module should
>>> be responsible for accumulating and resetting the data, so we won't
>>> have this memory consumption issue.
>> The idea is good, it will require one hook for the pgstat_report_vacuum
>> function, the extvac_stats_start and extvac_stats_end functions can be
>> run if the extension is loaded, so as not to add more hooks.
> +1
> Nice idea of a hook.  Given the volume of the patch, it might be a
> good idea to keep this as an extension.

Today, I finalized the vacuum statistics separation approach and 
refactored the vacuum statistics structures (patch 4).

I also reworked the table statistics to avoid mixing index statistics in 
parallel vacuum mode (patch 2).

The new approach excludes buffer usage and WAL statistics for indexes 
from the table’s statistics.
For timing, if vacuuming is sequential, the individual index vacuum 
times are added up and that sum is subtracted from the table’s total 
vacuum time. If vacuuming is parallel, the index vacuum phase is timed 
as a whole and that total is subtracted.

static void
accumulate_idxs_vacuum_statistics(LVRelState *vacrel, ExtVacReport *extVacIdxStats)
{
     if (!pgstat_track_vacuum_statistics)
         return;

     /* Accumulate index extended stats into the running index totals */
     vacrel->extVacReportIdx.blk_read_time += extVacIdxStats->blk_read_time;
     vacrel->extVacReportIdx.blk_write_time += extVacIdxStats->blk_write_time;
     vacrel->extVacReportIdx.total_blks_dirtied += extVacIdxStats->total_blks_dirtied;
     vacrel->extVacReportIdx.total_blks_hit += extVacIdxStats->total_blks_hit;
     vacrel->extVacReportIdx.total_blks_read += extVacIdxStats->total_blks_read;
     vacrel->extVacReportIdx.total_blks_written += extVacIdxStats->total_blks_written;
     vacrel->extVacReportIdx.wal_bytes += extVacIdxStats->wal_bytes;
     vacrel->extVacReportIdx.wal_fpi += extVacIdxStats->wal_fpi;
     vacrel->extVacReportIdx.wal_records += extVacIdxStats->wal_records;
     vacrel->extVacReportIdx.delay_time += extVacIdxStats->delay_time;

     vacrel->extVacReportIdx.total_time += extVacIdxStats->total_time;
}

if (ParallelVacuumIsActive(vacrel))
{
     LVExtStatCounters counters;
     ExtVacReport extVacReport;

     memset(&extVacReport, 0, sizeof(ExtVacReport));

     extvac_stats_start(vacrel->rel, &counters);

     /* Outsource everything to parallel variant */
     parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
                                         vacrel->num_index_scans);

     extvac_stats_end(vacrel->rel, &counters, &extVacReport);
     accumulate_idxs_vacuum_statistics(vacrel, &extVacReport);
}
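
To illustrate the subtraction step described above, here is a minimal 
standalone sketch. It is not code from the patch: DemoVacReport, 
demo_accumulate and demo_table_only are hypothetical, simplified 
stand-ins, and the field list is cut down to a few counters. It only 
shows the idea of summing the index counters and then removing that 
share from the totals measured for the whole vacuum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the patch's ExtVacReport. */
typedef struct DemoVacReport
{
    int64_t  total_blks_read;
    int64_t  total_blks_hit;
    int64_t  wal_records;
    uint64_t wal_bytes;
    double   total_time;    /* milliseconds */
} DemoVacReport;

/*
 * Add the counters of one index (sequential case) or of one whole
 * parallel index pass to the running index total.
 */
static void
demo_accumulate(DemoVacReport *acc, const DemoVacReport *idx)
{
    acc->total_blks_read += idx->total_blks_read;
    acc->total_blks_hit += idx->total_blks_hit;
    acc->wal_records += idx->wal_records;
    acc->wal_bytes += idx->wal_bytes;
    acc->total_time += idx->total_time;
}

/*
 * Derive table-only figures by removing the accumulated index share
 * from the counters measured over the whole vacuum.
 */
static DemoVacReport
demo_table_only(const DemoVacReport *whole, const DemoVacReport *idx_total)
{
    DemoVacReport heap = *whole;

    /* Assumption of this sketch: the index share never exceeds the total. */
    assert(idx_total->total_time <= whole->total_time);
    assert(idx_total->wal_bytes <= whole->wal_bytes);

    heap.total_blks_read -= idx_total->total_blks_read;
    heap.total_blks_hit -= idx_total->total_blks_hit;
    heap.wal_records -= idx_total->wal_records;
    heap.wal_bytes -= idx_total->wal_bytes;
    heap.total_time -= idx_total->total_time;
    return heap;
}
```

In the sequential case demo_accumulate would be called once per index; 
in the parallel case it would be called once with the figures measured 
around the whole parallel pass, as in the snippet above.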

Currently, database statistics work incorrectly; I'm investigating the 
issue.


In parallel, I'm starting work on the extension.

-- 
Regards,
Alena Rybakina
Postgres Professional
