On 02.06.2025 19:25, Alexander Korotkov wrote:
> On Tue, May 13, 2025 at 12:49 PM Alena Rybakina
> <a.rybakina@postgrespro.ru> wrote:
>> On 12.05.2025 08:30, Amit Kapila wrote:
>>> On Fri, May 9, 2025 at 5:34 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>>>> I did a rebase and finished the part that stores these statistics
>>>> separately from the relation statistics. It is now possible to disable
>>>> the collection of statistics for relations using GUCs, which allows us
>>>> to solve the memory consumption problem.
>>>>
>>> I think this patch is trying to collect data similar to what we do for
>>> pg_stat_statements for SQL statements. So, can't we follow a similar
>>> idea such that these additional statistics will be collected once some
>>> external module like pg_stat_statements is enabled? That module should
>>> be responsible for accumulating and resetting the data, so we won't
>>> have this memory consumption issue.
>> The idea is good. It will require one hook for the pgstat_report_vacuum
>> function; the extvac_stats_start and extvac_stats_end functions can be
>> run if the extension is loaded, so as not to add more hooks.
> +1
> Nice idea of a hook. Given the volume of the patch, it might be a
> good idea to keep this as an extension.
Today, I finalized the vacuum statistics separation approach and
refactored the vacuum statistics structures (patch 4).
I also reworked the table statistics to avoid mixing index statistics in
parallel vacuum mode (patch 2).
The new approach excludes buffer usage and WAL statistics for indexes
from the table’s statistics.
For timing, the time spent on indexes is subtracted from the table’s
total vacuum time. If vacuuming is sequential, that index time is the sum
of the individual index vacuum times. If vacuuming is parallel, the index
vacuum time is measured and subtracted as a whole.

static void
accumulate_idxs_vacuum_statistics(LVRelState *vacrel,
                                  ExtVacReport *extVacIdxStats)
{
    if (!pgstat_track_vacuum_statistics)
        return;

    /* Accumulate index extended stats into the running index totals */
    vacrel->extVacReportIdx.blk_read_time += extVacIdxStats->blk_read_time;
    vacrel->extVacReportIdx.blk_write_time += extVacIdxStats->blk_write_time;
    vacrel->extVacReportIdx.total_blks_dirtied += extVacIdxStats->total_blks_dirtied;
    vacrel->extVacReportIdx.total_blks_hit += extVacIdxStats->total_blks_hit;
    vacrel->extVacReportIdx.total_blks_read += extVacIdxStats->total_blks_read;
    vacrel->extVacReportIdx.total_blks_written += extVacIdxStats->total_blks_written;
    vacrel->extVacReportIdx.wal_bytes += extVacIdxStats->wal_bytes;
    vacrel->extVacReportIdx.wal_fpi += extVacIdxStats->wal_fpi;
    vacrel->extVacReportIdx.wal_records += extVacIdxStats->wal_records;
    vacrel->extVacReportIdx.delay_time += extVacIdxStats->delay_time;
    vacrel->extVacReportIdx.total_time += extVacIdxStats->total_time;
}

    if (ParallelVacuumIsActive(vacrel))
    {
        LVExtStatCounters counters;
        ExtVacReport extVacReport;

        memset(&extVacReport, 0, sizeof(ExtVacReport));
        extvac_stats_start(vacrel->rel, &counters);

        /* Outsource everything to parallel variant */
        parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
                                            vacrel->num_index_scans);

        extvac_stats_end(vacrel->rel, &counters, &extVacReport);
        accumulate_idxs_vacuum_statistics(vacrel, &extVacReport);
    }

Currently, the database statistics are not working correctly; I'm
investigating the issue.
Meanwhile, I'm starting work on the extension.
--
Regards,
Alena Rybakina
Postgres Professional