Re: POC: Parallel processing of indexes in autovacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: POC: Parallel processing of indexes in autovacuum |
Date | |
Msg-id | CAD21AoDadzAwibxf-+urjx=XL+eVu8=Ut-Lh2GxXUt32LbPG3Q@mail.gmail.com Whole thread Raw |
In response to | Re: POC: Parallel processing of indexes in autovacuum (Daniil Davydov <3danissimo@gmail.com>) |
List | pgsql-hackers |
On Mon, Jul 21, 2025 at 11:45 PM Daniil Davydov <3danissimo@gmail.com> wrote: > > Hi, > > On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote: > > > > I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and > > have a few comments from my initial pass. > > > > 1/ Please run pgindent. > > OK, I'll do it. > > > 2/ Documentation is missing. There may be more, but here are the places I > > found that likely need updates for the new behavior, reloptions, GUC, etc. > > Including docs in the patch early would help clarify expected behavior. > > > > https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS > > https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM > > https://www.postgresql.org/docs/current/runtime-config-autovacuum.html > > https://www.postgresql.org/docs/current/sql-createtable.html > > https://www.postgresql.org/docs/current/sql-altertable.html > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES > > https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS > > > > Thanks for gathering it all together. I'll update the documentation so > it will reflect changes in autovacuum daemon, reloptions and GUC > parameters. So far, I don't see what we can add to vacuum-basics > and alter-table paragraphs. > > I'll create separate .patch file for changes in documentation. > > > One thing I am unclear on is the interaction between max_worker_processes, > > max_parallel_workers, and max_parallel_maintenance_workers. For example, does > > the following change mean that manual VACUUM PARALLEL is no longer capped by > > max_parallel_maintenance_workers? > > > > @@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, > > int nindexes, int nrequested, > > parallel_workers = (nrequested > 0) ? > > Min(nrequested, nindexes_parallel) : nindexes_parallel; > > > > - /* Cap by max_parallel_maintenance_workers */ > > - parallel_workers = Min(parallel_workers, > > max_parallel_maintenance_workers); > > + /* Cap by GUC variable */ > > + parallel_workers = Min(parallel_workers, max_parallel_workers); > > > > Oh, it is my poor choice of a name for a local variable (I'll rename it). > This variable can get different values depending on performed operation : > autovacuum_max_parallel_workers for parallel autovacuum and > max_parallel_maintenance_workers for maintenance VACUUM. > > > > > 3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool" ? > > > > + "autovacuum_parallel_workers", > > + "Maximum number of parallel autovacuum workers > > that can be taken from bgworkers pool for processing this table. " > > > > I don't think that we should refer to max_parallel_workers here. > Actually, this reloption doesn't depend on max_parallel_workers at all. > I wrote about bgworkers pool (both here and in description of > autovacuum_max_parallel_workers parameter) in order to clarify that > parallel autovacuum will use dynamic workers instead of launching > more a/v workers. > > BTW, I don't really like that the comment on this option turns out to be > very large. I'll leave only short description in reloptions.c and move > clarification about zero value in rel.h > Mentions of bgworkers pool will remain only in > description of autovacuum_max_parallel_workers. > > > 4/ The comment "When parallel autovacuum worker die" suggests an abnormal > > exit. "Terminates" seems clearer, since this applies to both normal and > > abnormal exits. > > > > instead of: > > + * When parallel autovacuum worker die, > > > > how about this: > > * When parallel autovacuum worker terminates, > > > > Sounds reasonable, I'll fix it. > > > > > 5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before > > DestroyParallelContext? > > > > + nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember > > this value */ > > DestroyParallelContext(pvs->pcxt); > > + > > + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ > > + if (AmAutoVacuumWorkerProcess()) > > + AutoVacuumReleaseParallelWorkers(nlaunched_workers); > > > > I wrote about it above [1], but I think I can duplicate my thoughts here : > """ > Destroying parallel context includes waiting for all workers to exit (after > which, other operations can use them). > If we first call ParallelAutoVacuumReleaseWorkers, some operation can > reasonably request all released workers. But this request can fail, > because there is no guarantee that workers managed to finish. > > Actually, there's nothing wrong with that, but I think releasing workers > only after finishing work is a more logical approach. > """ > > > > > 6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside > > AutoVacuumReleaseParallelWorkers()? > > > > if (!AmAutoVacuumWorkerProcess()) > > return; > > > > It seems to me that the opposite is true. If there is no alternative to calling > AmAutoVacuumWorkerProcess, it might confuse somebody. All doubts > will disappear after viewing the AmAutoVacuumWorkerProcess code, > but IMO code in vacuumparallel.c will become less intuitive. > > > 7/ It looks like the psql tab completion for autovacuum_parallel_workers is > > missing: > > > > test=# alter table t set (autovacuum_ > > autovacuum_analyze_scale_factor > > autovacuum_analyze_threshold > > autovacuum_enabled > > autovacuum_freeze_max_age > > autovacuum_freeze_min_age > > autovacuum_freeze_table_age > > autovacuum_multixact_freeze_max_age > > autovacuum_multixact_freeze_min_age > > autovacuum_multixact_freeze_table_age > > autovacuum_vacuum_cost_delay > > autovacuum_vacuum_cost_limit > > autovacuum_vacuum_insert_scale_factor > > autovacuum_vacuum_insert_threshold > > autovacuum_vacuum_max_threshold > > autovacuum_vacuum_scale_factor > > autovacuum_vacuum_threshold > > > > Good catch, I'll fix it. > > Thank you for the review! Please, see v9 patches : > 1) Run pgindent + rebase patches on newest commit in master. > 2) Introduce changes for documentation. > 3) Rename local variable in parallel_vacuum_compute_workers. > 4) Shorten the description of autovacuum_parallel_workers in > reloptions.c (move clarifications for it into rel.h). > 5) Reword "When parallel autovacuum worker die" comment. > 6) Add tab completion for autovacuum_parallel_workers table option. Thank you for updating the patch. Here are some review comments. + /* Release all launched (i.e. reserved) parallel autovacuum workers. */ + if (AmAutoVacuumWorkerProcess()) + AutoVacuumReleaseParallelWorkers(nlaunched_workers); + We release the reserved worker in parallel_vacuum_end(). However, parallel_vacuum_end() is called only once at the end of vacuum. I think we need to release the reserved worker after index vacuuming or cleanup, otherwise we would end up holding the reserved workers until the end of vacuum even if we invoke index vacuuming multiple times. --- +void +assign_autovacuum_max_parallel_workers(int newval, void *extra) +{ + autovacuum_max_parallel_workers = Min(newval, max_worker_processes); +} I don't think we need the assign hook for this GUC parameter. We can internally cap the maximum value by max_worker_processes like other GUC parameters such as max_parallel_maintenance_workers and max_parallel_workers. ---+ /* Refresh autovacuum_max_parallel_workers paremeter */ + CHECK_FOR_INTERRUPTS(); + if (ConfigReloadPending) + { + ConfigReloadPending = false; + ProcessConfigFile(PGC_SIGHUP); + } + + LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE); + + /* + * If autovacuum_max_parallel_workers parameter was reduced during + * parallel autovacuum execution, we must cap available workers number by + * its new value. + */ + AutoVacuumShmem->av_freeParallelWorkers = + Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers, + autovacuum_max_parallel_workers); + + LWLockRelease(AutovacuumLock); I think another race condition could occur; suppose autovacuum_max_parallel_workers is set to '5' and one autovacuum worker reserved 5 workers, meaning that AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes autovacuum_max_parallel_workers to 3 and reloads the conf file right after the autovacuum worker checks the interruption. The launcher processes calls adjust_free_parallel_workers() but av_freeParallelWorkers remains 0, and the autovacuum worker increments it by 5 as its autovacuum_max_parallel_workers value is still 5. I think that we can have the autovacuum_max_parallel_workers value on shmem, and only the launcher process can modify its value if the GUC is changed. Autovacuum workers simply increase or decrease the av_freeParallelWorkers within the range of 0 and the autovacuum_max_parallel_workers value on shmem. When changing autovacuum_max_parallel_workers and av_freeParallelWorkers values on shmem, the launcher process calculates the number of workers reserved at that time and calculate the new av_freeParallelWorkers value by subtracting the new autovacuum_max_parallel_workers by the number of reserved workers. --- +AutoVacuumReserveParallelWorkers(int nworkers) +{ + int can_launch; How about renaming it to 'nreserved' or something? can_launch looks like it's a boolean variable to indicate whether the process can launch workers. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: