Re: POC: Parallel processing of indexes in autovacuum - Mailing list pgsql-hackers

From Daniil Davydov
Subject Re: POC: Parallel processing of indexes in autovacuum
Msg-id CAJDiXgiGSpqMQSOx-cVO_LtcB5GWHBy9ph7oOR4ebbX8A==kgw@mail.gmail.com
In response to Re: POC: Parallel processing of indexes in autovacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,
Thank you very much for your comments!
In this email I'll answer both of your recent messages.

On Fri, Aug 8, 2025 at 6:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Thank you for updating the patch. Here are some review comments.
>
> +        /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> +        if (AmAutoVacuumWorkerProcess())
> +                AutoVacuumReleaseParallelWorkers(nlaunched_workers);
> +
>
> We release the reserved worker in parallel_vacuum_end(). However,
> parallel_vacuum_end() is called only once at the end of vacuum. I
> think we need to release the reserved worker after index vacuuming or
> cleanup, otherwise we would end up holding the reserved workers until
> the end of vacuum even if we invoke index vacuuming multiple times.
>

Yep, you are right. It was easy to miss because autovacuum typically takes
only one cycle to process a table. Since both index vacuuming and index
cleanup go through the parallel_vacuum_process_all_indexes function,
I think that both reserving and releasing should be placed there.
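
To make the placement concrete, here is a rough sketch (not the exact v10
code) of what I mean for parallel_vacuum_process_all_indexes(), reusing the
reserve/release functions from the patch:

    /* sketch: one index vacuum/cleanup pass */
    if (AmAutoVacuumWorkerProcess())
        nworkers = AutoVacuumReserveParallelWorkers(nworkers);

    if (nworkers > 0)
    {
        /* launch at most the number of workers we managed to reserve */
        ReinitializeParallelWorkers(pvs->pcxt, nworkers);
        LaunchParallelWorkers(pvs->pcxt);
    }

    /* ... leader participates, then waits for the workers ... */
    WaitForParallelWorkersToFinish(pvs->pcxt);

    /* give the reservation back as soon as this pass is done */
    if (AmAutoVacuumWorkerProcess() && nworkers > 0)
        AutoVacuumReleaseParallelWorkers(nworkers);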

> ---
> +void
> +assign_autovacuum_max_parallel_workers(int newval, void *extra)
> +{
> +        autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
> +}
>
> I don't think we need the assign hook for this GUC parameter. We can
> internally cap the maximum value by max_worker_processes like other
> GUC parameters such as max_parallel_maintenance_workers and
> max_parallel_workers.

OK, I get it - we don't want to raise a configuration error for no good
reason. Actually, we are already capping autovacuum_max_parallel_workers
by max_worker_processes internally (inside the
parallel_vacuum_compute_workers function). This is the same behavior
that max_parallel_maintenance_workers has.

I'll get rid of the assign hook and add one more cap inside autovacuum
shmem initialization: since max_worker_processes is a PGC_POSTMASTER
parameter, av_freeParallelWorkers must not exceed its value.
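
Something like this inside AutoVacuumShmemInit() (just a sketch):

    /* AutoVacuumShmemInit(), when the segment is first created (sketch) */
    if (!found)
    {
        /* ... existing initialization ... */

        /*
         * max_worker_processes is PGC_POSTMASTER, so it cannot change
         * later; never start with more free parallel workers than can
         * ever exist.
         */
        AutoVacuumShmem->av_freeParallelWorkers =
            Min(autovacuum_max_parallel_workers, max_worker_processes);
    }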

>
> ---
> +        /* Refresh autovacuum_max_parallel_workers paremeter */
> +        CHECK_FOR_INTERRUPTS();
> +        if (ConfigReloadPending)
> +        {
> +                ConfigReloadPending = false;
> +                ProcessConfigFile(PGC_SIGHUP);
> +        }
> +
> +        LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
> +
> +        /*
> +         * If autovacuum_max_parallel_workers parameter was reduced during
> +         * parallel autovacuum execution, we must cap available
> workers number by
> +         * its new value.
> +         */
> +        AutoVacuumShmem->av_freeParallelWorkers =
> +                Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
> +                        autovacuum_max_parallel_workers);
> +
> +        LWLockRelease(AutovacuumLock);
>
> I think another race condition could occur; suppose
> autovacuum_max_parallel_workers is set to '5' and one autovacuum
> worker reserved 5 workers, meaning that
> AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
> autovacuum_max_parallel_workers to 3 and reloads the conf file right
> after the autovacuum worker checks the interruption. The launcher
> processes calls adjust_free_parallel_workers() but
> av_freeParallelWorkers remains 0, and the autovacuum worker increments
> it by 5 as its autovacuum_max_parallel_workers value is still 5.
>

I think this problem could be solved by acquiring AutovacuumLock before
processing the config file, but I understand that this is a bad approach.

> I think that we can have the autovacuum_max_parallel_workers value on
> shmem, and only the launcher process can modify its value if the GUC
> is changed. Autovacuum workers simply increase or decrease the
> av_freeParallelWorkers within the range of 0 and the
> autovacuum_max_parallel_workers value on shmem. When changing
> autovacuum_max_parallel_workers and av_freeParallelWorkers values on
> shmem, the launcher process calculates the number of workers reserved
> at that time and calculate the new av_freeParallelWorkers value by
> subtracting the new autovacuum_max_parallel_workers by the number of
> reserved workers.
>

Good idea, I agree. Replacing the GUC parameter with a variable in shmem
leaves the current free-worker management logic unchanged. Essentially,
this is the same solution as I described above, but we hold the lock not
during the config reload, only during a simple value check. It makes much
more sense.
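
So, in the launcher, roughly (av_maxParallelWorkers is a hypothetical name
for the new shmem field; adjust_free_parallel_workers is the function you
mentioned):

    static void
    adjust_free_parallel_workers(void)
    {
        LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);

        if (AutoVacuumShmem->av_maxParallelWorkers !=
            autovacuum_max_parallel_workers)
        {
            /* number of workers currently reserved by leaders */
            int     reserved = AutoVacuumShmem->av_maxParallelWorkers -
                               AutoVacuumShmem->av_freeParallelWorkers;

            AutoVacuumShmem->av_maxParallelWorkers =
                autovacuum_max_parallel_workers;
            AutoVacuumShmem->av_freeParallelWorkers =
                Max(0, AutoVacuumShmem->av_maxParallelWorkers - reserved);
        }

        LWLockRelease(AutovacuumLock);
    }

Leaders would then reserve and release strictly against
av_maxParallelWorkers, so their private autovacuum_max_parallel_workers
value no longer matters.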

> ---
> +AutoVacuumReserveParallelWorkers(int nworkers)
> +{
> +   int         can_launch;
>
> How about renaming it to 'nreserved' or something? can_launch looks
> like it's a boolean variable to indicate whether the process can
> launch workers.
>

There are no objections.

On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> While testing the patch, I found there are other two problems:
>
> 1. when an autovacuum worker who reserved workers fails with an error,
> the reserved workers are not released. I think we need to ensure that
> all reserved workers are surely released at the end of vacuum even
> with an error.
>

Agreed. I'll add a try/catch block to parallel_vacuum_process_all_indexes
(the only place where we reserve workers).
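
Roughly like this (a sketch; av_nworkers_reserved is a static counter that
tracks how many workers the leader has reserved):

    /* sketch: release reserved workers even if the index pass errors out */
    PG_TRY();
    {
        LaunchParallelWorkers(pvs->pcxt);
        /* ... vacuum/cleanup indexes, wait for workers ... */
        WaitForParallelWorkersToFinish(pvs->pcxt);
    }
    PG_CATCH();
    {
        if (AmAutoVacuumWorkerProcess() && av_nworkers_reserved > 0)
        {
            AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
            av_nworkers_reserved = 0;
        }
        PG_RE_THROW();
    }
    PG_END_TRY();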

> 2. when an autovacuum worker (not parallel vacuum worker) who uses
> parallel vacuum gets SIGHUP, it errors out with the error message
> "parameter "max_stack_depth" cannot be set during a parallel
> operation". Autovacuum checks the configuration file reload in
> vacuum_delay_point(), and while reloading the configuration file, it
> attempts to set max_stack_depth in
> InitializeGUCOptionsFromEnvironment() (which is called by
> ProcessConfigFileInternal()). However, it cannot change
> max_stack_depth since the worker is in parallel mode but
> max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
> happen in regular backends who are using parallel queries because they
> check the configuration file reload at the end of each SQL command.
>

Hm, this is a really serious problem. I see only two ways to solve it
(neither is really good):
1)
Do not allow processing of the config file during parallel autovacuum
execution (a rough sketch is below).
2)
Teach autovacuum to enter parallel mode only during the index
vacuum/cleanup phase. I'm a bit wary about this, because the design says
that we should be in parallel mode during the whole parallel operation.
But actually, if we can make sure that all launched workers have exited,
I don't see why we can't just exit parallel mode at the end of
parallel_vacuum_process_all_indexes.
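
For option 1, the check could look something like this (a rough sketch;
the exact placement would be the autovacuum worker's config-reload path,
e.g. the one reached from vacuum_delay_point()):

    /* sketch: postpone config reload while the leader is in parallel mode */
    if (ConfigReloadPending && !IsInParallelMode())
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
    }

The flag stays set, so the reload would simply happen at the next check
after we leave parallel mode.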

What do you think about it? For now, I haven't made any changes related
to this problem.

Again, thank you for the review. Please see the v10 patches (only 0001
has been changed):
1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
2) Add a try/catch block to parallel_vacuum_process_all_indexes, so we can
release workers even after an error. This required adding a static variable
that accounts for the total number of reserved workers (av_nworkers_reserved).
3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
the autovacuum code. The assign hook has been removed.
4) Use the shmem value to determine the maximum number of parallel autovacuum
workers (eliminating the race condition between the launcher and the leader
process).

--
Best regards,
Daniil Davydov
