Re: Should vacuum process config file reload more often - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Should vacuum process config file reload more often |
Date | |
Msg-id | CAD21AoC=BRMwUpVDf1tzDa13--dLhuq=c2XQ_KAYA4Qx3cS=kA@mail.gmail.com Whole thread Raw |
In response to | Re: Should vacuum process config file reload more often (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Should vacuum process config file reload more often
|
List | pgsql-hackers |
On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > > Quotes below are combined from two of Sawada-san's emails. > > I've also attached a patch with my suggested current version. > > On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman > > <melanieplageman@gmail.com> wrote: > > > > > > On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman > > > > <melanieplageman@gmail.com> wrote: > > > > > > > > > > On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman > > > > > In this version I've removed wi_cost_delay from WorkerInfoData. There is > > > > > no synchronization of cost_delay amongst workers, so there is no reason > > > > > to keep it in shared memory. > > > > > > > > > > One consequence of not updating VacuumCostDelay from wi_cost_delay is > > > > > that we have to have a way to keep track of whether or not autovacuum > > > > > table options are in use. > > > > > > > > > > This patch does this in a cringeworthy way. I added two global > > > > > variables, one to track whether or not cost delay table options are in > > > > > use and the other to store the value of the table option cost delay. I > > > > > didn't want to use a single variable with a special value to indicate > > > > > that table option cost delay is in use because > > > > > autovacuum_vacuum_cost_delay already has special values that mean > > > > > certain things. My code needs a better solution. > > > > > > > > While it's true that wi_cost_delay doesn't need to be shared, it seems > > > > to make the logic somewhat complex. We need to handle cost_delay in a > > > > different way from other vacuum-related parameters and we need to make > > > > sure av[_use]_table_option_cost_delay are set properly. Removing > > > > wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per > > > > autovacuum worker but it might be worth considering to keep > > > > wi_cost_delay for simplicity. > > > > > > Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo > > > anyway because the launcher doesn't know anything about table options > > > and so the workers have to keep an updated wi_cost_delay that the > > > launcher or other autovac workers who are not vacuuming that table can > > > read from when calculating the new limit in autovac_balance_cost(). > > > > IIUC if any of the cost delay parameters has been set individually, > > the autovacuum worker is excluded from the balance algorithm. > > Ah, yes! That's right. So it is not a problem. Then I still think > removing wi_cost_delay from the worker info makes sense. wi_cost_delay > is a double and can't easily be accessed atomically the way > wi_cost_limit can be. > > Keeping the cost delay local to the backends also makes it clear that > cost delay is not something that should be written to by other backends > or that can differ from worker to worker. Without table options in the > picture, the cost delay should be the same for any worker who has > reloaded the config file. Agreed. > > As for the cost limit safe access issue, maybe we can avoid a LWLock > acquisition for reading wi_cost_limit by using an atomic similar to what > you suggested here for "did_rebalance". > > > > I've added in a shared lock for reading from wi_cost_limit in this > > > patch. However, AutoVacuumUpdateLimit() is called unconditionally in > > > vacuum_delay_point(), which is called quite often (per block-ish), so I > > > was trying to think if there is a way we could avoid having to check > > > this shared memory variable on every call to vacuum_delay_point(). > > > Rebalances shouldn't happen very often (done by the launcher when a new > > > worker is launched and by workers between vacuuming tables). Maybe we > > > can read from it less frequently? > > > > Yeah, acquiring the lwlock for every call to vacuum_delay_point() > > seems to be harmful. One idea would be to have one sig_atomic_t > > variable in WorkerInfoData and autovac_balance_cost() set it to true > > after rebalancing the worker's cost-limit. The worker can check it > > without locking and update its delay parameters if the flag is true. > > Instead of having the atomic indicate whether or not someone (launcher > or another worker) did a rebalance, it would simply store the current > cost limit. Then the worker can normally access it with a simple read. > > My rationale is that if we used an atomic to indicate whether or not we > did a rebalance ("did_rebalance"), we would have the same cache > coherency guarantees as if we just used the atomic for the cost limit. > If we read from the "did_rebalance" variable and missed someone having > written to it on another core, we still wouldn't get around to checking > the wi_cost_limit variable in shared memory, so it doesn't matter that > we bothered to keep it in shared memory and use a lock to access it. > > I noticed we don't allow wi_cost_limit to ever be less than 0, so we > could store wi_cost_limit in an atomic uint32. > > I'm not sure if it is okay to do pg_atomic_read_u32() and > pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in > most cases. I agree to use pg_atomic_uin32. Given that the comment of pg_atomic_unlocked_write_u32() says: * pg_atomic_compare_exchange_u32. This should only be used in cases where * minor performance regressions due to atomics emulation are unacceptable. I think pg_atomic_write_u32() is enough for our use case. > > I've implemented the atomic cost limit in the attached patch. Though, > I'm pretty unsure about how I initialized the atomics in > AutoVacuumShmemInit()... + /* initialize the WorkerInfo free list */ for (i = 0; i < autovacuum_max_workers; i++) dlist_push_head(&AutoVacuumShmem->av_freeWorkers, &worker[i].wi_links); + + dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers) + pg_atomic_init_u32( + &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit, + 0); + I think we can do like: /* initialize the WorkerInfo free list */ for (i = 0; i < autovacuum_max_workers; i++) { dlist_push_head(&AutoVacuumShmem->av_freeWorkers, &worker[i].wi_links); pg_atomic_init_u32(&(worker[i].wi_cost_limit)); } > > If the consensus is that it is simply too confusing to take > wi_cost_delay out of WorkerInfo, we might be able to afford using a > shared lock to access it because we won't call AutoVacuumUpdateDelay() > on every invocation of vacuum_delay_point() -- only when we've reloaded > the config file. > > One potential option to avoid taking a shared lock on every call to > AutoVacuumUpdateDelay() is to set a global variable to indicate that we > did update it (since we are the only ones updating it) and then only > take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true. > If we remove wi_cost_delay from WorkerInfo, probably we don't need to acquire the lwlock in AutoVacuumUpdateDelay()? The shared field we access in that function will be only wi_dobalance, but this field is updated only by its owner autovacuum worker. > > --- > > void > > AutoVacuumUpdateDelay(void) > > { > > - if (MyWorkerInfo) > > + /* > > + * We are using autovacuum-related GUCs to update > > VacuumCostDelay, so we > > + * only want autovacuum workers and autovacuum launcher to do this. > > + */ > > + if (!(am_autovacuum_worker || am_autovacuum_launcher)) > > + return; > > > > Is there any case where the autovacuum launcher calls > > AutoVacuumUpdateDelay() function? > > I had meant to add it to HandleAutoVacLauncherInterrupts() after > reloading the config file (done in attached patch). When using the > global variables for cost delay (instead of wi_cost_delay in worker > info), the autovac launcher also has to do the check in the else branch > of AutoVacuumUpdateDelay() > > VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ? > autovacuum_vac_cost_delay : VacuumCostDelay; > > to make sure VacuumCostDelay is correct for when it calls > autovac_balance_cost(). But doesn't the launcher do a similar thing at the beginning of autovac_balance_cost()? double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ? autovacuum_vac_cost_delay : VacuumCostDelay); Related to this point, I think autovac_balance_cost() should use globally-set cost_limit and cost_delay values to calculate worker's vacuum-delay parameters. IOW, vac_cost_limit and vac_cost_delay should come from the config file setting, not table option etc: int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ? autovacuum_vac_cost_limit : VacuumCostLimit); double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ? autovacuum_vac_cost_delay : VacuumCostDelay); If my understanding is right, the following change is not right; AutoVacUpdateLimit() updates the VacuumCostLimit based on the value in MyWorkerInfo: MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit; + AutoVacuumUpdateLimit(); /* do a balance */ autovac_balance_cost(); - /* set the active cost parameters from the result of that */ - AutoVacuumUpdateDelay(); Also, even when using the global variables for cost delay, the launcher doesn't need to check the global variable. It should always be able to use either autovacuum_vac_cost_delay/limit or VacuumCostDelay/Limit. > > This also made me think about whether or not we still need cost_limit_base. > It is used to ensure that autovac_balance_cost() never ends up setting > workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit > (or VacuumCostLimit). However, the launcher and all the workers should > know what the value is without cost_limit_base, no? Yeah, the current balancing algorithm looks to respect the cost_limit value set when starting to vacuum the table. The proportion of the amount of I/O that a worker can consume is calculated based on the base value and the new worker's cost_limit value cannot exceed the base value. Given that we're trying to dynamically tune worker's cost parameters (delay and limit), this concept seems to need to be updated. > > > > Also not sure how the patch interacts with failsafe autovac and parallel > > > vacuum. > > > > Good point. > > > > When entering the failsafe mode, we disable the vacuum delays (see > > lazy_check_wraparound_failsafe()). We need to keep disabling the > > vacuum delays even after reloading the config file. One idea is to > > have another global variable indicating we're in the failsafe mode. > > vacuum_delay_point() doesn't update VacuumCostActive if the flag is > > true. > > I think we might not need to do this. Other than in > lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in > two places: > > 1) in vacuum() which autovacuum will call per table. And failsafe is > reset per table as well. > > 2) in vacuum_delay_point(), but, since VacuumCostActive will already be > false when we enter vacuum_delay_point() the next time after > lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there. Indeed. But does it mean that there is no code path to turn vacuum-delay on, even when vacuum_cost_delay is updated from 0 to non-0? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: