Re: Should vacuum process config file reload more often - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Should vacuum process config file reload more often |
Date | |
Msg-id | CAD21AoCn3-tAB1HgNXY-_B9yaRES0u2gyifw=ABchNV8FkswZg@mail.gmail.com Whole thread Raw |
In response to | Re: Should vacuum process config file reload more often (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Should vacuum process config file reload more often
|
List | pgsql-hackers |
On Fri, Mar 24, 2023 at 9:27 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > > Do we need to calculate the number of workers running with > > nworkers_for_balance by iterating over the running worker list? I > > guess autovacuum workers can increment/decrement it at the beginning > > and end of vacuum. > > I don't think we can do that because if a worker crashes, we have no way > of knowing if it had incremented or decremented the number, so we can't > adjust for it. What kind of crash are you concerned about? If a worker raises an ERROR, we can catch it in PG_CATCH() block. If it's a FATAL, we can do that in FreeWorkerInfo(). A PANIC error ends up crashing the entire server. > > > > > > > > Also not sure how the patch interacts with failsafe autovac and parallel > > > > > > > vacuum. > > > > > > > > > > > > Good point. > > > > > > > > > > > > When entering the failsafe mode, we disable the vacuum delays (see > > > > > > lazy_check_wraparound_failsafe()). We need to keep disabling the > > > > > > vacuum delays even after reloading the config file. One idea is to > > > > > > have another global variable indicating we're in the failsafe mode. > > > > > > vacuum_delay_point() doesn't update VacuumCostActive if the flag is > > > > > > true. > > > > > > > > > > I think we might not need to do this. Other than in > > > > > lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in > > > > > two places: > > > > > > > > > > 1) in vacuum() which autovacuum will call per table. And failsafe is > > > > > reset per table as well. > > > > > > > > > > 2) in vacuum_delay_point(), but, since VacuumCostActive will already be > > > > > false when we enter vacuum_delay_point() the next time after > > > > > lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there. > > > > > > > > Indeed. But does it mean that there is no code path to turn > > > > vacuum-delay on, even when vacuum_cost_delay is updated from 0 to > > > > non-0? > > > > > > Ah yes! Good point. This is true. > > > I'm not sure how to cheaply allow for re-enabling delays after disabling > > > them in the middle of a table vacuum. > > > > > > I don't see a way around checking if we need to reload the config file > > > on every call to vacuum_delay_point() (currently, we are only doing this > > > when we have to wait anyway). It seems expensive to do this check every > > > time. If we do do this, we would update VacuumCostActive when updating > > > VacuumCostDelay, and we would need a global variable keeping the > > > failsafe status, as you mentioned. > > > > > > It could be okay to say that you can only disable cost-based delays in > > > the middle of vacuuming a table (i.e. you cannot enable them if they are > > > already disabled until you start vacuuming the next table). Though maybe > > > it is weird that you can increase the delay but not re-enable it... > > > > On Mon, Mar 20, 2023 at 1:48 AM Melanie Plageman > > <melanieplageman@gmail.com> wrote: > > > So, I thought about it some more, and I think it is a bit odd that you > > > can increase the delay and limit but not re-enable them if they were > > > disabled. And, perhaps it would be okay to check ConfigReloadPending at > > > the top of vacuum_delay_point() instead of only after sleeping. It is > > > just one more branch. We can check if VacuumCostActive is false after > > > checking if we should reload and doing so if needed and return early. > > > I've implemented that in attached v6. > > > > > > I added in the global we discussed for VacuumFailsafeActive. If we keep > > > it, we can probably remove the one in LVRelState -- as it seems > > > redundant. Let me know what you think. > > > > I think the following change is related: > > > > - if (!VacuumCostActive || InterruptPending) > > + if (InterruptPending || VacuumFailsafeActive || > > + (!VacuumCostActive && !ConfigReloadPending)) > > return; > > > > + /* > > + * Reload the configuration file if requested. This allows changes to > > + * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to > > + * take effect while a table is being vacuumed or analyzed. > > + */ > > + if (ConfigReloadPending && !analyze_in_outer_xact) > > + { > > + ConfigReloadPending = false; > > + ProcessConfigFile(PGC_SIGHUP); > > + AutoVacuumUpdateDelay(); > > + AutoVacuumUpdateLimit(); > > + } > > > > It makes sense to me that we need to reload the config file even when > > vacuum-delay is disabled. But I think it's not convenient for users > > that we don't reload the configuration file once the failsafe is > > triggered. I think users might want to change some GUCs such as > > log_autovacuum_min_duration. > > Ah, okay. Attached v7 has this change (it reloads even if failsafe is > active). > > > > And, I was wondering if it was worth trying to split up the part that > > > reloads the config file and all of the autovacuum stuff. The reloading > > > of the config file by itself won't actually result in autovacuum workers > > > having updated cost delays because of them overwriting it with > > > wi_cost_delay, but it will allow VACUUM to have those updated values. > > > > It makes sense to me to have changes for overhauling the rebalance > > mechanism in a separate patch. > > > > Looking back at the original concern you mentioned[1]: > > > > speed up long-running vacuum of a large table by > > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the > > config file is only reloaded between tables (for autovacuum) or after > > the statement (for explicit vacuum). > > > > does it make sense to have autovac_balance_cost() update workers' > > wi_cost_delay too? Autovacuum launcher already reloads the config file > > and does the rebalance. So I thought autovac_balance_cost() can update > > the cost_delay as well, and this might be a minimal change to deal > > with your concern. This doesn't have the effect for manual VACUUM but > > since vacuum delay is disabled by default it won't be a big problem. > > As for manual VACUUMs, we would need to reload the config file in > > vacuum_delay_point() as the part of your patch does. Overhauling the > > rebalance mechanism would be another patch to improve it further. > > So, we can't do this without acquiring an access shared lock on every > call to vacuum_delay_point() because cost delay is a double. > > I will work on a patchset with separate commits for reloading the config > file, though (with autovac not benefitting in the first commit). > > On Thu, Mar 23, 2023 at 12:24 PM Daniel Gustafsson <daniel@yesql.se> wrote: > > > > +bool VacuumFailsafeActive = false; > > This needs documentation, how it's used and how it relates to failsafe_active > > in LVRelState (which it might replace(?), but until then). > > Thanks! I've removed LVRelState->failsafe_active. > > I've also separated the VacuumFailsafeActive change into its own commit. @@ -492,6 +493,7 @@ vacuum(List *relations, VacuumParams *params, in_vacuum = true; VacuumCostActive = (VacuumCostDelay > 0); + VacuumFailsafeActive = false; VacuumCostBalance = 0; VacuumPageHit = 0; VacuumPageMiss = 0; I think we need to reset VacuumFailsafeActive also in PG_FINALLY() block in vacuum(). One comment on 0002 patch: + /* + * Reload the configuration file if requested. This allows changes to + * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to + * take effect while a table is being vacuumed or analyzed. + */ + if (ConfigReloadPending && !analyze_in_outer_xact) + { + ConfigReloadPending = false; + ProcessConfigFile(PGC_SIGHUP); + AutoVacuumUpdateDelay(); + AutoVacuumUpdateLimit(); + } I think we need comments on why we don't reload the config file if we're analyzing a table in a user transaction. > > > I need another few readthroughs to figure out of VacuumFailsafeActive does what > > I think it does, and should be doing, but in general I think this is a good > > idea and a patch in good condition close to being committable. Another approach would be to make VacuumCostActive a ternary value: on, off, and never. When we trigger the failsafe mode we switch it to never, meaning that it never becomes active even after reloading the config file. A good point is that we don't need to add a new global variable, but I'm not sure it's better than the current approach. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: