On Thu, 9 Oct 2025 at 14:47, Jeremy Schneider <schneider@ardentperf.com> wrote:
>
> On Wed, 8 Oct 2025 18:25:20 -0700
> Jeremy Schneider <schneider@ardentperf.com> wrote:
> > If users are tuning this thing then I feel like we've already lost the
> > battle :)
>
> I replied too quickly. Re-reading your email, I think you're proposing a
> different algorithm, taking tuple counts into account. No tunables. Is
> there a fully fleshed out version of the proposed alternative algorithm
> somewhere? (one of the older threads?) I guess this is why it's so hard
> to get anything committed in this area...
It's along the lines of the "1a)" from [1]. I don't think that post
does a great job of explaining it.
I think the best way to understand it is to look at
relation_needs_vacanalyze() and see how it calculates values for the
boolean output parameters. Instead of calculating just a boolean
value, it would calculate a float4, where < 1.0 means don't do the
operation and anything >= 1.0 means do the operation. For example,
let's say a table has 600 dead rows and the scale factor and
threshold settings mean that autovacuum will trigger at 200 dead
tuples (3 times more dead tuples than the trigger point). That would
result in a score of 3.0 (600 / 200). The relfrozenxid portion is
basically age(relfrozenxid) / autovacuum_freeze_max_age (plus the
same calculation for the multixact age, taking the maximum of the
two). The overall priority for autovacuum would then be the maximum
of those component scores.
Effectively, it's a method of aligning the different units of
measure, transactions or tuples, into a single value, calculated from
the very same values that we use today to trigger autovacuums.
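
To make that concrete, here's a minimal standalone sketch of the
scoring idea. The function and parameter names below are made up for
illustration only; this isn't how relation_needs_vacanalyze() is
actually structured, it just shows the arithmetic described above:

/*
 * Hypothetical sketch of the priority score described above.  All names
 * here are invented for illustration; the real logic would live in
 * relation_needs_vacanalyze() and use the GUCs/reloptions it already
 * consults.
 */
#include <stdio.h>

static float
Max2(float a, float b)
{
    return (a > b) ? a : b;
}

/*
 * Scores < 1.0 mean "don't vacuum yet"; scores >= 1.0 mean "vacuum",
 * with larger values indicating higher priority.
 */
static float
vacuum_priority(double dead_tuples,      /* n_dead_tup for the table */
                double vac_threshold,    /* threshold + scale_factor * reltuples */
                double xid_age,          /* age(relfrozenxid) */
                double freeze_max_age,   /* autovacuum_freeze_max_age */
                double mxid_age,         /* mxid_age(relminmxid) */
                double mxid_freeze_max_age) /* autovacuum_multixact_freeze_max_age */
{
    float   dead_score = dead_tuples / vac_threshold;
    float   xid_score = xid_age / freeze_max_age;
    float   mxid_score = mxid_age / mxid_freeze_max_age;

    /* wraparound component is the worse of the xid and mxid ratios */
    float   freeze_score = Max2(xid_score, mxid_score);

    /* overall priority is the maximum of the component scores */
    return Max2(dead_score, freeze_score);
}

int
main(void)
{
    /* the example from the text: 600 dead tuples, trigger point of 200 */
    float   p = vacuum_priority(600, 200, 50000000, 200000000, 0, 400000000);

    printf("priority = %.2f\n", p);  /* prints 3.00: vacuum, fairly urgent */
    return 0;
}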
David
[1] https://postgr.es/m/CAApHDvo8DWyt4CWhF=NPeRstz_78SteEuuNDfYO7cjp=7YTK4g@mail.gmail.com