Re: another autovacuum scheduling thread - Mailing list pgsql-hackers

From Robert Haas
Subject Re: another autovacuum scheduling thread
Date
Msg-id CA+TgmoYC4ShRp8vcyrBjkefSBdFfY1fnUgCvjocN-iq55G-7bA@mail.gmail.com
In response to Re: another autovacuum scheduling thread  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: another autovacuum scheduling thread
List pgsql-hackers
On Fri, Oct 10, 2025 at 1:31 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Here's a prototype of a "score" approach.  Two notes:
>
> * I've given special priority to anti-wraparound vacuums.  I think this is
> important to avoid focusing too much on bloat when wraparound is imminent.
> In any case, we need a separate wraparound score in case autovacuum is
> disabled.
>
> * I didn't include the analyze threshold in the score because it doesn't
> apply to TOAST tables, and therefore would artificially lower their
> priority.  Perhaps there is another way to deal with this.
>
> This is very much just a prototype of the basic idea.  As-is, I think it'll
> favor processing tables with lots of bloat unless we're in an
> anti-wraparound scenario.  Maybe that's okay.  I'm not sure how scientific
> we want to be about all of this, but I do intend to try some long-running
> tests.

I think this is a reasonable starting point, although I'm surprised
that you chose to combine the sub-scores using + rather than Max.
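
As a rough illustration of how the two ways of combining can order
tables differently, here is a small Python sketch. The sub-scores are
made up, and the pairs are not produced by the patch.

```python
# Made-up sub-scores as (wraparound, bloat) pairs; not from the patch.
tables = {
    "both_moderate": (0.6, 0.6),
    "one_severe": (0.9, 0.0),
}

# Rank the tables, highest score first, under each combining rule.
by_sum = sorted(tables, key=lambda name: sum(tables[name]), reverse=True)
by_max = sorted(tables, key=lambda name: max(tables[name]), reverse=True)

print(by_sum)
print(by_max)
```

With +, a table that is moderately bad on both axes outranks one that
is worse on a single axis; with Max, the order is reversed.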

I think it will take a lot of experimentation to figure out whether
this particular algorithm (or any other) works well in practice. My
intuition (for whatever that is worth to you, which may not be much)
is that what will anger users is cases where we ignore a horrible
problem to deal with a routine problem. Figuring out how to design the
scoring system to avoid such outcomes is the hard part of this
problem, IMHO. For this particular algorithm, the main hazards that
spring to mind for me are:

- The wraparound score can't be more than about 10, but the bloat
score could be arbitrarily large, especially for tables with few
tuples, so there may be lots of cases in which the wraparound score
has no impact on the behavior.

- The patch attempts to guard against this by disregarding the
non-wraparound portion of the score once the wraparound portion
reaches 1.0, but that results in an abrupt behavior shift at that
point. Suddenly we go from mostly ignoring the wraparound score to
entirely ignoring the bloat score. This might result in the system
abruptly ignoring tables that are bloating extremely rapidly in favor
of trying to catch up in a wraparound situation that is not yet
terribly urgent.
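
To make the two hazards concrete, here is a toy model in Python. It is
not the patch's code: the formulas, the constants, and the names
wraparound_score, bloat_score, and priority_key are all assumptions
chosen for illustration. With an assumed freeze limit of 200 million,
an XID age near 2 billion gives a wraparound score of about 10, which
is consistent with the bound mentioned above.

```python
# Toy model only. The formulas and constants are assumptions made for
# illustration; they are not taken from the patch.
FREEZE_MAX_AGE = 200_000_000  # assumed freeze limit, in transactions
VAC_BASE_THRESHOLD = 50       # assumed fixed part of the vacuum threshold
VAC_SCALE_FACTOR = 0.2        # assumed per-tuple part of the vacuum threshold


def wraparound_score(xid_age):
    """XID age as a multiple of the assumed freeze limit."""
    return xid_age / FREEZE_MAX_AGE


def bloat_score(dead_tuples, reltuples):
    """Dead tuples as a multiple of the assumed vacuum threshold."""
    threshold = VAC_BASE_THRESHOLD + VAC_SCALE_FACTOR * reltuples
    return dead_tuples / threshold


def priority_key(xid_age, dead_tuples, reltuples):
    """Sort key; larger sorts first.

    Below 1.0 the sub-scores are summed. At 1.0 and above, the bloat
    score is dropped and the table jumps ahead of all others.
    """
    wrap = wraparound_score(xid_age)
    if wrap >= 1.0:
        return (1, wrap)
    return (0, wrap + bloat_score(dead_tuples, reltuples))


# A tiny table that bloats quickly: threshold is 50 + 0.2 * 100 = 70.
hot_small = priority_key(xid_age=150_000_000, dead_tuples=100_000, reltuples=100)
# A large, quiet table that is close to the freeze limit.
old_quiet = priority_key(xid_age=199_000_000, dead_tuples=0, reltuples=1_000_000)
# The same quiet table once it crosses the limit.
old_quiet_over = priority_key(xid_age=200_000_000, dead_tuples=0, reltuples=1_000_000)

print(hot_small > old_quiet)       # bloat swamps the wraparound score
print(old_quiet_over > hot_small)  # crossing the limit flips the order
```

In this model the small table's bloat score is in the thousands, so a
wraparound score just under 1.0 barely registers. Once the quiet table
reaches 1.0, it outranks the bloating table regardless of how large
the bloat score is.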

When I've thought about this problem -- and I can't claim to have
thought about it very hard -- it's seemed to me that we need to (1)
somehow normalize everything to somewhat similar units and (2) make
sure that severe wraparound danger always wins over every other
consideration, but mild wraparound danger can lose to severe bloat.
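
One possible shape for (1) and (2), sketched in Python under stated
assumptions: both sub-scores are already expressed as multiples of
their own thresholds, the bloat score is capped, and the two are
combined with Max. The cap value BLOAT_CAP and the function
capped_score are hypothetical, and this is only one of many ways to
get the property, not a worked-out proposal.

```python
BLOAT_CAP = 5.0  # hypothetical: the largest score bloat alone may produce


def capped_score(wrap_score, bloat_score):
    """Combine sub-scores with max after capping the bloat score.

    Any wraparound score above BLOAT_CAP beats every bloat score, while
    a wraparound score below the cap can still lose to heavy bloat.
    """
    return max(wrap_score, min(bloat_score, BLOAT_CAP))


# Mild wraparound danger and no bloat, versus severe bloat alone.
mild_wrap = capped_score(wrap_score=1.2, bloat_score=0.0)
severe_bloat = capped_score(wrap_score=0.1, bloat_score=1400.0)
# Severe wraparound danger and no bloat.
severe_wrap = capped_score(wrap_score=8.0, bloat_score=0.0)

print(severe_bloat > mild_wrap)    # severe bloat beats mild wraparound
print(severe_wrap > severe_bloat)  # severe wraparound beats any bloat
```

The cap also removes the abrupt switch at 1.0: the wraparound score
takes over gradually as it grows, rather than all at once.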

--
Robert Haas
EDB: http://www.enterprisedb.com


