Re: another autovacuum scheduling thread - Mailing list pgsql-hackers

From Andres Freund
Subject Re: another autovacuum scheduling thread
Msg-id o33hdbfnosn7pw5e3a34jdtfoaxih6vwbe6rf7bo6ocbn4zv4l@incbnjupgq4o
In response to Re: another autovacuum scheduling thread  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
Hi,

On 2025-10-09 11:01:16 -0500, Nathan Bossart wrote:
> On Wed, Oct 08, 2025 at 01:37:22PM -0400, Andres Freund wrote:
> > On 2025-10-08 10:18:17 -0500, Nathan Bossart wrote:
> >> The attached patch works by storing the maximum of the XID age and the MXID
> >> age in the list with the OIDs and sorting it prior to processing.
> > 
> > I think it may be worth trying to avoid reliably using the same order -
> > otherwise e.g. a corrupt index on the first scheduled table can cause
> > autovacuum to reliably fail on the same relation, never allowing it to
> > progress past that point.
> 
> Hm.  What if we kept a short array of "failed" tables in shared memory?

I've thought about having that as part of pgstats...


> Each worker would consult this table before processing.  If the table is
> there, it would remove it from the shared table and skip processing it.
> Then the next worker would try processing the table again.
> 
> I also wonder how hard it would be to gracefully catch the error and let
> the worker continue with the rest of its list...

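The skip-once scheme quoted above could look roughly like the sketch
below. This is only an illustration: FailedTables, MAX_FAILED_TABLES,
failed_tables_add() and failed_tables_should_skip() are made-up names,
Oid is a local stand-in for the PostgreSQL type, and the locking that a
real shared-memory structure would need is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;          /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)
#define MAX_FAILED_TABLES 8        /* hypothetical size of the "short array" */

/* Hypothetical shared array; a zeroed struct has every slot free. */
typedef struct FailedTables
{
    Oid relids[MAX_FAILED_TABLES]; /* InvalidOid marks a free slot */
} FailedTables;

/*
 * Record that processing relid failed.  Returns false only if the table
 * is not already listed and no free slot is left.
 */
static bool
failed_tables_add(FailedTables *ft, Oid relid)
{
    /* Avoid duplicate entries for the same table. */
    for (size_t i = 0; i < MAX_FAILED_TABLES; i++)
    {
        if (ft->relids[i] == relid)
            return true;
    }

    for (size_t i = 0; i < MAX_FAILED_TABLES; i++)
    {
        if (ft->relids[i] == InvalidOid)
        {
            ft->relids[i] = relid;
            return true;
        }
    }
    return false;
}

/*
 * Called by a worker before processing relid.  If the table is listed,
 * remove it and return true so this worker skips it; the next worker to
 * ask finds no entry and tries the table again.
 */
static bool
failed_tables_should_skip(FailedTables *ft, Oid relid)
{
    for (size_t i = 0; i < MAX_FAILED_TABLES; i++)
    {
        if (ft->relids[i] == relid)
        {
            ft->relids[i] = InvalidOid;
            return true;
        }
    }
    return false;
}
```

Note that an entry only ever appears if some worker gets the chance to
call failed_tables_add(), i.e. if the failure is actually observed.
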
The main cases I've seen are workers getting hung up permanently in
corrupt indexes. There is never actually an error; the autovacuums just
get terminated as part of whatever independent reason there is to
restart. The problem is that vacuum then never actually fails, so there
is nothing to catch or record...
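
To make the ordering concern concrete, here is a minimal sketch of
sorting by the maximum of the XID age and the MXID age, as the patch
description quoted at the top puts it. It is not the patch's code:
TableAge, table_priority(), table_age_cmp() and sort_tables_by_age()
are invented for illustration, and the ages are plain integers.

```c
#include <stdint.h>
#include <stdlib.h>

typedef unsigned int Oid;      /* stand-in for PostgreSQL's Oid */

/* Hypothetical list entry: a table plus the two ages being compared. */
typedef struct TableAge
{
    Oid      relid;
    uint32_t xid_age;
    uint32_t mxid_age;
} TableAge;

/* Priority is whichever of the two ages is larger. */
static uint32_t
table_priority(const TableAge *t)
{
    return (t->xid_age > t->mxid_age) ? t->xid_age : t->mxid_age;
}

/* qsort comparator: larger priority (older table) sorts first. */
static int
table_age_cmp(const void *a, const void *b)
{
    uint32_t pa = table_priority((const TableAge *) a);
    uint32_t pb = table_priority((const TableAge *) b);

    if (pa > pb)
        return -1;
    if (pa < pb)
        return 1;
    return 0;
}

/* Sort the worker's list so the oldest table is processed first. */
static void
sort_tables_by_age(TableAge *tables, size_t ntables)
{
    qsort(tables, ntables, sizeof(TableAge), table_age_cmp);
}
```

With unchanged input ages the head of the list is the same on every
run, so a worker that is killed while stuck on that table starts at the
same table again after the restart.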

Greetings,

Andres Freund


