On Wed, Oct 08, 2025 at 01:37:22PM -0400, Andres Freund wrote:
> On 2025-10-08 10:18:17 -0500, Nathan Bossart wrote:
>> The attached patch works by storing the maximum of the XID age and the MXID
>> age in the list with the OIDs and sorting it prior to processing.
>
> I think it may be worth trying to avoid reliably using the same order -
> otherwise e.g. a corrupt index on the first scheduled table can cause
> autovacuum to reliably fail on the same relation, never allowing it to
> progress past that point.

Hm.  What if we kept a short array of "failed" tables in shared memory?
Each worker would consult this array before processing a table.  If the
table is listed there, the worker would remove it from the array and skip
processing it.  Then the next worker would try processing the table again.
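
A rough, untested sketch of that bookkeeping in plain C, outside the
backend. Every name here (failed_tables, MAX_FAILED_TABLES, both
functions) is made up for illustration, and Oid/InvalidOid are local
stand-ins for the PostgreSQL definitions. It also assumes a worker records
the OID when processing fails, and that the real array would live in
shared memory with a lock held around each call.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;           /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)        /* marks an empty slot */
#define MAX_FAILED_TABLES 8         /* hypothetical size of the "short array" */

/*
 * Hypothetical list of tables whose last autovacuum attempt failed.
 * Assumption: a real patch would keep this in shared memory and take a
 * lock around every access; this sketch is single-process only.
 */
static Oid failed_tables[MAX_FAILED_TABLES];

/*
 * Record that processing relid failed.  Returns false if the array is
 * full, in which case the failure is simply not remembered.
 */
static bool
failed_tables_add(Oid relid)
{
    assert(relid != InvalidOid);

    /* Already listed: nothing to do. */
    for (int i = 0; i < MAX_FAILED_TABLES; i++)
    {
        if (failed_tables[i] == relid)
            return true;
    }

    /* Otherwise take the first free slot. */
    for (int i = 0; i < MAX_FAILED_TABLES; i++)
    {
        if (failed_tables[i] == InvalidOid)
        {
            failed_tables[i] = relid;
            return true;
        }
    }
    return false;
}

/*
 * Called by a worker before processing relid.  If relid is listed, remove
 * it and return true, meaning "skip it this time".  Because the entry is
 * removed, the next worker that reaches relid will try it again.
 */
static bool
failed_tables_check_and_clear(Oid relid)
{
    for (int i = 0; i < MAX_FAILED_TABLES; i++)
    {
        if (failed_tables[i] == relid)
        {
            failed_tables[i] = InvalidOid;
            return true;
        }
    }
    return false;
}
```

With this shape a failing table is skipped at most once per recorded
failure, so it is retried on every other attempt and cannot block the
rest of the list indefinitely.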

I also wonder how hard it would be to gracefully catch the error and let
the worker continue with the rest of its list...

--
nathan