BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon |
Date | |
Msg-id | 17973-bca1f7d5c14f601e@postgresql.org |
Responses | Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference: 17973
Logged by: Will Mortensen
Email address: will@extrahop.com
PostgreSQL version: 15.3
Operating system: Ubuntu 22.04
Description:

My colleague Jacob Speidel (jacob@extrahop.com) and I have diagnosed a race condition that, under certain conditions, can cause the autovacuum daemon to stop launching autovacuum workers until the autovacuum daemon (or the whole server) is restarted. This obviously causes serious problems for the server. We've reproduced this on both 15.2 and REL_15_STABLE (git commit bd590d1fea1ba9245c791d589eea94d2dbad5a2b). We've never seen a similar issue on 14 despite very similar conditions occurring frequently. We haven't yet tried on 16.

In Jacob's repro, the problem begins when AutoVacWorkerMain() is in InitPostgres() and some other backend drops the database. Specifically, the autovacuum worker's InitPostgres() is just about to obtain the lock at https://github.com/postgres/postgres/blob/bd590d1fea1ba9245c791d589eea94d2dbad5a2b/src/backend/utils/init/postinit.c#L1012 . The backend dropping the DB marks the DB's pgstats entry as dropped but can’t free it because its refcount is nonzero, so AtEOXact_PgStat_DroppedStats() calls pgstat_request_entry_refs_gc(). The autovacuum worker's InitPostgres() proceeds to call GetDatabaseTuple() and notices the database has been dropped, at some point calling pgstat_gc_entry_refs() to release its reference to the DB's pgstats entry, and the worker decides to exit with a fatal error.

But while exiting, it calls pgstat_report_stat(), which calls pgstat_update_dbstats() for the dropped DB, which calls pgstat_prep_database_pending(), which calls pgstat_prep_pending_entry(), which calls pgstat_get_entry_ref() with create == true, which calls pgstat_reinit_entry() against the DB's pgstats entry. This sets dropped to false on that entry. Finally, the autovacuum worker exits.
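The sequence above can be replayed with a small toy model. This is only a sketch in Python, not PostgreSQL code: `StatsEntry`, `StatsTable`, `replay_race` and the OID 16384 are hypothetical names and values invented for illustration, and the model assumes that at least one reference to the entry is still held when the exiting worker reports its stats, so the entry is still present to be reinitialized.

```python
from dataclasses import dataclass


@dataclass
class StatsEntry:
    """Toy stand-in for a shared stats entry (hypothetical, not the real struct)."""
    refcount: int = 0
    dropped: bool = False
    counters: int = 0


class StatsTable:
    """Toy stand-in for the shared stats hash table."""

    def __init__(self):
        self.entries = {}

    def get_entry_ref(self, dboid, create):
        """Look up an entry; with create=True a dropped entry is reinitialized."""
        entry = self.entries.get(dboid)
        if entry is None:
            if not create:
                return None
            entry = self.entries[dboid] = StatsEntry()
        elif entry.dropped:
            if not create:
                return None
            # Models the reinit step from the report: the dropped flag is cleared.
            entry.dropped = False
            entry.counters = 0
        entry.refcount += 1
        return entry

    def drop(self, dboid):
        """Mark the entry dropped; free it only if nobody references it."""
        entry = self.entries[dboid]
        entry.dropped = True
        if entry.refcount == 0:
            del self.entries[dboid]
            return True
        return False  # still referenced: the real code would request a refs GC


def replay_race():
    table = StatsTable()
    db = 16384  # arbitrary illustrative OID
    table.get_entry_ref(db, create=True)   # some process holds a reference
    freed = table.drop(db)                 # DROP DATABASE in another backend
    dropped_after_drop = table.entries[db].dropped
    table.get_entry_ref(db, create=True)   # exiting worker reports its stats
    dropped_after_report = table.entries[db].dropped
    return freed, dropped_after_drop, dropped_after_report


print(replay_race())
```

In this model the entry is not freed by the drop, is flagged as dropped, and then loses the flag as soon as any caller asks for it with create set, which is the state the report describes.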
The fact that a dropped database now indefinitely has a pgstats entry with dropped == false seemingly violates some assumptions and confuses the autovacuum daemon. In particular, rebuild_database_list() will forever include it in DatabaseList and take it into account when computing the adl_next_worker for each DB, but do_start_worker() won’t consider the DB because it's never returned by get_database_list(). In our repros, this mismatch causes do_start_worker() to get stuck never processing any DB: in particular, it always sees that for each non-dropped database adl_next_worker is in the future but within the next autovacuum_naptime, i.e. skipit = true. This causes do_start_worker() to call rebuild_database_list() at the end, which again miscomputes adl_next_worker and pushes it further into the future, so that the situation repeats on the next call to do_start_worker(), and so on indefinitely. That’s the crux of our issue.

Please let us know if any clarification or more detailed repro steps are needed. Our repro patch just sleeps before and after the LockSharedObject() call in InitPostgres() (to widen the race windows) and adds a lot of logging. (Jacob did >90% of the debugging here; I merely determined how the pgstats entry lost its dropped flag.)

We assume that one fix would be to somehow ensure that the dropped flag remains true on a dropped database’s pgstats entry until it’s freed, but also, it seems a bit fragile for autovacuum’s do_start_worker() to sometimes call rebuild_database_list() and delay all the adl_next_worker times. Without thinking about it too hard, we wonder if there would still be a pattern of ongoing DB creates and drops that could cause it to misbehave in a similar way, never deciding to autovacuum any database even if one lives long enough that it should be autovacuumed.
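The scheduling loop described above (every live database skipped, then the list rebuilt with later times) can be illustrated with a rough simulation. This is a sketch under stated assumptions rather than the launcher's actual code: the Python functions only borrow the names from the report, `simulate` and the "ghost" database are hypothetical, the ghost is assumed to hold the soonest slot in the list, the launcher is assumed to sleep until the soonest adl_next_worker, and the choice among eligible databases is simplified to "first eligible".

```python
NAPTIME = 60.0  # stands in for autovacuum_naptime, in seconds


def rebuild_database_list(order, now):
    """Spread next-worker times evenly over one naptime, soonest first."""
    step = NAPTIME / len(order)
    return [(db, now + (i + 1) * step) for i, db in enumerate(order)]


def do_start_worker(schedule, live_dbs, now):
    """Return (chosen_db_or_None, new_schedule) for one launcher wakeup."""
    next_worker = dict(schedule)
    skipit = False
    eligible = []
    for db in live_dbs:  # only databases that still exist are considered
        skipit = False
        t = next_worker.get(db)
        # Skip a database whose slot is in the future but within one naptime.
        if t is not None and now < t < now + NAPTIME:
            skipit = True
        if not skipit:
            eligible.append(db)
    if eligible:
        chosen = eligible[0]  # simplification: the real choice uses more criteria
        rest = [(d, t) for d, t in schedule if d != chosen]
        return chosen, rest + [(chosen, now + NAPTIME)]
    if skipit:
        # Everything was skipped: rebuild, which keeps the ghost in the list
        # because its stats entry still looks alive.
        return None, rebuild_database_list([d for d, _ in schedule], now)
    return None, schedule


def simulate(live_dbs, ghost, wakeups):
    """Run the launcher loop; ghost is a dropped DB with a live-looking entry."""
    now = 0.0
    order = ([ghost] if ghost is not None else []) + list(live_dbs)
    schedule = rebuild_database_list(order, now)
    launched = []
    for _ in range(wakeups):
        now = schedule[0][1]  # sleep until the soonest slot comes due
        chosen, schedule = do_start_worker(schedule, live_dbs, now)
        if chosen is not None:
            launched.append(chosen)
    return launched


print(simulate(["a", "b"], None, 4))      # workers alternate between a and b
print(simulate(["a", "b"], "ghost", 10))  # no worker is ever launched
```

In this model the launcher always wakes for the ghost's slot, finds every live database scheduled slightly in the future, and rebuilds the list, so no database ever comes due.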