Vacuuming leaked temp tables (once again) - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Vacuuming leaked temp tables (once again) |
Date | |
Msg-id | 3507.1214581513@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: Vacuuming leaked temp tables (once again)
Re: Vacuuming leaked temp tables (once again) Re: Vacuuming leaked temp tables (once again) |
List | pgsql-hackers |
This thread http://archives.postgresql.org/pgsql-hackers/2008-01/msg00134.php kind of wandered off into the weeds after identifying a semi-related bug in CLUSTER, but the original problem still remains: if a backend crashes after creating some temp tables, the tables remain present. Such tables will get recycled next time someone reuses the same pg_temp_NNN schema. But if the failed backend had been occupying an unusually high-numbered BackendId slot, then its pg_temp_NNN schema might go unused for a long time --- long enough for the temp tables to pose an xid-wraparound problem. There's another report of this issue today in pgsql-general. The only solution proposed in that thread was to auto-delete temp tables at postmaster restart; which I opposed on the grounds that throwing away data right after a crash was a terrible idea from a forensic standpoint. I still think that, but I had another idea about how to cope with the situation. It's reasonably easy to tell (by looking into the sinval state) whether a given BackendId slot is actually in use, so we could detect whether a temp table actually belongs to a live backend or not. What I'm thinking is we should adjust autovacuum so that it will apply anti-wraparound vacuuming operations even to temp tables, if they belong to pg_temp schemas that belong to inactive BackendId slots. This'd fix the wraparound issue without any risk of discarding data that someone might want back. Note that this should be safe even if someone claims the pg_temp_NNN schema and tries to drop the old temp table while we're vacuuming it. Operations on temp tables take the normal types of locks, so that will get interlocked properly. A small hole in this idea is that the BackendId slot might be occupied by some new backend that actually hasn't created any temp tables yet (hence not "taken possession" of the pg_temp_NNN schema). We could fix that by making each backend's has-temp-tables state globally visible. However, I'm inclined to think it's not really an issue, because you wouldn't get into trouble unless this was always the case over many repeated autovacuum visits to the table, which seems pretty improbable. Another issue is that leftover temp tables would be significantly more likely to be self-inconsistent than normal tables, since operations on them are not WAL-logged and it's entirely likely that the owning backend crashed with some dirty pages not written out from its local buffers. AFAICS this shouldn't be any big problem for vacuuming the table proper, since heap pages are pretty independent, at least at the level understood by plain vacuum. There is a risk that indexes would be corrupt enough to make vacuum error out, thus preventing the xid wraparound cleanup from completing. But that leaves us no worse off than we are now, and at least there would be signs of distress in the postmaster log for the DBA to see. Or we could have autovacuum just drop orphaned temp tables, *if* they have gotten old enough to need anti-wraparound vacuuming. While I'm still uncomfortable with having autovac drop anything, at least this would avoid the worst cases of "gee I really needed that data to investigate the crash". The main attractions of this idea are avoiding the corrupt-index issue and not doing vacuuming work that's 99.99% sure to be useless. Thoughts? regards, tom lane
pgsql-hackers by date: