Re: [HACKERS] Rethinking autovacuum.c memory handling - Mailing list pgsql-hackers
From | Alvaro Herrera |
---|---|
Subject | Re: [HACKERS] Rethinking autovacuum.c memory handling |
Date | |
Msg-id | 20170923103050.decvu4b3qqnjti67@alvherre.pgsql Whole thread Raw |
In response to | [HACKERS] Rethinking autovacuum.c memory handling (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Rethinking autovacuum.c memory handling
|
List | pgsql-hackers |
Tom Lane wrote: > I notice that autovacuum.c calls autovacuum_do_vac_analyze, and > thereby vacuum(), in TopTransactionContext. This doesn't seem > like a terribly great idea, because it doesn't correspond to what > happens during a manually-invoked vacuum. TopTransactionContext > will go away when vacuum() commits the outer transaction, whereas > in non-autovac usage, we call vacuum() in a PortalHeapMemory > context that is not a child of TopTransactionContext and is not > at risk of being reset multiple times during the vacuum(). This'd > be a hazard if autovacuum_do_vac_analyze or vacuum did any palloc's > before getting to the main loop. More generally, I'm not aware of > other cases where we invoke a function in a context that we know > that function will destroy as it executes. Oh, good catch. This must be a very old oversight -- I bet it goes all the way back to the introduction of autovacuum. I think if there were any actual bugs, we would have noticed by now. > I don't see any live bug associated with this in HEAD, but this behavior > requires a rather ugly (and memory-leaking) workaround in the proposed > patch to allow multiple vacuum target rels. Well, it's definitely broken, so I'd vote for fixing it even if we weren't considering that patch. > What I think we should do instead is invoke autovacuum_do_vac_analyze > in the PortalContext that do_autovacuum has created, which we already > have a mechanism to reset once per table processed in do_autovacuum. Sounds sensible. > The attached patch does that, and also modifies perform_work_item() > to use the same approach. Right now perform_work_item() has a > copied-and-pasted MemoryContextResetAndDeleteChildren(PortalContext) > call in its error recovery path, but that seems a bit out of place > given that perform_work_item() isn't using PortalContext otherwise. Oops :-( > PS: I was disappointed to find out that perform_work_item() isn't > exercised at all in the standard regression tests. Yeah ... It's fairly simple to create a test that will invoke it a few times, but one problem is that it depends on autovacuum's timing. If I add this in brin.sql -- Test BRIN work items CREATE TABLE brin_work_items (LIKE brintest) WITH (fillfactor = 10); CREATE INDEX brin_work_items_idx ON brin_work_items USING brin (textcol) WITH (autosummarize = on, pages_per_range=1); INSERT INTO brin_work_items SELECT * FROM brintest; the work item is only performed about 15 seconds after the complete regression tests are finished, which I fear would mean "never" in practice. One idea I just had is to somehow add it to src/test/modules/brin, and set up the postmaster in that test with autovacuum_naptime=1s. I'll go check how feasible that is. (By placing it there we could also verify that the index does indeed contain the index entries we expect, since it has pageinspect available.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: