Re: Autovacuum not keeping up. (PG 9.2.9) - Mailing list pgsql-admin
From | jesper@krogh.cc |
---|---|
Subject | Re: Autovacuum not keeping up. (PG 9.2.9) |
Date | |
Msg-id | 716ad55e559db5d9802de179677d2fb0.squirrel@shrek.krogh.cc Whole thread Raw |
In response to | Re: Autovacuum not keeping up. (PG 9.2.9) (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: Autovacuum not keeping up. (PG 9.2.9)
Re: Autovacuum not keeping up. (PG 9.2.9) |
List | pgsql-admin |
> jesper@krogh.cc wrote: >> Hi. >> >> I have a large database with a message queue table, that has high >> activity. The database supports 1-300 client connection concurrently, >> having transactions open in up to 30 minutes each. >> >> Recently I am seeing autuvacuum being issued, but it takes >> ages to get through the message queue table, with strace showing waiting >> for semop's for 10's to 100's of seconds. > > Do you have data on how relfrozenxid advances for that table? Not really, how would you normally pick that out? 2014-08-01 10:49:39.171 db=# select relname, age(relfrozenxid), relfrozenxid FROM pg_class WHERE relkind = 'r' and relname = 'job'; relname | age | relfrozenxid ---------+-----------+-------------- job | 111893622 | 796259097 (1 row) Time: 1.913 ms This shouldn't qualify for a freeze vacuum, should it? > Vacuuming needs to grab a "cleanup lock" on each page it's going to > vacuum, which is a special kind of lock that requires that no other > process is even looking at the page at that moment (we call this "to > have the page pinned"), which is even weaker than having a shared lock > on the page. If traffic to some pages is high, it might be difficult > for vacuum to acquire this. Based on strace output this looks excactly like whats going on, there are some activity, then it pauses for some time then continues. > Normally, vacuum doesn't break much sweat about this: if it cannot > acquire the cleanup lock, it ignores the page, keeps calm and carries > on. But if it's a for-wraparound vacuuming, it will need to wait until > it is able to acquire cleanup lock. Can I force it to tell me if it does the for-wraparound cleanup or normal? > I think one idea might be to try to manually vacuum the table once in a > while with a reduced value of min_freeze_table_age. This will cause a > full table scan (i.e. cleanup lock for all pages is waited for), which > decreases the "frozen xid age", which moves the need to do this again > further in the future; so the autovacuum-invoked vacuums will be able to > skip the pages on which it cannot get cleanup lock. > > Another idea is to increase min_freeze_table_age for the queue table > through ALTER TABLE, the idea being that you can delay forced vacuuming > of hot pages for long enough that they can wait until they have cooled > off. Default value is 150 million transactions, which you can raise > tenfold and even higher. > > See > http://www.postgresql.org/docs/9.2/static/runtime-config-client.html#GUC-VACUUM-FREEZE-TABLE-AGE I tried to read through that, but that is a GUC not a storage level parameter. So I (hopefully correct) figured out that the storage level parameter that need to be set were autovacuum_freeze_min_age and set that one to 1.000.000.000 for the table. But apparently it didnt cause any changes. It is still waiting for the lock. Is a database restart required after setting storage parameters, or will autovacuum pick up the new one when it starts over with the table. > The other idea is that heap truncation is what's causing the problem, > but AFAICS that uses conditional lock acquisition so you shouldn't be > seeing stalls in semop(). That should only be once per vacuuming .. right? That doesn't fit the pattern either. -- Jesper
pgsql-admin by date: