Re: autovacuum not prioritising for-wraparound tables - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: autovacuum not prioritising for-wraparound tables |
Date | |
Msg-id | CA+TgmoZVaFnV83v=AjT1-=TeNQtnPoEWNwM3ydxt4+bts2=x2A@mail.gmail.com |
In response to | Re: autovacuum not prioritising for-wraparound tables (Andres Freund <andres@2ndquadrant.com>) |
Responses | Re: autovacuum not prioritising for-wraparound tables |
List | pgsql-hackers |
On Sat, Feb 2, 2013 at 8:41 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> - It's probably important to have a formula where we can be sure that
>> the wrap-around term will eventually dominate the dead-tuple term,
>> with enough time to spare to make sure nothing really bad happens; on
>> the other hand, it's also desirable to avoid the case where a table
>> that has just crossed the threshold for wraparound vacuuming doesn't
>> immediately shoot to the top of the list even if it isn't truly
>> urgent. It's unclear to me just from looking at this formula how well
>> the second term meets those goals.
>
> I just wanted to mention that if everything goes well, we won't *ever*
> get to an anti-wraparound-vacuum. Normally the table should cross the
> vacuum_table_age barrier earlier and promote a normal vacuum to a
> full-table vacuum which will set relfrozenxid to a new and lower value
> and thus prevent anti-wraparound vacuums from occurring.
> So priorizing anti-wraparound vacuums immediately and heavily doesn't
> seem to be too bad.

IMHO, this is hopelessly optimistic. Yes, it's intended to work that
way. But INSERT-only or INSERT-mostly tables are far from an uncommon
use case; and in fact they're probably the most common cause of pain
in this area. You insert a gajillion tuples, and vacuum never kicks
off, and then eventually you either update some tuples or hit
autovacuum_freeze_max_age and suddenly, BAM, you get this gigantic
vacuum that rewrites the entire table. And then you open a support
ticket with your preferred PostgreSQL support provider and say
something like "WTF?".

>> - More generally, it seems to me that we ought to be trying to think
>> about the units in which these various quantities are measured. Each
>> term ought to be unit-less. So perhaps the first term ought to divide
>> dead tuples by total tuples, which has the nice property that the
>> result is a dimensionless quantity that never exceeds 1.0. Then the
>> second term can be scaled somehow based on that value.
>
> I think we also need to be careful to not try to get too elaborate on
> this end. Once the general code for priorization is in, the exact
> priorization formula can be easily incrementally tweaked. Just about any
> half-way sensible priorization is better than what we have right now and
> we might discover new effects once we do marginally better.

I agree. It would be nice to have some way of measuring the positive
or negative impact of what we introduce, too, but I don't have a good
idea what that would be.

> Imo the browne_strength field should be called 'priority' and the
> priorization calculation formula should be moved into an extra
> function.

Yeah, or maybe vacuum_priority, since that would be easier to grep for.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
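
For illustration only, here is a minimal sketch of the kind of two-term, unit-less priority discussed in the thread. It is not the formula from the patch under review. The function name follows the vacuum_priority suggestion above; the linear ramp, the wrap_scale factor of 4.0, and the hard_limit value are assumptions chosen for this example. The first term is dead tuples divided by total tuples, capped at 1.0. The second term is zero until the table's XID age crosses freeze_max_age, then grows linearly, so a table that has just crossed the threshold ranks low while an old one eventually outranks any dead-tuple term.

```python
def vacuum_priority(dead_tuples, total_tuples, xid_age,
                    freeze_max_age=200_000_000,
                    hard_limit=2**31,
                    wrap_scale=4.0):
    """Hypothetical unit-less priority; larger means vacuum sooner.

    hard_limit and wrap_scale are illustrative assumptions, not values
    taken from PostgreSQL or from the patch discussed in this thread.
    """
    # Dead-tuple term: a dimensionless fraction capped at 1.0.
    if total_tuples > 0:
        dead_term = min(dead_tuples / total_tuples, 1.0)
    else:
        dead_term = 0.0

    # Wraparound term: zero below the threshold, then a linear ramp
    # from 0 to wrap_scale as xid_age approaches hard_limit.
    if xid_age <= freeze_max_age:
        wrap_term = 0.0
    else:
        progress = (xid_age - freeze_max_age) / (hard_limit - freeze_max_age)
        wrap_term = wrap_scale * min(progress, 1.0)

    return dead_term + wrap_term


if __name__ == "__main__":
    # Half-dead table, far from wraparound.
    print(vacuum_priority(500, 1000, 1_000_000))
    # Clean table that has just crossed the threshold.
    print(vacuum_priority(0, 1000, 200_000_001))
    # Clean INSERT-only table that is close to the limit.
    print(vacuum_priority(0, 1000, 2_000_000_000))
```

With wrap_scale set above 1.0, the wraparound term exceeds the largest possible dead-tuple term once the table is a quarter of the way from the threshold to the limit, which is one way to get "time to spare". Whether a linear ramp is the right shape is exactly the sort of thing that could be tweaked incrementally.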