Re: stats collector suddenly causing lots of IO - Mailing list pgsql-performance
From: Scott Carey
Subject: Re: stats collector suddenly causing lots of IO
Date:
Msg-id: C449E205-8B0B-4024-B647-4CC6F2C46CA4@richrelevance.com
In response to: Re: stats collector suddenly causing lots of IO (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: stats collector suddenly causing lots of IO
List: pgsql-performance
On Apr 16, 2010, at 9:48 AM, Tom Lane wrote:

> Josh Kupershmidt <schmiddy@gmail.com> writes:
>> On Fri, Apr 16, 2010 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Wow. Well, we have a smoking gun here: for some reason, autovacuum
>>> isn't running, or isn't doing its job if it is. If it's not running
>>> at all, that would explain failure to prune the stats collector's file
>>> too.
>
>> Hrm, well autovacuum is at least trying to do work: it's currently
>> stuck on those bloated pg_catalog tables, of course. Another developer
>> killed an autovacuum of pg_attribute (or maybe it was pg_attrdef)
>> after it had been running for two weeks. See current pg_stat_activity
>> output attached, which shows the three autovacuum workers running plus
>> two manual VACUUM ANALYZEs I started yesterday.
>
> Two weeks? What have you got the autovacuum cost delays set to?
>
> Once you're up to three AV workers, no new ones can get launched until
> one of those finishes or is killed. So that would explain failure to
> prune the stats collector's tables (the tabpurge code is only run during
> AV worker launch). So what we need to figure out is why it's taking so
> obscenely long to vacuum these tables ...
>

On any large system with good I/O I have had to significantly increase the
aggressiveness of autovacuum. Even with the settings below, it doesn't
interfere with other activity (the I/O subsystem is capable of roughly
2200 random IOPS and 900 MB/sec sequential).

My relevant autovacuum parameters are (from 'show all'):

 autovacuum                      | on        | Starts the autovacuum subprocess.
 autovacuum_analyze_scale_factor | 0.1       | Number of tuple inserts, updates or deletes prior to analyze as a fraction of reltuples.
 autovacuum_analyze_threshold    | 50        | Minimum number of tuple inserts, updates or deletes prior to analyze.
 autovacuum_freeze_max_age       | 200000000 | Age at which to autovacuum a table to prevent transaction ID wraparound.
 autovacuum_max_workers          | 3         | Sets the maximum number of simultaneously running autovacuum worker processes.
 autovacuum_naptime              | 1min      | Time to sleep between autovacuum runs.
 autovacuum_vacuum_cost_delay    | 20ms      | Vacuum cost delay in milliseconds, for autovacuum.
 autovacuum_vacuum_cost_limit    | 2000      | Vacuum cost amount available before napping, for autovacuum.
 autovacuum_vacuum_scale_factor  | 0.2       | Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.
 autovacuum_vacuum_threshold     | 50        |

For what it is worth, I just went onto one of my systems -- one with lots of
partition tables and temp table creation/destruction -- and looked at the
system tables in question there. Postgres 8.4, using \dt+ (result trimmed
below to the interesting tables):

   Schema   |     Name      | Type  |  Owner   |  Size   | Description
 -----------+---------------+-------+----------+---------+-------------
 pg_catalog | pg_attrdef    | table | postgres | 195 MB  |
 pg_catalog | pg_attribute  | table | postgres | 1447 MB |
 pg_catalog | pg_class      | table | postgres | 1694 MB |
 pg_catalog | pg_constraint | table | postgres | 118 MB  |
 pg_catalog | pg_depend     | table | postgres | 195 MB  |
 pg_catalog | pg_statistic  | table | postgres | 2300 MB |
 pg_catalog | pg_type       | table | postgres | 181 MB  |

So, I did a vacuum full; reindex table; analyze sequence on each of these.
I wish I could just CLUSTER them, but the above works.
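Spelled out for one of those tables, the sequence is roughly the following
(a sketch; run as a superuser, and the table name is just an example -- on
8.4, VACUUM FULL compacts the table in place and tends to bloat its indexes,
which is why the REINDEX comes after it):

  -- Compact one bloated catalog table, rebuild its indexes, refresh stats.
  -- VACUUM FULL takes an exclusive lock on the table while it runs, so
  -- schedule this for a quiet window.
  VACUUM FULL pg_catalog.pg_attribute;
  REINDEX TABLE pg_catalog.pg_attribute;
  ANALYZE pg_catalog.pg_attribute;

Repeat for each of the bloated catalogs (pg_attrdef, pg_statistic, pg_type, etc.).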
Now the tables are:

   Schema   |     Name      | Type  |  Owner   |  Size   | Description
 -----------+---------------+-------+----------+---------+-------------
 pg_catalog | pg_attrdef    | table | postgres | 44 MB   |
 pg_catalog | pg_attribute  | table | postgres | 364 MB  |
 pg_catalog | pg_class      | table | postgres | 1694 MB |
 pg_catalog | pg_constraint | table | postgres | 118 MB  |
 pg_catalog | pg_depend     | table | postgres | 195 MB  |
 pg_catalog | pg_statistic  | table | postgres | 656 MB  |
 pg_catalog | pg_type       | table | postgres | 45 MB   |

I've learned to accept about 50% bloat (2x the compacted size) in postgres as
just the way it usually is on a busy table, but the 3x and 4x bloat of
statistic, attrdef, and attribute have me wondering.

I have had some 'idle in transaction' connections hanging out from time to
time that have caused issues on this machine, and that could explain the
above perma-bloat. That is one thing that could affect the case reported here
as well. The worst thing about those is that you can't even force-kill those
connections from within postgres (pg_cancel_backend doesn't work on them, and
killing them via the OS bounces postgres ...), so you have to hunt down the
offending client -- see the pg_stat_activity sketch below the quoted footer.

> regards, tom lane
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
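On the 'idle in transaction' point above: a quick way to spot such sessions
is to query pg_stat_activity. A minimal sketch, assuming 8.4's column names
(9.2 and later rename procpid/current_query to pid/query and add a separate
state column):

  -- List backends sitting 'idle in transaction', oldest transaction first,
  -- so the offending client can be hunted down.
  SELECT procpid, usename, client_addr,
         now() - xact_start  AS xact_age,
         now() - query_start AS idle_since_last_query
  FROM pg_stat_activity
  WHERE current_query = '<IDLE> in transaction'
  ORDER BY xact_start;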