Re: Fwd: PG12 autovac issues - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: Fwd: PG12 autovac issues |
Date | |
Msg-id | dc39a770-4db9-aec2-e54c-88e9b916973f@aklaver.com Whole thread Raw |
In response to | Fwd: PG12 autovac issues (Justin King <kingpin867@gmail.com>) |
Responses |
Re: Fwd: PG12 autovac issues
|
List | pgsql-general |
On 3/17/20 3:22 PM, Justin King wrote: > Apologies, I accidentally sent this to the pgsql-admin list initially > but intended it go here: > > We have a database that isn't overly large (~20G), but gets incredibly > frequent updates. Here's an example table: > > feedi=# select * from pg_stat_all_tables where schemaname = > 'production' and relname = 'tita'; > relid = 16786 > schemaname = production > relname = tita > seq_scan = 23365 > seq_tup_read = 403862091 > idx_scan = 26612759248 > idx_tup_fetch = 19415752701 > n_tup_ins = 24608806 > n_tup_upd = 4207076934 > n_tup_del = 24566916 > n_tup_hot_upd = 4073821079 > n_live_tup = 79942 > n_dead_tup = 71969 > n_mod_since_analyze = 12020 > last_vacuum = 2020-03-17 15:35:19.588859+00 > last_autovacuum = 2020-03-17 21:31:08.248598+00 > last_analyze = 2020-03-17 15:35:20.372875+00 > last_autoanalyze = 2020-03-17 22:04:41.76743+00 > vacuum_count = 9 > autovacuum_count = 135693 > analyze_count = 9 > autoanalyze_count = 495877 > > As you can see in this table, there are only ~80K rows, but billions > of updates. What we have observed is that the frozenxid reaches the > 200M mark fairly quickly because of the amount of activity. What is > interesting is that this happens with the 'postgres' and 'template1' > databases as well and there is absolutely no activity in those > databases. > > When the 'postgres' and/or 'template1' databases hit the > freeze_max_age, there are cases where it kicks off an aggressive > autovac of those tables which seems to prevent autovacs from running > elsewhere. Oddly, this is not consistent, but that condition seems to > be required. We have observed this across multiple PG12 servers (dev, > test, staging, production) all with similar workloads. Is there anything in postgres and template1 besides what was created at init? What are your settings for autovacuum?: https://www.postgresql.org/docs/12/runtime-config-autovacuum.html > > $ grep -i vacuum /var/log/postgresql/postgres.log | cut -b 1-9 | uniq -c > 17 Mar 17 06 > 34 Mar 17 07 > 31 Mar 17 08 > 31 Mar 17 09 > 30 Mar 17 10 > 34 Mar 17 11 > 33 Mar 17 12 > 19 Mar 17 13 > 40 Mar 17 15 > 31 Mar 17 16 > 36 Mar 17 17 > 34 Mar 17 18 > 35 Mar 17 19 > 35 Mar 17 20 > 33 Mar 17 21 > > As you can see above, we end up having around ~33 autovac/hr, and > about 13:30 today, they stopped until we ran a "vacuum freeze verbose > analyze;" against the 'postgres' database (around 15:30) which then > caused the autovacs to resume running against the "feedi" database. > > I'm completely perplexed as to what is happening and why it suddenly > started when we moved from PG10 > PG12. The configs and workload are > essentially the same between versions. We realize we could simply > increase the autovacuum_freeze_max_age, but that doesn't seem to > actually resolve anything -- it just pushes the problem out. Has > anyone seen anything similar to this? > > Thanks very much for the consideration. > > Justin King > http://flightaware.com/ > > -- Adrian Klaver adrian.klaver@aklaver.com
pgsql-general by date: