Re: [HACKERS] More stats about skipped vacuums - Mailing list pgsql-hackers
From: Kyotaro HORIGUCHI
Subject: Re: [HACKERS] More stats about skipped vacuums
Msg-id: 20171121.160957.124460918.horiguchi.kyotaro@lab.ntt.co.jp
In response to: Re: [HACKERS] More stats about skipped vacuums (Michael Paquier <michael.paquier@gmail.com>)
Responses: Re: [HACKERS] More stats about skipped vacuums
           Re: [HACKERS] More stats about skipped vacuums
List: pgsql-hackers
Thank you for the comments.

At Sat, 18 Nov 2017 22:23:20 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqQV1Emkj=5VFzui250T6v+xcpRQ2RfHu_oQMbdXnZw3mA@mail.gmail.com>
> On Thu, Nov 16, 2017 at 7:34 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > At Wed, 15 Nov 2017 16:13:01 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqQm_WCKuUf5RD0CzeMuMO907ZPKP7mBh-3t2zSJ9jn+PA@mail.gmail.com>
> >> Please use spaces instead of tabs. Indentation is not consistent.
> >
> > Done. Thank you for pointing. (whitespace-mode showed me some
> > similar inconsistencies at the other places in the file...)
>
> Yes, I am aware of those which get introduced here and there. Let's
> not make things worse..

Yeah, I agree with that.

> >> +            case PGSTAT_VACUUM_CANCELED:
> >> +                phase = tabentry->vacuum_last_phase;
> >> +                /* number of elements of phasestr above */
> >> +                if (phase >= 0 && phase <= 7)
> >> +                    result = psprintf("%s while %s",
> >> +                                      status == PGSTAT_VACUUM_CANCELED ?
> >> +                                      "canceled" : "error",
> >> +                                      phasestr[phase]);
> >> Such complication is not necessary. The phase parameter is updated by
> >> individual calls of pgstat_progress_update_param(), so the information
> >> showed here overlaps with the existing information in the "phase"
> >> field.
> >
> > The "phase" is pg_stat_progress_vacuum's? If "complexy" means
> > phasestr[phase], the "phase" cannot be overlap with
> > last_vacuum_status since pg_stat_progress_vacuum's entry has
> > already gone when someone looks into pg_stat_all_tables and see a
> > failed vacuum status. Could you give a bit specific comment?
>
> I mean that if you tend to report this information, you should just
> use a separate column for it. Having a single column report two
> informations, which are here the type of error and potentially the
> moment where it appeared are harder to parse.

Thanks for the explanation. OK, now "last_vacuum_status" just shows how
the last vacuum or autovacuum finished: "completed", "error", "canceled"
or "skipped". "last_vacuum_status_detail" shows the phase at exit for
"error" or "canceled". They are still in a somewhat complex relationship
(pgstatfuncs.c). "error" and "canceled" could be unified since the error
code is already shown in the log.

 last_vac_status | last_vac_stat_detail
 ================+=======================
 "completed"     | (null)/"aggressive"/"full" + "partially truncated" +
                 | "not a target"
 "skipped"       | "lock failure"
 "error"         | <errcode> + <phase>
 "canceled"      | <phase>
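To illustrate how the two columns are meant to be read together, a query
along these lines works against pg_stat_all_tables once the attached 0003
patch is applied. This is only a sketch; the column names below come from
that patch and do not exist in an unpatched server:

    SELECT relname,
           last_vacuum_status,          -- "completed", "error", "canceled" or "skipped"
           last_vacuum_status_detail    -- reason or phase, as in the table above
      FROM pg_stat_all_tables
     WHERE last_vacuum_status IN ('error', 'canceled', 'skipped')
     ORDER BY relname;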
> >> However, progress reports are here to allow users to do decisions
> >> based on the activity of how things are working. This patch proposes
> >> to add multiple new fields:
> >> - oldest Xmin.
> >> - number of index scans.
> >> - number of pages truncated.
> >> - number of pages that should have been truncated, but are not truncated.
> >> Among all this information, as Sawada-san has already mentioned
> >> upthread, the more index scans the less dead tuples you can store at
> >> once, so autovacuum_work_mem ought to be increases. This is useful for
> >> tuning and should be documented properly if reported to give
> >> indications about vacuum behavior. The rest though, could indicate how
> >> aggressive autovacuum is able to remove tail blocks and do its work.
> >> But what really matters for users to decide if autovacuum should be
> >> more aggressive is tracking the number of dead tuples, something which
> >> is already evaluated.
> >
> > Hmm. I tend to agree. Such numbers are better to be shown as
> > average of the last n vacuums or maximum. I decided to show
> > last_vacuum_index_scan only and I think that someone can record
> > it continuously to elsewhere if wants.
>
> As a user, what would you make of those numbers? How would they help
> in tuning autovacuum for a relation? We need to clear up those
> questions before thinking if there are cases where those are useful.

Ah, I found what you meant. The criteria for choosing the numbers in the
previous patch was just what is not logged, and usable to find whether
something wrong is happening with vacuum. So the # of index scans was not
in the list. The objective here is to find the health of vacuum just by
looking into the stats views.

vacuum_required:
  Apparently cannot be logged, and it is not so easy to calculate.

last_vacuum_index_scans:
  It is shown in the log, but I agree that it is usable to find that
  maintenance_work_mem is too small.

last_vacuum_status:
  It is logged, but users are likely to examine it only after something
  bad has happened. "completed" in this column immediately shows that
  vacuum on the table is working perfectly.

last_vacuum_status_detail:
  The cause of a cancel or skip is not logged, and it is always hard to
  find out what is wrong. This narrows the area for users and/or support
  to investigate.

autovacuum_fail_count:
  When vacuum has not executed for a long time, users cannot tell whether
  vacuum is not required at all or vacuum trials have been
  skipped/canceled. This makes the distinction between the two cases.

last_vacuum_untruncated:
  This is not shown in a log entry. Users can find that trailing empty
  pages are left untruncated.

last_vacuum_truncated:
  This is shown in the log. It is here just to be compared with the
  untruncated count, since # untruncated alone has no meaning. Or,
  conversely, one can find that relations are *unwantedly* truncated (as
  my understanding of the suggestion from Alvaro).

last_vacuum_oldest_xmin:
  A problem that happens very frequently is table bloat caused by long
  transactions.
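Putting the columns above together, a rough health-check query could look
like the following. This is only a sketch and assumes all four attached
patches are applied: vacuum_required is added by 0002, vacuum_fail_count
(the view-level name of the fail counter) and last_vacuum_oldest_xmin by
0003, and last_vacuum_untruncated by 0004; none of them exist in an
unpatched server, and the status strings tested in the WHERE clause are
the ones used in 0002.

    SELECT relname,
           vacuum_required,                          -- 0002: the server's own judgment
           vacuum_fail_count,                        -- 0003: consecutive failed/canceled/skipped runs
           age(last_vacuum_oldest_xmin) AS xmin_age, -- 0003: large values hint at long transactions
           last_vacuum_untruncated                   -- 0004: tail pages left untruncated
      FROM pg_stat_all_tables
     WHERE vacuum_fail_count > 0
        OR vacuum_required NOT LIKE 'not%'           -- "required", "partial", "aggressive", ...
     ORDER BY vacuum_fail_count DESC, xmin_age DESC;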
> >> Tracking the number of failed vacuum attempts is also something
> >> helpful to understand how much the job is able to complete. As there
> >> is already tracking vacuum jobs that have completed, it could be
> >> possible, instead of logging activity when a vacuum job has failed, to
> >> track the number of *begun* jobs on a relation. Then it is possible to
> >> guess how many have failed by taking the difference between those that
> >> completed properly. Having counters per failure types could also be a
> >> possibility.
> >
> > Maybe pg_stat_all_tables is not the place to hold such many kinds
> > of vacuum specific information. pg_stat_vacuum_all_tables or
> > something like?
>
> What do you have in mind? pg_stat_all_tables already includes counters

Nothing specific in my mind.

> about the number of vacuums and analyze runs completed. I guess that
> the number of failures, and the types of failures ought to be similar
> counters at the same level.

Yes, my concern here is how many columns we can allow in a stats view. I
think I'm a bit too worried about that.

> >> For this commit fest, I would suggest a patch that simply adds
> >> tracking for the number of index scans done, with documentation to
> >> give recommendations about parameter tuning. i am switching the patch
> >> as "waiting on author".
> >
> > Ok, the patch has been split into the following four parts. (Not
> > split by function, but by the kind of information to add.)
> > The first one is that.
> >
> > 0001. Adds pg_stat_all_tables.last_vacuum_index_scans. Documentation
> >       is added.
> >
> > 0002. Adds pg_stat_all_tables.vacuum_required. And primitive
> >       documentation.
> >
> > 0003. Adds pg_stat_all_tables.last_vacuum_status/autovacuum_fail_count
> >       plus primitive documentation.
> >
> > 0004. truncation information stuff.
> >
> > One concern on pg_stat_all_tables view is the number of
> > predefined functions it uses. Currently 20 functions and this
> > patch adds more seven. I feel it's better that at least the
> > functions this patch adds are merged into one function..
>
> For the scope of this commit fest, why not focusing only on 0001 with
> the time that remains? This at least is something I am sure will be
> useful.

Yeah, so I separated the 0001 patch, but it was not my main intention in
this thread. That was 0002 and 0003, so I'd like to *show* them along
with 0001, and focusing on 0001 for this commit fest is fine with me.

> <para>
> +    Vacuuming scans all index pages to remove index entries that pointed
> +    the removed tuples. In order to finish vacuuming by as few index
> +    scans as possible, the removed tuples are remembered in working
> +    memory. If this setting is not large enough, vacuuming runs
> +    additional index scans to vacate the memory and it might cause a
> +    performance problem. That behavior can be monitored
> +    in <xref linkend="pg-stat-all-tables-view">.
> +   </para>
> Why not making that the third paragraph, after autovacuum_work_mem has
> been mentioned for the first time? This could be reworded as well.

That was just to place it after the Note in the last paragraph. The Note
mentions multiplication of autovacuum_work_mem, not the GUC itself.
Anyway, I swapped them in this version.

> Short idea:
> Vacuum scans all index pages to remove index entries that pointed to
> dead tuples. Finishing vacuum with a minimal number of index scans
> reduces the time it takes to complete it, and a new scan is triggered
> once the in-memory storage for dead tuple pointers gets full, whose
> size is defined by autovacuum_work_mem. So increasing this parameter
> can make the operation finish more quickly. This can be monitored with
> pg_stat_all_tables.

I thought that it *must* be reworded anyway (because of my poor wording).
Thanks for rewording. I find this perfect.

>              pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
> +            pg_stat_get_last_vacuum_index_scans(C.oid) AS last_vacuum_index_scans,
>              pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count,
> Counters with counters, and last vacuum info with last vacuum info, no?

Moved it to above vacuum_count.

By the way, I'm uneasy that 'last_vacuum_index_scans' (and
vacuum_fail_count in 0002 and others in 0003, 0004) mentions both the
VACUUM command and autovacuum, while last_vacuum and vacuum_count mention
only the command. Splitting it into vacuum/autovacuum seems nonsense, but
the name is confusing. Do you have any idea?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From af124a675637c44781ff84a979e6d9d0afb1e8d4 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 21 Nov 2017 10:47:52 +0900
Subject: [PATCH 4/4] Add truncation information to pg_stat_all_tables

This patch adds truncated and tried-but-not-truncated pages in the last
vacuum. This is intended to be used to find uncertain failures of
truncation or unwanted aggressive truncation.
---doc/src/sgml/monitoring.sgml | 10 ++++++++++src/backend/catalog/system_views.sql | 2 ++src/backend/commands/vacuum.c | 14 ++++++++------src/backend/commands/vacuumlazy.c | 13 ++++++++++++-src/backend/postmaster/pgstat.c | 10 +++++++++-src/backend/utils/adt/pgstatfuncs.c | 32 +++++++++++++++++++++++++++++++-src/include/catalog/pg_proc.h | 4 ++++src/include/pgstat.h | 6 ++++++src/test/regress/expected/rules.out | 6 ++++++9 files changed, 88 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index a0288cb..fd0507a 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -2585,6 +2585,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i <entry>Oldestxmin used by the last vacuum on this table</entry> </row> <row> + <entry><structfield>last_vacuum_truncated</structfield></entry> + <entry><type>bigint</type></entry> + <entry>Number actually truncated pages during the last vacuum on this table</entry> + </row> + <row> + <entry><structfield>last_vacuum_untruncated</structfield></entry> + <entry><type>bigint</type></entry> + <entry>Number tried but actually not truncated pages during the last vacuum on this table</entry> + </row> + <row> <entry><structfield>last_vacuum_status</structfield></entry> <entry><type>text</type></entry> <entry>Statusof the last vacuum or autovacuum.</entry> diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index c69fea9..528e9c5 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -529,6 +529,8 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_last_analyze_time(C.oid) as last_analyze, pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze, pg_stat_get_last_vacuum_index_scans(C.oid)AS last_vacuum_index_scans, + pg_stat_get_last_vacuum_truncated(C.oid) AS last_vacuum_truncated, + pg_stat_get_last_vacuum_untruncated(C.oid) AS last_vacuum_untruncated, pg_stat_get_last_vacuum_oldest_xmin(C.oid)AS last_vacuum_oldest_xmin, pg_stat_get_last_vacuum_status(C.oid) ASlast_vacuum_status, pg_stat_get_last_vacuum_status_detail(C.oid) AS last_vacuum_status_detail, diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index cf0bca7..cf754f9 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1467,7 +1467,8 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) if (!onerel) { - pgstat_report_vacuum(relid, false, 0, 0, 0, InvalidTransactionId, + pgstat_report_vacuum(relid, false, + 0, 0, 0, 0, 0,InvalidTransactionId, PGSTAT_VACUUM_SKIPPED, 0, PGSTAT_VACUUM_LOCK_FAILED); PopActiveSnapshot(); @@ -1504,7 +1505,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) pgstat_report_vacuum(RelationGetRelid(onerel), onerel->rd_rel->relisshared, - 0, 0, 0, InvalidTransactionId, + 0, 0, 0, 0, 0, InvalidTransactionId, PGSTAT_VACUUM_FINISHED, 0, PGSTAT_VACUUM_NONTARGET); @@ -1528,7 +1529,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) pgstat_report_vacuum(RelationGetRelid(onerel), onerel->rd_rel->relisshared, - 0, 0, 0, InvalidTransactionId, + 0, 0, 0, 0, 0, InvalidTransactionId, PGSTAT_VACUUM_FINISHED, 0, PGSTAT_VACUUM_NONTARGET); @@ -1550,7 +1551,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) pgstat_report_vacuum(RelationGetRelid(onerel), onerel->rd_rel->relisshared, - 0, 0, 0, InvalidTransactionId, + 0, 0, 0, 0, 0, InvalidTransactionId, 
PGSTAT_VACUUM_FINISHED, 0, PGSTAT_VACUUM_NONTARGET); @@ -1570,7 +1571,7 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) pgstat_report_vacuum(RelationGetRelid(onerel), onerel->rd_rel->relisshared, - 0, 0, 0, InvalidTransactionId, + 0, 0, 0, 0, 0, InvalidTransactionId, PGSTAT_VACUUM_FINISHED, 0, PGSTAT_VACUUM_NONTARGET); @@ -1628,7 +1629,8 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) /* VACUUM FULLis now a variant of CLUSTER; see cluster.c */ cluster_rel(relid, InvalidOid, false, (options& VACOPT_VERBOSE) != 0); - pgstat_report_vacuum(relid, isshared, 0, 0, 0, InvalidTransactionId, + pgstat_report_vacuum(relid, isshared, 0, 0, 0, 0, 0, + InvalidTransactionId, PGSTAT_VACUUM_FINISHED, 0, PGSTAT_VACUUM_FULL); } diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index c53b4fa..53821f3 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -121,6 +121,7 @@ typedef struct LVRelStats double new_rel_tuples; /* new estimated total # of tuples */ double new_dead_tuples; /* new estimated total # of dead tuples */ BlockNumber pages_removed; + BlockNumber pages_not_removed; double tuples_deleted; BlockNumber nonempty_pages; /* actually, last nonemptypage + 1 */ /* List of TIDs of tuples we intend to delete */ @@ -248,6 +249,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, vacrelstats->old_rel_tuples =onerel->rd_rel->reltuples; vacrelstats->num_index_scans = 0; vacrelstats->pages_removed = 0; + vacrelstats->pages_not_removed = 0; vacrelstats->lock_waiter_detected = false; vacrelstats->aggressive = aggressive; @@ -290,8 +292,13 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, { lazy_truncate_heap(onerel,vacrelstats); + /* just paranoia */ + if (vacrelstats->rel_pages >= vacrelstats->nonempty_pages) + vacrelstats->pages_not_removed += + vacrelstats->rel_pages - vacrelstats->nonempty_pages; + /* check if all empty pages are truncated */ - if (vacrelstats->rel_pages > vacrelstats->nonempty_pages) + if (vacrelstats->pages_not_removed > 0) vacuum_status_details |= PGSTAT_VACUUM_PARTIALLY_TRUNCATED; } @@ -363,6 +370,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, onerel->rd_rel->relisshared, new_live_tuples, vacrelstats->new_dead_tuples, + vacrelstats->pages_removed, + vacrelstats->pages_not_removed, vacrelstats->num_index_scans, OldestXmin, PGSTAT_VACUUM_FINISHED, 0, @@ -2283,6 +2292,8 @@ lazy_vacuum_cancel_handler(void) stats->shared, stats->new_rel_tuples, stats->new_dead_tuples, + stats->pages_removed, + stats->pages_not_removed, stats->num_index_scans, OldestXmin, status, phase, details); diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 3e1d051..a4a6169 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -1404,8 +1404,10 @@ pgstat_report_autovac(Oid dboid)voidpgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter livetuples, PgStat_Counter deadtuples, + PgStat_Counter pages_removed, + PgStat_Counter pages_not_removed, PgStat_Counter num_index_scans, - TransactionId oldestxmin, + TransactionId oldestxmin, PgStat_Counter status, PgStat_Counter last_phase, PgStat_Counter details){ @@ -1421,6 +1423,8 @@ pgstat_report_vacuum(Oid tableoid, bool shared, msg.m_vacuumtime = GetCurrentTimestamp(); msg.m_live_tuples= livetuples; msg.m_dead_tuples = deadtuples; + msg.m_pages_removed = pages_removed; + msg.m_pages_not_removed = 
pages_not_removed; msg.m_num_index_scans = num_index_scans; msg.m_oldest_xmin = oldestxmin; msg.m_vacuum_status = status; @@ -4594,6 +4598,8 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create) result->n_live_tuples= 0; result->n_dead_tuples = 0; result->changes_since_analyze = 0; + result->n_pages_removed = 0; + result->n_pages_not_removed = 0; result->n_index_scans = 0; result->oldest_xmin = InvalidTransactionId; result->blocks_fetched = 0; @@ -6009,6 +6015,8 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len) case PGSTAT_VACUUM_FINISHED: tabentry->n_live_tuples= msg->m_live_tuples; tabentry->n_dead_tuples = msg->m_dead_tuples; + tabentry->n_pages_removed = msg->m_pages_removed; + tabentry->n_pages_not_removed = msg->m_pages_not_removed; tabentry->n_index_scans = msg->m_num_index_scans; tabentry->oldest_xmin = msg->m_oldest_xmin; tabentry->vacuum_failcount = 0; diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index 0fba265..b32bdf5 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -204,6 +204,36 @@ pg_stat_get_vacuum_necessity(PG_FUNCTION_ARGS)}Datum +pg_stat_get_last_vacuum_truncated(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + int64 result; + PgStat_StatTabEntry *tabentry; + + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) + result = 0; + else + result = (int64) (tabentry->n_pages_removed); + + PG_RETURN_INT64(result); +} + +Datum +pg_stat_get_last_vacuum_untruncated(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + int64 result; + PgStat_StatTabEntry *tabentry; + + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) + result = 0; + else + result = (int64) (tabentry->n_pages_not_removed); + + PG_RETURN_INT64(result); +} + +Datumpg_stat_get_last_vacuum_index_scans(PG_FUNCTION_ARGS){ Oid relid = PG_GETARG_OID(0); @@ -349,7 +379,7 @@ pg_stat_get_last_vacuum_status_detail(PG_FUNCTION_ARGS) break; default: - result = "unknwon error"; + result = "unknown status"; break; } } diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 48e6942..da2e9b4 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -2889,6 +2889,10 @@ DATA(insert OID = 6118 ( pg_stat_get_subscription PGNSP PGUID 12 1 0 0 0 f f fDESCR("statistics:information about subscription");DATA(insert OID = 2579 ( pg_stat_get_vacuum_necessity PGNSP PGUID12 1 0 0 0 f f f f t f s r 1 0 25 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_vacuum_necessity _null_ _null__null_ ));DESCR("statistics: true if needs vacuum"); +DATA(insert OID = 3422 ( pg_stat_get_last_vacuum_untruncated PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null__null_ _null_ _null_ _null_ pg_stat_get_last_vacuum_untruncated _null_ _null_ _null_ )); +DESCR("statistics: pages left untruncated in the last vacuum"); +DATA(insert OID = 3423 ( pg_stat_get_last_vacuum_truncated PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null__null_ _null_ _null_ _null_ pg_stat_get_last_vacuum_truncated _null_ _null_ _null_ )); +DESCR("statistics: pages truncated in the last vacuum");DATA(insert OID = 3424 ( pg_stat_get_last_vacuum_oldest_xmin PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 28 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_last_vacuum_oldest_xmin_null_ _null_ _null_ ));DESCR("statistics: The oldest xmin used in the last vacuum");DATA(insertOID = 3281 ( pg_stat_get_last_vacuum_index_scans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 23 "26"_null_ _null_ _null_ _null_ 
_null_ pg_stat_get_last_vacuum_index_scans _null_ _null_ _null_ )); diff --git a/src/include/pgstat.h b/src/include/pgstat.h index e18a630..6079661 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -389,6 +389,8 @@ typedef struct PgStat_MsgVacuum TimestampTz m_vacuumtime; PgStat_Counter m_live_tuples; PgStat_Counterm_dead_tuples; + PgStat_Counter m_pages_removed; + PgStat_Counter m_pages_not_removed; PgStat_Counter m_num_index_scans; TransactionId m_oldest_xmin; PgStat_Counterm_vacuum_status; @@ -654,6 +656,8 @@ typedef struct PgStat_StatTabEntry PgStat_Counter n_live_tuples; PgStat_Counter n_dead_tuples; PgStat_Counter changes_since_analyze; + PgStat_Counter n_pages_removed; + PgStat_Counter n_pages_not_removed; PgStat_Counter n_index_scans; TransactionId oldest_xmin; @@ -1197,6 +1201,8 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type textern void pgstat_report_autovac(Oiddboid);extern void pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counterlivetuples, PgStat_Counter deadtuples, + PgStat_Counter pages_removed, + PgStat_Counter pages_not_removed, PgStat_Counter num_index_scans, TransactionId oldextxmin, PgStat_Counter status, PgStat_Counter last_phase, diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 18a122a..111d44f 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1765,6 +1765,8 @@ pg_stat_all_tables| SELECT c.oid AS relid, pg_stat_get_last_analyze_time(c.oid) AS last_analyze, pg_stat_get_last_autoanalyze_time(c.oid) AS last_autoanalyze, pg_stat_get_last_vacuum_index_scans(c.oid)AS last_vacuum_index_scans, + pg_stat_get_last_vacuum_truncated(c.oid) AS last_vacuum_truncated, + pg_stat_get_last_vacuum_untruncated(c.oid) AS last_vacuum_untruncated, pg_stat_get_last_vacuum_oldest_xmin(c.oid)AS last_vacuum_oldest_xmin, pg_stat_get_last_vacuum_status(c.oid) AS last_vacuum_status, pg_stat_get_last_vacuum_status_detail(c.oid) AS last_vacuum_status_detail, @@ -1918,6 +1920,8 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.last_vacuum_index_scans, + pg_stat_all_tables.last_vacuum_truncated, + pg_stat_all_tables.last_vacuum_untruncated, pg_stat_all_tables.last_vacuum_oldest_xmin, pg_stat_all_tables.last_vacuum_status, pg_stat_all_tables.last_vacuum_status_detail, @@ -1967,6 +1971,8 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.last_vacuum_index_scans, + pg_stat_all_tables.last_vacuum_truncated, + pg_stat_all_tables.last_vacuum_untruncated, pg_stat_all_tables.last_vacuum_oldest_xmin, pg_stat_all_tables.last_vacuum_status, pg_stat_all_tables.last_vacuum_status_detail, -- 2.9.2 From 6b83b307b0198c3902fc1e30944e02739c0a19cd Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Tue, 21 Nov 2017 09:57:51 +0900 Subject: [PATCH 3/4] Add vacuum execution status in pg_stat_all_tables The main objective of this patch is showing how vacuuming is failing. It is sometimes very hard to diagnose since autovacuum stops silently in most cases. This patch leaves the reason for vacuum failure in pg_stat_all_tables and how many times it is continuing to fail. 
---doc/src/sgml/monitoring.sgml | 20 +++++src/backend/catalog/system_views.sql | 4 +src/backend/commands/vacuum.c | 40 +++++++++src/backend/commands/vacuumlazy.c | 93 ++++++++++++++++++++-src/backend/postmaster/pgstat.c | 62 +++++++++++---src/backend/utils/adt/pgstatfuncs.c | 157 +++++++++++++++++++++++++++++++++++src/include/catalog/pg_proc.h | 8 ++src/include/commands/vacuum.h | 1+src/include/pgstat.h | 34 +++++++-src/test/regress/expected/rules.out | 12 +++10 files changed, 416insertions(+), 15 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 98c5f41..a0288cb 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -2580,6 +2580,21 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i <entry>Numberof splitted index scans performed during the last vacuum or autovacuum on this table</entry> </row> <row> + <entry><structfield>last_vacuum_oldest_xmin</structfield></entry> + <entry><type>bigint</type></entry> + <entry>Oldest xmin used by the last vacuum on this table</entry> + </row> + <row> + <entry><structfield>last_vacuum_status</structfield></entry> + <entry><type>text</type></entry> + <entry>Status of the last vacuum or autovacuum.</entry> + </row> + <row> + <entry><structfield>last_vacuum_status_detail</structfield></entry> + <entry><type>text</type></entry> + <entry>Details of the status of the last vacuum or autovacuum.</entry> + </row> + <row> <entry><structfield>vacuum_count</structfield></entry> <entry><type>bigint</type></entry> <entry>Numberof times this table has been manually vacuumed @@ -2592,6 +2607,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i daemon</entry> </row> <row> + <entry><structfield>vacuum_fail_count</structfield></entry> + <entry><type>integer</type></entry> + <entry>Number of continuously failed vacuum and autovacuum trials. 
Cleared to zero on completetion.</entry> + </row> + <row> <entry><structfield>analyze_count</structfield></entry> <entry><type>bigint</type></entry> <entry>Numberof times this table has been manually analyzed</entry> diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index b553bf4..c69fea9 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -529,8 +529,12 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_last_analyze_time(C.oid) as last_analyze, pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze, pg_stat_get_last_vacuum_index_scans(C.oid)AS last_vacuum_index_scans, + pg_stat_get_last_vacuum_oldest_xmin(C.oid) AS last_vacuum_oldest_xmin, + pg_stat_get_last_vacuum_status(C.oid) AS last_vacuum_status, + pg_stat_get_last_vacuum_status_detail(C.oid) AS last_vacuum_status_detail, pg_stat_get_vacuum_count(C.oid)AS vacuum_count, pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count, + pg_stat_get_vacuum_fail_count(C.oid) AS vacuum_fail_count, pg_stat_get_analyze_count(C.oid) AS analyze_count, pg_stat_get_autoanalyze_count(C.oid) AS autoanalyze_count FROM pg_class C LEFT JOIN diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index f51dcdb..cf0bca7 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -35,6 +35,7 @@#include "catalog/pg_inherits_fn.h"#include "catalog/pg_namespace.h"#include "commands/cluster.h" +#include "commands/progress.h"#include "commands/vacuum.h"#include "miscadmin.h"#include "nodes/makefuncs.h" @@ -367,6 +368,9 @@ vacuum(int options, List *relations, VacuumParams *params, } PG_CATCH(); { + /* report the final status of this vacuum */ + lazy_vacuum_cancel_handler(); + in_vacuum = false; VacuumCostActive = false; PG_RE_THROW(); @@ -1463,6 +1467,9 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) if (!onerel) { + pgstat_report_vacuum(relid, false, 0, 0, 0, InvalidTransactionId, + PGSTAT_VACUUM_SKIPPED, 0, + PGSTAT_VACUUM_LOCK_FAILED); PopActiveSnapshot(); CommitTransactionCommand(); return false; @@ -1494,6 +1501,13 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) (errmsg("skipping\"%s\" --- only table or database owner can vacuum it", RelationGetRelationName(onerel)))); relation_close(onerel, lmode); + + pgstat_report_vacuum(RelationGetRelid(onerel), + onerel->rd_rel->relisshared, + 0, 0, 0, InvalidTransactionId, + PGSTAT_VACUUM_FINISHED, 0, + PGSTAT_VACUUM_NONTARGET); + PopActiveSnapshot(); CommitTransactionCommand(); return false; @@ -1511,6 +1525,13 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) (errmsg("skipping\"%s\" --- cannot vacuum non-tables or special system tables", RelationGetRelationName(onerel)))); relation_close(onerel, lmode); + + pgstat_report_vacuum(RelationGetRelid(onerel), + onerel->rd_rel->relisshared, + 0, 0, 0, InvalidTransactionId, + PGSTAT_VACUUM_FINISHED, 0, + PGSTAT_VACUUM_NONTARGET); + PopActiveSnapshot(); CommitTransactionCommand(); return false; @@ -1526,6 +1547,13 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) if (RELATION_IS_OTHER_TEMP(onerel)) { relation_close(onerel, lmode); + + pgstat_report_vacuum(RelationGetRelid(onerel), + onerel->rd_rel->relisshared, + 0, 0, 0, InvalidTransactionId, + PGSTAT_VACUUM_FINISHED, 0, + PGSTAT_VACUUM_NONTARGET); + PopActiveSnapshot(); CommitTransactionCommand(); return false; @@ -1539,6 +1567,13 @@ vacuum_rel(Oid relid, RangeVar *relation, int 
options, VacuumParams *params) if (onerel->rd_rel->relkind== RELKIND_PARTITIONED_TABLE) { relation_close(onerel, lmode); + + pgstat_report_vacuum(RelationGetRelid(onerel), + onerel->rd_rel->relisshared, + 0, 0, 0, InvalidTransactionId, + PGSTAT_VACUUM_FINISHED, 0, + PGSTAT_VACUUM_NONTARGET); + PopActiveSnapshot(); CommitTransactionCommand(); /* It's OK to proceed with ANALYZE on this table*/ @@ -1584,6 +1619,8 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) */ if (options& VACOPT_FULL) { + bool isshared = onerel->rd_rel->relisshared; + /* close relation before vacuuming, but hold lock until commit */ relation_close(onerel, NoLock); onerel = NULL; @@ -1591,6 +1628,9 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params) /* VACUUM FULLis now a variant of CLUSTER; see cluster.c */ cluster_rel(relid, InvalidOid, false, (options& VACOPT_VERBOSE) != 0); + pgstat_report_vacuum(relid, isshared, 0, 0, 0, InvalidTransactionId, + PGSTAT_VACUUM_FINISHED, 0, + PGSTAT_VACUUM_FULL); } else lazy_vacuum_rel(onerel, options, params, vac_strategy); diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index 4274043..c53b4fa 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -105,6 +105,8 @@typedef struct LVRelStats{ + Oid reloid; /* oid of the target relation */ + bool shared; /* is shared relation? */ /* hasindex = true means two-pass strategy; false meansone-pass */ bool hasindex; /* Overall statistics about rel */ @@ -129,6 +131,7 @@ typedef struct LVRelStats int num_index_scans; TransactionId latestRemovedXid; bool lock_waiter_detected; + bool aggressive;} LVRelStats; @@ -138,6 +141,7 @@ static int elevel = -1;static TransactionId OldestXmin;static TransactionId FreezeLimit;static MultiXactIdMultiXactCutoff; +static LVRelStats *current_lvstats;static BufferAccessStrategy vac_strategy; @@ -201,6 +205,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, double new_live_tuples; TransactionId new_frozen_xid; MultiXactId new_min_multi; + int vacuum_status_details = 0; Assert(params != NULL); @@ -216,6 +221,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, else elevel = DEBUG2; + current_lvstats = NULL; pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM, RelationGetRelid(onerel)); @@ -236,11 +242,20 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, vacrelstats = (LVRelStats *)palloc0(sizeof(LVRelStats)); + vacrelstats->reloid = RelationGetRelid(onerel); + vacrelstats->shared = onerel->rd_rel->relisshared; vacrelstats->old_rel_pages = onerel->rd_rel->relpages; vacrelstats->old_rel_tuples= onerel->rd_rel->reltuples; vacrelstats->num_index_scans = 0; vacrelstats->pages_removed= 0; vacrelstats->lock_waiter_detected = false; + vacrelstats->aggressive = aggressive; + + /* + * Register current vacrelstats so that final status can be reported on + * interrupts + */ + current_lvstats = vacrelstats; /* Open all indexes of the relation */ vac_open_indexes(onerel, RowExclusiveLock,&nindexes, &Irel); @@ -272,8 +287,14 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, * Optionally truncate the relation. 
*/ if (should_attempt_truncation(vacrelstats)) + { lazy_truncate_heap(onerel, vacrelstats); + /* check if all empty pages are truncated */ + if (vacrelstats->rel_pages > vacrelstats->nonempty_pages) + vacuum_status_details |= PGSTAT_VACUUM_PARTIALLY_TRUNCATED; + } + /* Report that we are now doing final cleanup */ pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, PROGRESS_VACUUM_PHASE_FINAL_CLEANUP); @@ -331,11 +352,22 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, if (new_live_tuples < 0) new_live_tuples = 0; /* just in case */ - pgstat_report_vacuum(RelationGetRelid(onerel), + /* vacuum successfully finished. nothing to do on exit */ + current_lvstats = NULL; + + if (aggressive) + vacuum_status_details |= PGSTAT_VACUUM_AGGRESSIVE; + + vacuum_status_details |= PGSTAT_VACUUM_COMPLETE; + pgstat_report_vacuum(vacrelstats->reloid, onerel->rd_rel->relisshared, new_live_tuples, vacrelstats->new_dead_tuples, - vacrelstats->num_index_scans); + vacrelstats->num_index_scans, + OldestXmin, + PGSTAT_VACUUM_FINISHED, 0, + vacuum_status_details); + pgstat_progress_end_command(); /* and log the action if appropriate */ @@ -2198,3 +2230,60 @@ heap_page_is_all_visible(Relation rel, Buffer buf, return all_visible;} + +/* + * lazy_vacuum_cancel_handler - report interrupted vacuum status + */ +void +lazy_vacuum_cancel_handler(void) +{ + LVRelStats *stats = current_lvstats; + LocalPgBackendStatus *local_beentry; + PgBackendStatus *beentry; + int phase; + int status; + int details = 0; + + current_lvstats = NULL; + + /* we have nothing to report */ + if (!stats) + return; + + /* get vacuum progress stored in backend status */ + local_beentry = pgstat_fetch_stat_local_beentry(MyBackendId); + if (!local_beentry) + return; + + beentry = &local_beentry->backendStatus; + + Assert (beentry && beentry->st_progress_command == PROGRESS_COMMAND_VACUUM); + + phase = beentry->st_progress_param[PROGRESS_VACUUM_PHASE]; + + /* we can reach here both on interrupt and error */ + if (geterrcode() == ERRCODE_QUERY_CANCELED) + { + status = PGSTAT_VACUUM_CANCELED; + if (stats->aggressive) + details |= PGSTAT_VACUUM_AGGRESSIVE; + } + else + { + /* special case: details stores an sql error code */ + status = PGSTAT_VACUUM_ERROR; + details = geterrcode(); + } + + /* + * vacuum has been canceled, report stats numbers without normalization + * here. (But currently they are not used.) 
+ */ + pgstat_report_vacuum(stats->reloid, + stats->shared, + stats->new_rel_tuples, + stats->new_dead_tuples, + stats->num_index_scans, + OldestXmin, + status, phase, details); +} diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 5f3fdf6..3e1d051 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -1404,7 +1404,10 @@ pgstat_report_autovac(Oid dboid)voidpgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter livetuples, PgStat_Counter deadtuples, - PgStat_Counter num_index_scans) + PgStat_Counter num_index_scans, + TransactionId oldestxmin, + PgStat_Counter status, PgStat_Counter last_phase, + PgStat_Counter details){ PgStat_MsgVacuum msg; @@ -1419,6 +1422,10 @@ pgstat_report_vacuum(Oid tableoid, bool shared, msg.m_live_tuples = livetuples; msg.m_dead_tuples= deadtuples; msg.m_num_index_scans = num_index_scans; + msg.m_oldest_xmin = oldestxmin; + msg.m_vacuum_status = status; + msg.m_vacuum_last_phase = last_phase; + msg.m_vacuum_details = details; pgstat_send(&msg, sizeof(msg));} @@ -4588,6 +4595,7 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create) result->n_dead_tuples= 0; result->changes_since_analyze = 0; result->n_index_scans = 0; + result->oldest_xmin = InvalidTransactionId; result->blocks_fetched = 0; result->blocks_hit = 0; result->vacuum_timestamp = 0; @@ -4598,6 +4606,11 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create) result->analyze_count= 0; result->autovac_analyze_timestamp = 0; result->autovac_analyze_count = 0; + + result->vacuum_status = 0; + result->vacuum_last_phase = 0; + result->vacuum_details = 0; + result->vacuum_failcount = 0; } return result; @@ -5982,19 +5995,44 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len) tabentry = pgstat_get_tab_entry(dbentry, msg->m_tableoid,true); - tabentry->n_live_tuples = msg->m_live_tuples; - tabentry->n_dead_tuples = msg->m_dead_tuples; - tabentry->n_index_scans = msg->m_num_index_scans; + tabentry->vacuum_status = msg->m_vacuum_status; + tabentry->vacuum_last_phase = msg->m_vacuum_last_phase; + tabentry->vacuum_details = msg->m_vacuum_details; - if (msg->m_autovacuum) - { - tabentry->autovac_vacuum_timestamp = msg->m_vacuumtime; - tabentry->autovac_vacuum_count++; - } - else + /* + * We store the numbers only when the vacuum has been completed. They + * might be usable to find how much the stopped vacuum processed but we + * choose not to show them rather than show bogus numbers. 
+ */ + switch ((StatVacuumStatus)msg->m_vacuum_status) { - tabentry->vacuum_timestamp = msg->m_vacuumtime; - tabentry->vacuum_count++; + case PGSTAT_VACUUM_FINISHED: + tabentry->n_live_tuples = msg->m_live_tuples; + tabentry->n_dead_tuples = msg->m_dead_tuples; + tabentry->n_index_scans = msg->m_num_index_scans; + tabentry->oldest_xmin = msg->m_oldest_xmin; + tabentry->vacuum_failcount = 0; + + if (msg->m_autovacuum) + { + tabentry->autovac_vacuum_timestamp = msg->m_vacuumtime; + tabentry->autovac_vacuum_count++; + } + else + { + tabentry->vacuum_timestamp = msg->m_vacuumtime; + tabentry->vacuum_count++; + } + break; + + case PGSTAT_VACUUM_ERROR: + case PGSTAT_VACUUM_CANCELED: + case PGSTAT_VACUUM_SKIPPED: + tabentry->vacuum_failcount++; + break; + + default: + break; }} diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index ab80794..0fba265 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -219,6 +219,163 @@ pg_stat_get_last_vacuum_index_scans(PG_FUNCTION_ARGS)}Datum +pg_stat_get_last_vacuum_oldest_xmin(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + TransactionId result; + PgStat_StatTabEntry *tabentry; + + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) + result = InvalidTransactionId; + else + result = (int32) (tabentry->oldest_xmin); + + return TransactionIdGetDatum(result); +} + +Datum +pg_stat_get_last_vacuum_status(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + char *result = "unknown"; + PgStat_StatTabEntry *tabentry; + + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) != NULL) + { + StatVacuumStatus status; + + status = tabentry->vacuum_status; + switch (status) + { + case PGSTAT_VACUUM_FINISHED: + result = "completed"; + break; + case PGSTAT_VACUUM_ERROR: + result = "error"; + break; + case PGSTAT_VACUUM_CANCELED: + result = "canceled"; + break; + case PGSTAT_VACUUM_SKIPPED: + result = "skipped"; + break; + default: + result = psprintf("unknown status: %d", status); + break; + } + } + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} + +Datum +pg_stat_get_last_vacuum_status_detail(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + char *result = "unknown"; + PgStat_StatTabEntry *tabentry; + StringInfoData str; + + /* + * status string. 
this must be synced with the strings shown by the + * statistics view "pg_stat_progress_vacuum" + */ + static char *phasestr[] = + {"initialization", + "scanning heap", + "vacuuming indexes", + "vacuuming heap", + "cleaning up indexes", + "trucating heap", + "performing final cleanup"}; + static char *detailstr[] = + {NULL, /* PGSTAT_VACUUM_COMPLETE */ + "aggressive", /* PGSTAT_VACUUM_AGGRESSIVE */ + "full", /* PGSTAT_VACUUM_FULL */ + "lock failure", /* PGSTAT_VACUUM_LOCK_FAILED */ + "not a target", /* PGSTAT_VACUUM_NONTARGET */ + "partially truncated" /* PGSTAT_VACUUM_PARTIALLY_TRUNCATED */ + }; + + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) != NULL) + { + int phase; + StatVacuumStatus status; + bool first = true; + int i; + + initStringInfo(&str); + + status = tabentry->vacuum_status; + switch (status) + { + case PGSTAT_VACUUM_ERROR: + /* details is storing an sql error code */ + appendStringInfoString( + &str, + format_elog_string( + "sqlcode: %s, ", + unpack_sql_state((int)tabentry->vacuum_details))); + + /* FALL THROUGH */ + + case PGSTAT_VACUUM_CANCELED: + phase = tabentry->vacuum_last_phase; + /* number of elements of phasestr above */ + if (phase >= 0 && phase <= 7) + appendStringInfoString(&str, phasestr[phase]); + + result = str.data; + break; + + case PGSTAT_VACUUM_FINISHED: + case PGSTAT_VACUUM_SKIPPED: + for (i = 0 ; i < PGSTAT_VACUUM_NDETAILS ; i++) + { + if ((tabentry->vacuum_details & (1 << i)) == 0) + continue; + + if (detailstr[i] == NULL) + continue; + + if (first) + first = false; + else + appendStringInfoString(&str, ", "); + + appendStringInfoString(&str, detailstr[i]); + } + result = str.data; + break; + + default: + result = "unknwon error"; + break; + } + } + + if (result == NULL) + PG_RETURN_NULL(); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} + +Datum +pg_stat_get_vacuum_fail_count(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + int64 result; + PgStat_StatTabEntry *tabentry; + + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) + result = 0; + else + result = (int32) (tabentry->vacuum_failcount); + + PG_RETURN_INT32(result); +} + +Datumpg_stat_get_blocks_fetched(PG_FUNCTION_ARGS){ Oid relid = PG_GETARG_OID(0); diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 6b84c9a..48e6942 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -2889,8 +2889,16 @@ DATA(insert OID = 6118 ( pg_stat_get_subscription PGNSP PGUID 12 1 0 0 0 f f fDESCR("statistics:information about subscription");DATA(insert OID = 2579 ( pg_stat_get_vacuum_necessity PGNSP PGUID12 1 0 0 0 f f f f t f s r 1 0 25 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_vacuum_necessity _null_ _null__null_ ));DESCR("statistics: true if needs vacuum"); +DATA(insert OID = 3424 ( pg_stat_get_last_vacuum_oldest_xmin PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 28 "26" _null__null_ _null_ _null_ _null_ pg_stat_get_last_vacuum_oldest_xmin _null_ _null_ _null_ )); +DESCR("statistics: The oldest xmin used in the last vacuum");DATA(insert OID = 3281 ( pg_stat_get_last_vacuum_index_scans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 23 "26" _null_ _null_ _null_ _null_ _null_pg_stat_get_last_vacuum_index_scans _null_ _null_ _null_ ));DESCR("statistics: number of index scans in the last vacuum"); +DATA(insert OID = 3419 ( pg_stat_get_last_vacuum_status PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 25 "26" _null_ _null__null_ _null_ _null_ pg_stat_get_last_vacuum_status _null_ _null_ _null_ )); +DESCR("statistics: ending status of the last 
vacuum"); +DATA(insert OID = 3420 ( pg_stat_get_last_vacuum_status_detail PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 25 "26" _null__null_ _null_ _null_ _null_ pg_stat_get_last_vacuum_status_detail _null_ _null_ _null_ )); +DESCR("statistics: ending status of the last vacuum"); +DATA(insert OID = 3421 ( pg_stat_get_vacuum_fail_count PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 23 "26" _null_ _null__null_ _null_ _null_ pg_stat_get_vacuum_fail_count _null_ _null_ _null_ )); +DESCR("statistics: number of successively failed vacuum trials");DATA(insert OID = 2026 ( pg_backend_pid PGNSP PGUID 12 1 0 0 0 f f f f t f s r 0 0 23 "" _null_ _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_));DESCR("statistics: current backend PID");DATA(insert OID = 1937 ( pg_stat_get_backend_pid PGNSP PGUID 121 0 0 0 f f f f t f s r 1 0 23 "23" _null_ _null_ _null_ _null_ _null_ pg_stat_get_backend_pid _null_ _null_ _null_ )); diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 84bec74..da3107a 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -190,6 +190,7 @@ extern void vacuum_delay_point(void);/* in commands/vacuumlazy.c */extern void lazy_vacuum_rel(Relationonerel, int options, VacuumParams *params, BufferAccessStrategy bstrategy); +extern void lazy_vacuum_cancel_handler(void);/* in commands/analyze.c */extern void analyze_rel(Oid relid, RangeVar *relation,int options, diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 3ab5f4a..e18a630 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -67,6 +67,26 @@ typedef enum StatMsgType PGSTAT_MTYPE_DEADLOCK} StatMsgType; +/* + * The exit status stored in vacuum report. + */ +typedef enum StatVacuumStatus +{ + PGSTAT_VACUUM_FINISHED, + PGSTAT_VACUUM_CANCELED, + PGSTAT_VACUUM_ERROR, + PGSTAT_VACUUM_SKIPPED +} StatVacuumStatus; + +/* bitmap for vacuum status details, except for PGSTAT_VACUUM_ERROR_WITH_CODE */ +#define PGSTAT_VACUUM_COMPLETE (1 << 0) +#define PGSTAT_VACUUM_AGGRESSIVE (1 << 1) +#define PGSTAT_VACUUM_FULL (1 << 2) +#define PGSTAT_VACUUM_LOCK_FAILED (1 << 3) +#define PGSTAT_VACUUM_NONTARGET (1 << 4) +#define PGSTAT_VACUUM_PARTIALLY_TRUNCATED (1 << 5) +#define PGSTAT_VACUUM_NDETAILS 6 +/* ---------- * The data type used for counters. 
* ---------- @@ -370,6 +390,10 @@ typedef struct PgStat_MsgVacuum PgStat_Counter m_live_tuples; PgStat_Counter m_dead_tuples; PgStat_Counter m_num_index_scans; + TransactionId m_oldest_xmin; + PgStat_Counter m_vacuum_status; + PgStat_Counter m_vacuum_last_phase; + PgStat_Counter m_vacuum_details;} PgStat_MsgVacuum; @@ -631,6 +655,7 @@ typedef struct PgStat_StatTabEntry PgStat_Counter n_dead_tuples; PgStat_Counter changes_since_analyze; PgStat_Counter n_index_scans; + TransactionId oldest_xmin; PgStat_Counter blocks_fetched; PgStat_Counter blocks_hit; @@ -643,6 +668,10 @@ typedef struct PgStat_StatTabEntry PgStat_Counter analyze_count; TimestampTz autovac_analyze_timestamp; /* autovacuum initiated */ PgStat_Counter autovac_analyze_count; + PgStat_Counter vacuum_status; + PgStat_Counter vacuum_last_phase; + PgStat_Counter vacuum_details; + PgStat_Counter vacuum_failcount;} PgStat_StatTabEntry; @@ -1168,7 +1197,10 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type textern void pgstat_report_autovac(Oiddboid);extern void pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counterlivetuples, PgStat_Counter deadtuples, - PgStat_Counter num_index_scans); + PgStat_Counter num_index_scans, + TransactionId oldextxmin, + PgStat_Counter status, PgStat_Counter last_phase, + PgStat_Counter detail);extern void pgstat_report_analyze(Relation rel, PgStat_Counterlivetuples, PgStat_Counter deadtuples, bool resetcounter); diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index e827842..18a122a 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1765,8 +1765,12 @@ pg_stat_all_tables| SELECT c.oid AS relid, pg_stat_get_last_analyze_time(c.oid) AS last_analyze, pg_stat_get_last_autoanalyze_time(c.oid) AS last_autoanalyze, pg_stat_get_last_vacuum_index_scans(c.oid)AS last_vacuum_index_scans, + pg_stat_get_last_vacuum_oldest_xmin(c.oid) AS last_vacuum_oldest_xmin, + pg_stat_get_last_vacuum_status(c.oid) AS last_vacuum_status, + pg_stat_get_last_vacuum_status_detail(c.oid) AS last_vacuum_status_detail, pg_stat_get_vacuum_count(c.oid) AS vacuum_count, pg_stat_get_autovacuum_count(c.oid) AS autovacuum_count, + pg_stat_get_vacuum_fail_count(c.oid) AS vacuum_fail_count, pg_stat_get_analyze_count(c.oid) AS analyze_count, pg_stat_get_autoanalyze_count(c.oid) AS autoanalyze_count FROM ((pg_class c @@ -1914,8 +1918,12 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.last_vacuum_index_scans, + pg_stat_all_tables.last_vacuum_oldest_xmin, + pg_stat_all_tables.last_vacuum_status, + pg_stat_all_tables.last_vacuum_status_detail, pg_stat_all_tables.vacuum_count, pg_stat_all_tables.autovacuum_count, + pg_stat_all_tables.vacuum_fail_count, pg_stat_all_tables.analyze_count, pg_stat_all_tables.autoanalyze_count FROM pg_stat_all_tables @@ -1959,8 +1967,12 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.last_vacuum_index_scans, + pg_stat_all_tables.last_vacuum_oldest_xmin, + pg_stat_all_tables.last_vacuum_status, + pg_stat_all_tables.last_vacuum_status_detail, pg_stat_all_tables.vacuum_count, pg_stat_all_tables.autovacuum_count, + pg_stat_all_tables.vacuum_fail_count, pg_stat_all_tables.analyze_count, pg_stat_all_tables.autoanalyze_count FROM pg_stat_all_tables -- 2.9.2 From 017486cfe6231ed43d8ebb9d397f2699840d27c5 Mon Sep 17 
00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Thu, 16 Nov 2017 16:18:54 +0900 Subject: [PATCH 2/4] Add vacuum_required to pg_stat_all_tables If vacuum of a table has been failed for a long time for some reasons, it is hard for uses to distinguish between that the server judged vacuuming of the table is not required and that rquired but failed. This offers convenient way to check that as the first step of trouble shooting. ---doc/src/sgml/config.sgml | 5 +-doc/src/sgml/maintenance.sgml | 4 +-doc/src/sgml/monitoring.sgml | 5 ++src/backend/catalog/system_views.sql | 1 +src/backend/commands/cluster.c | 2 +-src/backend/commands/vacuum.c | 69 ++++++++++++++++++---src/backend/commands/vacuumlazy.c | 14 +----src/backend/postmaster/autovacuum.c | 115 +++++++++++++++++++++++++++++++++++src/backend/utils/adt/pgstatfuncs.c | 9 +++src/include/catalog/pg_proc.h | 2 +src/include/commands/vacuum.h | 3 +-src/include/postmaster/autovacuum.h | 1 +src/test/regress/expected/rules.out | 3 +13 files changed, 210 insertions(+), 23 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index b51d219..5bf0b33 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -6579,7 +6579,10 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; <para> <command>VACUUM</command>performs an aggressive scan if the table's <structname>pg_class</structname>.<structfield>relfrozenxid</structfield>field has reached - the age specified by this setting. An aggressive scan differs from + the age specified by this setting. It is indicated + as <quote>aggressive</quote> in vacuum_required + of <xref linkend="pg-stat-all-tables-view">. An aggressive scan + differs from a regular <command>VACUUM</command> in that it visits every page that might contain unfrozenXIDs or MXIDs, not just those that might contain dead tuples. The default is 150 million transactions. Althoughusers can diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index 1a37905..d045b09 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -514,7 +514,9 @@ <varname>autovacuum_freeze_max_age</varname> wouldn't make sense because an anti-wraparound autovacuumwould be triggered at that point anyway, and the 0.95 multiplier leaves some breathing room to run a manual - <command>VACUUM</command> before that happens. As a rule of thumb, + <command>VACUUM</command> before that happens. It is indicated + as <quote>close to freeze-limit xid</quote> in vacuum_required + of <xref linkend="pg-stat-all-tables-view">. As a rule of thumb, <command>vacuum_freeze_table_age</command> shouldbe set to a value somewhat below <varname>autovacuum_freeze_max_age</varname>, leaving enough gap so that a regularlyscheduled <command>VACUUM</command> or an autovacuum triggered by diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 6a57688..98c5f41 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -2547,6 +2547,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i <entry>Estimatednumber of rows modified since this table was last analyzed</entry> </row> <row> + <entry><structfield>vacuum_required</structfield></entry> + <entry><type>text</type></entry> + <entry>Vacuum requirement status. 
"partial", "aggressive", "required", "not requried" or "close to freeze-limit xid".</entry> + </row> + <row> <entry><structfield>last_vacuum</structfield></entry> <entry><type>timestamp with time zone</type></entry> <entry>Last time at which this table was manually vacuumed diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index aeba9d5..b553bf4 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -523,6 +523,7 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_live_tuples(C.oid) AS n_live_tup, pg_stat_get_dead_tuples(C.oid) AS n_dead_tup, pg_stat_get_mod_since_analyze(C.oid) AS n_mod_since_analyze, + pg_stat_get_vacuum_necessity(C.oid) AS vacuum_required, pg_stat_get_last_vacuum_time(C.oid) as last_vacuum, pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum, pg_stat_get_last_analyze_time(C.oid)as last_analyze, diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 48f1e6e..403b76d 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -850,7 +850,7 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose, */ vacuum_set_xid_limits(OldHeap,0, 0, 0, 0, &OldestXmin, &FreezeXid, NULL, &MultiXactCutoff, - NULL); + NULL, NULL, NULL); /* * FreezeXid will become the table's new relfrozenxid, and that mustn'tgo diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index cbd6e9b..f51dcdb 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -585,6 +585,10 @@ get_all_vacuum_rels(void) * Xmax. * - mxactFullScanLimit is a value against which a table's relminmxidvalue is * compared to produce a full-table vacuum, as with xidFullScanLimit. + * - aggressive is set if it is not NULL and set true if the table needs + * aggressive scan. + * - close_to_wrap_around_limit is set if it is not NULL and set true if it is + * in anti-anti-wraparound window. * * xidFullScanLimit and mxactFullScanLimit can be passed as NULL if caller is * notinterested. 
@@ -599,9 +603,11 @@ vacuum_set_xid_limits(Relation rel, TransactionId *freezeLimit, TransactionId *xidFullScanLimit, MultiXactId *multiXactCutoff, - MultiXactId *mxactFullScanLimit) + MultiXactId *mxactFullScanLimit, + bool *aggressive, bool *close_to_wrap_around_limit){ int freezemin; + int freezemax; int mxid_freezemin; int effective_multixact_freeze_max_age; TransactionId limit; @@ -701,11 +707,13 @@ vacuum_set_xid_limits(Relation rel, *multiXactCutoff = mxactLimit; - if (xidFullScanLimit != NULL) + if (xidFullScanLimit != NULL || aggressive != NULL) { int freezetable; + bool maybe_anti_wrapround = false; - Assert(mxactFullScanLimit != NULL); + /* these two output should be requested together */ + Assert(xidFullScanLimit == NULL || mxactFullScanLimit != NULL); /* * Determine the table freeze ageto use: as specified by the caller, @@ -717,7 +725,14 @@ vacuum_set_xid_limits(Relation rel, freezetable = freeze_table_age; if (freezetable <0) freezetable = vacuum_freeze_table_age; - freezetable = Min(freezetable, autovacuum_freeze_max_age * 0.95); + + freezemax = autovacuum_freeze_max_age * 0.95; + if (freezemax < freezetable) + { + /* We may be in anti-anti-warparound window */ + freezetable = freezemax; + maybe_anti_wrapround = true; + } Assert(freezetable >= 0); /* @@ -728,7 +743,8 @@ vacuum_set_xid_limits(Relation rel, if (!TransactionIdIsNormal(limit)) limit = FirstNormalTransactionId; - *xidFullScanLimit = limit; + if (xidFullScanLimit) + *xidFullScanLimit = limit; /* * Similar to the above, determine the table freeze age to use for @@ -741,10 +757,20 @@ vacuum_set_xid_limits(Relation rel, freezetable = multixact_freeze_table_age; if (freezetable< 0) freezetable = vacuum_multixact_freeze_table_age; - freezetable = Min(freezetable, - effective_multixact_freeze_max_age * 0.95); + + freezemax = effective_multixact_freeze_max_age * 0.95; + if (freezemax < freezetable) + { + /* We may be in anti-anti-warparound window */ + freezetable = freezemax; + maybe_anti_wrapround = true; + } Assert(freezetable >= 0); + /* We may be in anti-anti-warparound window */ + if (effective_multixact_freeze_max_age * 0.95 < freezetable) + maybe_anti_wrapround = true; + /* * Compute MultiXact limit causing a full-table vacuum, being careful * to generate a valid MultiXactvalue. @@ -753,11 +779,38 @@ vacuum_set_xid_limits(Relation rel, if (mxactLimit < FirstMultiXactId) mxactLimit= FirstMultiXactId; - *mxactFullScanLimit = mxactLimit; + if (mxactFullScanLimit) + *mxactFullScanLimit = mxactLimit; + + /* + * We request an aggressive scan if the table's frozen Xid is now + * older than or equal to the requested Xid full-table scan limit; or + * if the table's minimum MultiXactId is older than or equal to the + * requested mxid full-table scan limit. 
+		 */
+		if (aggressive)
+		{
+			*aggressive =
+				TransactionIdPrecedesOrEquals(rel->rd_rel->relfrozenxid,
+											  limit);
+			*aggressive |=
+				MultiXactIdPrecedesOrEquals(rel->rd_rel->relminmxid,
+											mxactLimit);
+
+			/* set close_to_wrap_around_limit if requested */
+			if (close_to_wrap_around_limit)
+				*close_to_wrap_around_limit =
+					(*aggressive && maybe_anti_wrapround);
+		}
+		else
+		{
+			Assert(!close_to_wrap_around_limit);
+		}
 	}
 	else
 	{
 		Assert(mxactFullScanLimit == NULL);
+		Assert(aggressive == NULL);
 	}
 }
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index c482c8e..4274043 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -227,18 +227,10 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						  params->multixact_freeze_min_age,
 						  params->multixact_freeze_table_age,
 						  &OldestXmin, &FreezeLimit, &xidFullScanLimit,
-						  &MultiXactCutoff, &mxactFullScanLimit);
+						  &MultiXactCutoff, &mxactFullScanLimit,
+						  &aggressive, NULL);
 
-	/*
-	 * We request an aggressive scan if the table's frozen Xid is now older
-	 * than or equal to the requested Xid full-table scan limit; or if the
-	 * table's minimum MultiXactId is older than or equal to the requested
-	 * mxid full-table scan limit; or if DISABLE_PAGE_SKIPPING was specified.
-	 */
-	aggressive = TransactionIdPrecedesOrEquals(onerel->rd_rel->relfrozenxid,
-											   xidFullScanLimit);
-	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
-											  mxactFullScanLimit);
+	/* force an aggressive scan if DISABLE_PAGE_SKIPPING was specified */
 	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
 		aggressive = true;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 48765bb..abbf660 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -831,6 +831,121 @@ shutdown:
 }
 
 /*
+ * Returns the status string of autovacuum requirement for the relation
+ */
+char *
+AutoVacuumRequirement(Oid reloid)
+{
+	Relation	classRel;
+	Relation	rel;
+	TupleDesc	pg_class_desc;
+	HeapTuple	tuple;
+	Form_pg_class classForm;
+	AutoVacOpts *relopts;
+	PgStat_StatTabEntry *tabentry;
+	PgStat_StatDBEntry *shared;
+	PgStat_StatDBEntry *dbentry;
+	int			effective_multixact_freeze_max_age;
+	bool		dovacuum;
+	bool		doanalyze;
+	bool		wraparound;
+	bool		aggressive;
+	bool		xid_calculated = false;
+	bool		in_anti_wa_window = false;
+	char	   *ret = "not required";
+
+	/* Compute the multixact age for which freezing is urgent. */
+	effective_multixact_freeze_max_age = MultiXactMemberFreezeThreshold();
+
+	/* Fetch the pg_class entry for this relation */
+	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(reloid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for relation %u", reloid);
+	classForm = (Form_pg_class) GETSTRUCT(tuple);
+
+	/* extract relopts for autovacuum */
+	classRel = heap_open(RelationRelationId, AccessShareLock);
+	pg_class_desc = RelationGetDescr(classRel);
+	relopts = extract_autovac_opts(tuple, pg_class_desc);
+	heap_close(classRel, AccessShareLock);
+
+	/* Fetch the pgstat shared entry and entry for this database */
+	shared = pgstat_fetch_stat_dbentry(InvalidOid);
+	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+
+	/* Fetch the pgstat entry for this table */
+	tabentry = get_pgstat_tabentry_relid(reloid, classForm->relisshared,
+										 shared, dbentry);
+
+	/*
+	 * Check if the relation needs vacuum.  This function is intended to
+	 * suggest an aggressive vacuum for the last 5% window before
+	 * autovacuum_freeze_max_age, so the wraparound variable is ignored
+	 * here.  See vacuum_set_xid_limits for details.
+	 */
+	relation_needs_vacanalyze(reloid, relopts, classForm, tabentry,
+							  effective_multixact_freeze_max_age,
+							  &dovacuum, &doanalyze, &wraparound);
+	ReleaseSysCache(tuple);
+
+	/* get further information if needed */
+	rel = NULL;
+
+	/* don't get stuck waiting for a lock */
+	if (ConditionalLockRelationOid(reloid, AccessShareLock))
+		rel = try_relation_open(reloid, NoLock);
+
+	if (rel)
+	{
+		TransactionId OldestXmin, FreezeLimit;
+		MultiXactId MultiXactCutoff;
+
+		vacuum_set_xid_limits(rel,
+							  vacuum_freeze_min_age,
+							  vacuum_freeze_table_age,
+							  vacuum_multixact_freeze_min_age,
+							  vacuum_multixact_freeze_table_age,
+							  &OldestXmin, &FreezeLimit, NULL,
+							  &MultiXactCutoff, NULL,
+							  &aggressive, &in_anti_wa_window);
+
+		xid_calculated = true;
+		relation_close(rel, AccessShareLock);
+	}
+
+	/* choose the proper message according to the calculation above */
+	if (xid_calculated)
+	{
+		if (dovacuum)
+		{
+			/* we don't care about anti-wraparound if autovacuum is on */
+			if (aggressive)
+				ret = "aggressive";
+			else
+				ret = "partial";
+		}
+		else if (in_anti_wa_window)
+			ret = "close to freeze-limit xid";
+		/* otherwise just "not required" */
+	}
+	else
+	{
+		/*
+		 * Failed to compute the xid limits, so show coarser-grained
+		 * messages.  Just "required" is enough to distinguish the autovacuum
+		 * case from the finer-grained messages above, but the case where
+		 * autovacuum is turned off needs additional words.
+		 */
+		if (dovacuum)
+			ret = "required";
+		else
+			ret = "not required (lock not acquired)";
+	}
+
+	return ret;
+}
+
+/*
  * Determine the time to sleep, based on the database list.
* * The "canlaunch" parameter indicates whether we can starta worker right now, diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index 2956356..ab80794 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -23,6 +23,7 @@#include "pgstat.h"#include "postmaster/bgworker_internals.h"#include "postmaster/postmaster.h" +#include "postmaster/autovacuum.h"#include "storage/proc.h"#include "storage/procarray.h"#include "utils/acl.h" @@ -195,6 +196,14 @@ pg_stat_get_mod_since_analyze(PG_FUNCTION_ARGS)}Datum +pg_stat_get_vacuum_necessity(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + + PG_RETURN_TEXT_P(cstring_to_text(AutoVacuumRequirement(relid))); +} + +Datumpg_stat_get_last_vacuum_index_scans(PG_FUNCTION_ARGS){ Oid relid = PG_GETARG_OID(0); diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index f3b606b..6b84c9a 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -2887,6 +2887,8 @@ DATA(insert OID = 3317 ( pg_stat_get_wal_receiver PGNSP PGUID 12 1 0 0 0 f f fDESCR("statistics:information about WAL receiver");DATA(insert OID = 6118 ( pg_stat_get_subscription PGNSP PGUID 121 0 0 0 f f f f f f s r 1 0 2249 "26" "{26,26,26,23,3220,1184,1184,3220,1184}" "{i,o,o,o,o,o,o,o,o}" "{subid,subid,relid,pid,received_lsn,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time}"_null_ _null_pg_stat_get_subscription _null_ _null_ _null_ ));DESCR("statistics: information about subscription"); +DATA(insert OID = 2579 ( pg_stat_get_vacuum_necessity PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 25 "26" _null_ _null__null_ _null_ _null_ pg_stat_get_vacuum_necessity _null_ _null_ _null_ )); +DESCR("statistics: true if needs vacuum");DATA(insert OID = 3281 ( pg_stat_get_last_vacuum_index_scans PGNSP PGUID 121 0 0 0 f f f f t f s r 1 0 23 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_last_vacuum_index_scans _null_ _null__null_ ));DESCR("statistics: number of index scans in the last vacuum");DATA(insert OID = 2026 ( pg_backend_pid PGNSP PGUID 12 1 0 0 0 f f f f t f s r 0 0 23 "" _null_ _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null__null_ )); diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 60586b2..84bec74 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -182,7 +182,8 @@ extern void vacuum_set_xid_limits(Relation rel, TransactionId *freezeLimit, TransactionId *xidFullScanLimit, MultiXactId *multiXactCutoff, - MultiXactId *mxactFullScanLimit); + MultiXactId *mxactFullScanLimit, + bool *aggressive, bool *in_wa_window);extern void vac_update_datfrozenxid(void);extern void vacuum_delay_point(void); diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h index 3469915..848a322 100644 --- a/src/include/postmaster/autovacuum.h +++ b/src/include/postmaster/autovacuum.h @@ -49,6 +49,7 @@ extern int Log_autovacuum_min_duration;extern bool AutoVacuumingActive(void);extern bool IsAutoVacuumLauncherProcess(void);externbool IsAutoVacuumWorkerProcess(void); +extern char *AutoVacuumRequirement(Oid reloid);#define IsAnyAutoVacuumProcess() \ (IsAutoVacuumLauncherProcess() || IsAutoVacuumWorkerProcess()) diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index d0bb46c..e827842 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1759,6 +1759,7 @@ pg_stat_all_tables| SELECT c.oid AS relid, 
     pg_stat_get_live_tuples(c.oid) AS n_live_tup,
     pg_stat_get_dead_tuples(c.oid) AS n_dead_tup,
     pg_stat_get_mod_since_analyze(c.oid) AS n_mod_since_analyze,
+    pg_stat_get_vacuum_necessity(c.oid) AS vacuum_required,
     pg_stat_get_last_vacuum_time(c.oid) AS last_vacuum,
     pg_stat_get_last_autovacuum_time(c.oid) AS last_autovacuum,
     pg_stat_get_last_analyze_time(c.oid) AS last_analyze,
@@ -1907,6 +1908,7 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.n_live_tup,
     pg_stat_all_tables.n_dead_tup,
     pg_stat_all_tables.n_mod_since_analyze,
+    pg_stat_all_tables.vacuum_required,
     pg_stat_all_tables.last_vacuum,
     pg_stat_all_tables.last_autovacuum,
     pg_stat_all_tables.last_analyze,
@@ -1951,6 +1953,7 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.n_live_tup,
     pg_stat_all_tables.n_dead_tup,
     pg_stat_all_tables.n_mod_since_analyze,
+    pg_stat_all_tables.vacuum_required,
     pg_stat_all_tables.last_vacuum,
     pg_stat_all_tables.last_autovacuum,
     pg_stat_all_tables.last_analyze,
-- 
2.9.2

From 176973d844c0965c4c7f89025b968790c886f6c0 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 16 Nov 2017 15:27:53 +0900
Subject: [PATCH 1/4] Show index scans of the last vacuum in pg_stat_all_tables

This number is already shown in the autovacuum completion log and in the
output of VACUUM VERBOSE, but it is also useful for checking whether
maintenance_work_mem is large enough, so this patch adds it to the
pg_stat_all_tables view.
---
 doc/src/sgml/config.sgml             |  9 +++++++++
 doc/src/sgml/monitoring.sgml         |  5 +++++
 src/backend/catalog/system_views.sql |  1 +
 src/backend/commands/vacuumlazy.c    |  3 ++-
 src/backend/postmaster/pgstat.c      |  6 +++++-
 src/backend/utils/adt/pgstatfuncs.c  | 14 ++++++++++++++
 src/include/catalog/pg_proc.h        |  2 ++
 src/include/pgstat.h                 |  5 ++++-
 src/test/regress/expected/rules.out  |  3 +++
 9 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fc1752f..b51d219 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1502,6 +1502,15 @@ include_dir 'conf.d'
         too high.  It may be useful to control for this by separately
         setting <xref linkend="guc-autovacuum-work-mem">.
        </para>
+       <para>
+        Vacuum scans all index pages to remove index entries that pointed to
+        dead tuples.  A new index scan is triggered whenever the in-memory
+        storage for dead tuple pointers, whose size is defined by
+        autovacuum_work_mem, becomes full, so finishing vacuum with a minimal
+        number of index scans reduces the time it takes to complete;
+        increasing this parameter can make the operation finish more quickly.
+        This can be monitored with <xref linkend="pg-stat-all-tables-view">.
+       </para>
      </listitem>
     </varlistentry>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 6f82033..6a57688 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2570,6 +2570,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
      daemon</entry>
     </row>
     <row>
+     <entry><structfield>last_vacuum_index_scans</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Number of separate index scan passes performed during the last vacuum or autovacuum on this table</entry>
+    </row>
+    <row>
      <entry><structfield>vacuum_count</structfield></entry>
      <entry><type>bigint</type></entry>
      <entry>Number of times this table has been manually vacuumed
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 394aea8..aeba9d5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -527,6 +527,7 @@ CREATE VIEW pg_stat_all_tables AS
             pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
             pg_stat_get_last_analyze_time(C.oid) as last_analyze,
             pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
+            pg_stat_get_last_vacuum_index_scans(C.oid) AS last_vacuum_index_scans,
             pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
             pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count,
             pg_stat_get_analyze_count(C.oid) AS analyze_count,
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 6587db7..c482c8e 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -342,7 +342,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	pgstat_report_vacuum(RelationGetRelid(onerel),
 						 onerel->rd_rel->relisshared,
 						 new_live_tuples,
-						 vacrelstats->new_dead_tuples);
+						 vacrelstats->new_dead_tuples,
+						 vacrelstats->num_index_scans);
 	pgstat_progress_end_command();
 
 	/* and log the action if appropriate */
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 5c256ff..5f3fdf6 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1403,7 +1403,8 @@ pgstat_report_autovac(Oid dboid)
  */
 void
 pgstat_report_vacuum(Oid tableoid, bool shared,
-					 PgStat_Counter livetuples, PgStat_Counter deadtuples)
+					 PgStat_Counter livetuples, PgStat_Counter deadtuples,
+					 PgStat_Counter num_index_scans)
 {
 	PgStat_MsgVacuum msg;
 
@@ -1417,6 +1418,7 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 	msg.m_vacuumtime = GetCurrentTimestamp();
 	msg.m_live_tuples = livetuples;
 	msg.m_dead_tuples = deadtuples;
+	msg.m_num_index_scans = num_index_scans;
 	pgstat_send(&msg, sizeof(msg));
 }
 
@@ -4585,6 +4587,7 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
 		result->n_live_tuples = 0;
 		result->n_dead_tuples = 0;
 		result->changes_since_analyze = 0;
+		result->n_index_scans = 0;
 		result->blocks_fetched = 0;
 		result->blocks_hit = 0;
 		result->vacuum_timestamp = 0;
@@ -5981,6 +5984,7 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len)
 
 	tabentry->n_live_tuples = msg->m_live_tuples;
 	tabentry->n_dead_tuples = msg->m_dead_tuples;
+	tabentry->n_index_scans = msg->m_num_index_scans;
 
 	if (msg->m_autovacuum)
 	{
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 8d9e7c1..2956356 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -194,6 +194,20 @@ pg_stat_get_mod_since_analyze(PG_FUNCTION_ARGS)
 	PG_RETURN_INT64(result);
 }
 
+Datum
+pg_stat_get_last_vacuum_index_scans(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int32		result;
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = (int32) (tabentry->n_index_scans);
+
+	PG_RETURN_INT32(result);
+}
 
 Datum
 pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0330c04..f3b606b 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2887,6 +2887,8 @@ DATA(insert OID = 3317 ( pg_stat_get_wal_receiver	PGNSP PGUID 12 1 0 0 0 f f f
 DESCR("statistics: information about WAL receiver");
 DATA(insert OID = 6118 ( pg_stat_get_subscription	PGNSP PGUID 12 1 0 0 0 f f f f f f s r 1 0 2249 "26" "{26,26,26,23,3220,1184,1184,3220,1184}" "{i,o,o,o,o,o,o,o,o}" "{subid,subid,relid,pid,received_lsn,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time}" _null_ _null_ pg_stat_get_subscription _null_ _null_ _null_ ));
 DESCR("statistics: information about subscription");
+DATA(insert OID = 3281 ( pg_stat_get_last_vacuum_index_scans	PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 23 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_last_vacuum_index_scans _null_ _null_ _null_ ));
+DESCR("statistics: number of index scans in the last vacuum");
 DATA(insert OID = 2026 ( pg_backend_pid		PGNSP PGUID 12 1 0 0 0 f f f f t f s r 0 0 23 "" _null_ _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
 DESCR("statistics: current backend PID");
 DATA(insert OID = 1937 ( pg_stat_get_backend_pid	PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 23 "23" _null_ _null_ _null_ _null_ _null_ pg_stat_get_backend_pid _null_ _null_ _null_ ));
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 089b7c3..3ab5f4a 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -369,6 +369,7 @@ typedef struct PgStat_MsgVacuum
 	TimestampTz m_vacuumtime;
 	PgStat_Counter m_live_tuples;
 	PgStat_Counter m_dead_tuples;
+	PgStat_Counter m_num_index_scans;
 } PgStat_MsgVacuum;
 
@@ -629,6 +630,7 @@ typedef struct PgStat_StatTabEntry
 	PgStat_Counter n_live_tuples;
 	PgStat_Counter n_dead_tuples;
 	PgStat_Counter changes_since_analyze;
+	PgStat_Counter n_index_scans;
 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;
 
@@ -1165,7 +1167,8 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
 extern void pgstat_report_autovac(Oid dboid);
 extern void pgstat_report_vacuum(Oid tableoid, bool shared,
-					 PgStat_Counter livetuples, PgStat_Counter deadtuples);
+					 PgStat_Counter livetuples, PgStat_Counter deadtuples,
+					 PgStat_Counter num_index_scans);
 extern void pgstat_report_analyze(Relation rel,
 					  PgStat_Counter livetuples, PgStat_Counter deadtuples,
 					  bool resetcounter);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f1c1b44..d0bb46c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1763,6 +1763,7 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     pg_stat_get_last_autovacuum_time(c.oid) AS last_autovacuum,
     pg_stat_get_last_analyze_time(c.oid) AS last_analyze,
     pg_stat_get_last_autoanalyze_time(c.oid) AS last_autoanalyze,
+    pg_stat_get_last_vacuum_index_scans(c.oid) AS last_vacuum_index_scans,
     pg_stat_get_vacuum_count(c.oid) AS vacuum_count,
     pg_stat_get_autovacuum_count(c.oid) AS autovacuum_count,
     pg_stat_get_analyze_count(c.oid) AS analyze_count,
@@ -1910,6 +1911,7 @@ pg_stat_sys_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.last_autovacuum,
    pg_stat_all_tables.last_analyze,
     pg_stat_all_tables.last_autoanalyze,
+    pg_stat_all_tables.last_vacuum_index_scans,
     pg_stat_all_tables.vacuum_count,
     pg_stat_all_tables.autovacuum_count,
     pg_stat_all_tables.analyze_count,
@@ -1953,6 +1955,7 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.last_autovacuum,
     pg_stat_all_tables.last_analyze,
     pg_stat_all_tables.last_autoanalyze,
+    pg_stat_all_tables.last_vacuum_index_scans,
     pg_stat_all_tables.vacuum_count,
     pg_stat_all_tables.autovacuum_count,
     pg_stat_all_tables.analyze_count,
-- 
2.9.2
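
To illustrate the 5% window used in the vacuum_set_xid_limits changes above:
with the default autovacuum_freeze_max_age of 200 million, the effective
freeze_table_age is capped at 0.95 * 200000000 = 190000000, so the last
10 million XIDs before the anti-wraparound threshold form the
"close to freeze-limit xid" window. A rough, illustration-only check of how
close tables are to that cap, using only existing catalogs and ignoring
vacuum_freeze_table_age and per-table reloptions, could look like:

  SELECT c.relname,
         age(c.relfrozenxid) AS xid_age,
         0.95 * current_setting('autovacuum_freeze_max_age')::bigint
             AS aggressive_cap,
         age(c.relfrozenxid) >
             0.95 * current_setting('autovacuum_freeze_max_age')::bigint
             AS beyond_cap
    FROM pg_class c
   WHERE c.relkind = 'r'
   ORDER BY xid_age DESC
   LIMIT 10;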
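
As a usage sketch only (not part of the patches), the vacuum_required column
added to pg_stat_all_tables could be consumed like this; the value strings
("partial", "aggressive", "required", "not required",
"close to freeze-limit xid" and the lock-failure variant) are taken from the
patch as posted and may change in a later revision:

  SELECT relname, vacuum_required, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
   WHERE vacuum_required IN ('required', 'aggressive',
                             'close to freeze-limit xid')
   ORDER BY relname;

Note that "aggressive" versus "partial" only says which kind of scan the next
(auto)vacuum would use; it does not by itself mean the table is in trouble.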
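
Similarly, a sketch of how last_vacuum_index_scans from the first patch could
be used; the column name follows the patch as posted. A value above 1 means
the dead-tuple array filled up at least once during the last vacuum, which is
the signal that maintenance_work_mem (or autovacuum_work_mem for autovacuum
workers) may be too small for that table:

  SELECT relname, last_vacuum_index_scans, n_dead_tup
    FROM pg_stat_user_tables
   WHERE last_vacuum_index_scans > 1
   ORDER BY last_vacuum_index_scans DESC;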