Re: [GENERAL] Vacuum and state_change - Mailing list pgsql-general

From:           armand pirvu
Subject:        Re: [GENERAL] Vacuum and state_change
Date:
Msg-id:         DAA7660F-3EB1-4E65-9710-08AB933AE1B5@gmail.com
In response to: Re: [GENERAL] Vacuum and state_change (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses:      Re: [GENERAL] Vacuum and state_change
List:           pgsql-general
On Jun 9, 2017, at 11:23 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 06/09/2017 09:13 AM, armand pirvu wrote:
On Jun 9, 2017, at 11:01 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
No and that was what got me worried
On 06/09/2017 08:45 AM, armand pirvu wrote:
Hi
Had a couple of processes blocking the vacuum, so I terminated them using:
select pg_terminate_backend(pid);
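To find which pids were the blockers in the first place, a simplified pg_locks self-join is one way to go (just a sketch that matches relation-level locks only, not the full lock-monitoring query from the wiki):

-- waiting sessions and the sessions holding a lock on the same relation
SELECT waiting.pid                AS waiting_pid,
       waiting.mode               AS waiting_mode,
       blocking.pid               AS blocking_pid,
       blocking.mode              AS blocking_mode,
       waiting.relation::regclass AS relation
FROM pg_locks waiting
JOIN pg_locks blocking
  ON blocking.relation = waiting.relation
 AND blocking.pid <> waiting.pid
WHERE NOT waiting.granted
  AND blocking.granted;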
Running the following
select distinct pid, backend_start, query_start, state_change, state, query from pg_catalog.pg_stat_activity order by 1;
pid | backend_start | query_start | state_change | state | query
-------+-------------------------------+-------------------------------+-------------------------------+--------+--------------------------------------------------------------------------------------------------------------------------
10677 | 2017-06-09 10:25:49.189848-05 | 2017-06-09 10:33:43.598805-05 | 2017-06-09 10:33:43.599091-05 | idle | SELECT 1
11096 | 2017-06-09 10:27:03.686588-05 | 2017-06-09 10:33:56.28736-05 | 2017-06-09 10:33:56.287364-05 | active | select distinct pid, backend_start, query_start, state_change, state, query from pg_catalog.pg_stat_activity order by 1;
13277 | 2017-06-09 07:48:49.506686-05 | 2017-06-09 07:48:52.887185-05 | 2017-06-09 07:48:52.887188-05 | active | autovacuum: VACUUM csischema.tf_purchased_badge
13484 | 2017-06-09 10:31:54.127672-05 | 2017-06-09 10:33:47.137938-05 | 2017-06-09 10:33:47.138226-05 | idle | SELECT 1
16886 | 2017-06-09 07:56:49.033893-05 | 2017-06-09 07:56:49.078369-05 | 2017-06-09 07:56:49.078371-05 | active | autovacuum: VACUUM csischema.tf_purchases_person
25387 | 2017-06-09 05:32:08.079397-05 | 2017-06-09 05:32:08.385728-05 | 2017-06-09 05:32:08.385731-05 | active | autovacuum: VACUUM csischema.tf_demographic_response_person
37465 | 2017-06-09 08:50:58.992002-05 | 2017-06-09 08:51:21.506829-05 | 2017-06-09 08:51:21.506831-05 | active | autovacuum: VACUUM csischema.tf_transaction_item_person
I did notice that state_change did not change one bit
Did the state change?
Are these large tables?
I would say yes
select count(*) from csischema.tf_purchased_badge;
9380749
select count(*) from csischema.tf_purchases_person;
19902172
select count(*) from csischema.tf_demographic_response_person;
80868561
select count(*) from csischema.tf_transaction_item_person;
3281084
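Row counts aside, the on-disk footprint is probably the better measure of "large"; the standard size functions give it along these lines:

-- total size (heap + indexes + toast) for the tables above
SELECT relid::regclass AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_all_tables
WHERE schemaname = 'csischema'
  AND relname IN ('tf_purchased_badge', 'tf_purchases_person',
                  'tf_demographic_response_person', 'tf_transaction_item_person');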
Interestingly enough, two completed:
relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum | last_autovacuum | autovacuum_count
----------------------------+----------+--------------+----------+---------------+-----------+-----------+------------+------------+---------------------+-------------+-------------------------------+------------------
tf_transaction_item_person | 160 | 0 | 476810 | 1946119 | 2526 | 473678 | 3226110 | 0 | 116097 | | 2017-06-09 11:15:24.701997-05 | 2
tf_purchased_badge | 358 | 1551142438 | 2108331 | 7020502 | 5498 | 1243746 | 9747336 | 107560 | 115888 | | 2017-06-09 15:09:16.624363-05 | 1
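For reference, that output comes from pg_stat_all_tables; a query along these lines reproduces it:

SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch,
       n_tup_ins, n_tup_upd, n_live_tup, n_dead_tup,
       n_mod_since_analyze, last_vacuum, last_autovacuum, autovacuum_count
FROM pg_stat_all_tables
WHERE schemaname = 'csischema'
  AND relname IN ('tf_transaction_item_person', 'tf_purchased_badge');

From the server log: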
2017-06-09 14:18:38.552 CDT,,,888,,593a1810.378,271,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:21:12.210 CDT,,,888,,593a1810.378,272,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint complete: wrote 12070 buffers (9.2%); 0 transaction log file(s) added, 0 removed, 4 recycled; write=148.714 s, sync=4.834 s, total=153.657 s; sync files=17, longest=1.841 s, average=0.284 s; distance=89452 kB, estimate=89452 kB",,,,,,,,,""
2017-06-09 14:23:38.278 CDT,,,888,,593a1810.378,273,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:24:38.629 CDT,,,888,,593a1810.378,274,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint complete: wrote 593 buffers (0.5%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=59.825 s, sync=0.474 s, total=60.350 s; sync files=8, longest=0.355 s, average=0.059 s; distance=26952 kB, estimate=83202 kB",,,,,,,,,""
And also
SELECT
total_checkpoints,
seconds_since_start / total_checkpoints / 60 AS minutes_between_checkpoints
FROM
(SELECT
EXTRACT(EPOCH FROM (now() - pg_postmaster_start_time())) AS seconds_since_start,
(checkpoints_timed+checkpoints_req) AS total_checkpoints
FROM pg_stat_bgwriter
) AS sub;
total_checkpoints | minutes_between_checkpoints
-------------------+-----------------------------
285 | 3.33731205871345
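To put that average next to the configured targets, the relevant settings can be pulled with something like:

-- compare the observed checkpoint spacing with the configured values
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('checkpoint_timeout', 'checkpoint_completion_target');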
These tables see quite a lot of data changes IIRC, but those changes come in via some temp tables which reside in a temp schema. Some previous messages from the log suggest it might have run into ladder locking in the early stages, i.e. a temp table locked by vacuum, any further processing waiting on it, and that in turn causing some other waits on those largish tables.
Considering the temp tables are only used for loads (though, yes, some processing does go on in there), I am thinking of disabling autovacuum just for those temp tables. Or should I disable autovacuum altogether and run vacuum as a batch job, say on a weekend night?
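For the per-table route, something along these lines is what I have in mind (the table name is just a placeholder for one of the staging tables):

-- disable autovacuum for a single staging table instead of cluster-wide
ALTER TABLE tmp_schema.tf_purchases_person_load
    SET (autovacuum_enabled = false, toast.autovacuum_enabled = false);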
Aside from the vacuum improvements and such, is there any other strong, compelling reason to upgrade to 9.6?
Does that mean that something is not quite right with the vacuums?
Might want to take a look at:
https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW

Thank you
Armand
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Adrian Klaver
adrian.klaver@aklaver.com