Thread: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.

BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.

From

PG Bug reporting form

Date:

07 June 2019, 18:22:20

The following bug has been logged on the website:

Bug reference:      15840
Logged by:          Thierry Husson
Email address:      thusson@informiciel.com
PostgreSQL version: 12beta1
Operating system:   Ubuntu 18.04.2 LTS
Description:

I was doing tables COPY between my old server with PG10.8 and the new one
with 12Beta1. After each table is done, I make a vacuum of it.
However PG12 has stopped working for wraparound protection. I was doing it
on around 10 cpu, 1 table by cpu.
The is the end of the log of the copy program in Python3 with Psycopg2:
...
2019-06-06 23:15:26 prog_sync Vacuum
usr_ops.prg_hrdps_n1500n_voisin_ade_metar... 4s.
2019-06-06 23:15:30 prog_sync Vacuum
usr_ops.flt_hrdps_n1500n_voisin_ade_metar... 0s.
2019-06-06 23:15:30 prog_sync CPU 0 - Sync done 8857sec.
2019-06-06 23:15:30 prog_sync Tables Skipped:0,  Already sync:0,  Copied
from pravda:1.  Copied from zhen:0.
2019-06-06 23:15:30 prog_sync Sync done for 1 tables of 106451311 records in
8858s. (12018 rec./sec.)

Traceback (most recent call last):
  File "/home/semt700/emet/script/prog_sync.py", line 316, in syncTable
    ioResult = e.flushCopyBuffer(ioResult, curPG[slave][procId],
progTable[slave], columns)
  File "/fs/home/fs1/eccc/cmd/cmdn/semt700/emet/script/emetlib.py", line
607, in flushCopyBuffer
    cursorObj.copy_from(ioBuffer, tableName, sep='\t', columns=columnName,
null='NULL')
psycopg2.OperationalError: database is not accepting commands to avoid
wraparound data loss in database "emet_zhen"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or
drop stale replication slots.
CONTEXT:  SQL statement "INSERT INTO
usr_ops.prg_gdps_g1610n_voisin_ade_synop_swob_metar_201903 SELECT $1.*"
PL/pgSQL function prog_insert() line 17 at EXECUTE
COPY prg_gdps_g1610n_voisin_ade_synop_swob_metar, line 132822: "284532738
   2019-03-20 00:00:00     2019-03-29 12:00:00     11011   37980000
-101750000      75597472        NULL    -5.4617 1       ..."

I did a DB shutdown and started a vacuum with:
postgres --single emet_zhen
VACUUM FREEZE VERBOSE;

It worked a few hours and when I was thinking it was done as nothing was
loggin anymore, I made a ctrl-\ and restarted the DB.
I was still getting wraparound protection messages so I shutdown the DB
again & redo the vacuum command but it didn't work anymore:
zhen:semt700 $ postgres --single emet_zhen
2019-06-07 17:23:36 UTC  7251 WARNING:  database with OID 16394 must be
vacuumed within 999995 transactions
2019-06-07 17:23:36 UTC  7251 HINT:  To avoid a database shutdown, execute a
database-wide VACUUM in that database.
        You might also need to commit or roll back old prepared
transactions, or drop stale replication slots.
PostgreSQL stand-alone backend 12beta1
backend> VACUUM VERBOSE;
2019-06-07 17:23:59 UTC  7251 WARNING:  database "emet_zhen" must be
vacuumed within 999995 transactions
2019-06-07 17:23:59 UTC  7251 HINT:  To avoid a database shutdown, execute a
database-wide VACUUM in that database.
        You might also need to commit or roll back old prepared
transactions, or drop stale replication slots.
2019-06-07 17:23:59 UTC  7251 LOG:  duration: 2417.639 ms  statement: VACUUM
VERBOSE;

I tried with various options but none worked. It also tried to restard the
DB and use vacuumdb --all -v , or various options, but always get the same
message for each table:

INFO:  aggressively vacuuming "pg_catalog.pg_publication"
INFO:  index "pg_publication_oid_index" now contains 0 row versions in 1
pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "pg_publication_pubname_index" now contains 0 row versions in 1
pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "pg_publication": found 0 removable, 0 nonremovable row versions in 0
out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin:
2146520116
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
WARNING:  database "emet_zhen" must be vacuumed within 999995 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that
database.
You might also need to commit or roll back old prepared transactions, or
drop stale replication slots.

I out of clues of what to try next. I already got this situation with PG 9.x
& PG10.x but system wide in exclusive mode usualy worked.

Seems like a PG12 bug that will certainly prevent us from upgrading even if
the new fonctionnalities look really great.

Thanks a lot!

Thierry

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Andres Freund

Date:

07 June 2019, 19:02:31

Hi,

On 2019-06-07 18:22:20 +0000, PG Bug reporting form wrote:
> I was doing tables COPY between my old server with PG10.8 and the new one
> with 12Beta1. After each table is done, I make a vacuum of it.
> However PG12 has stopped working for wraparound protection. I was doing it
> on around 10 cpu, 1 table by cpu.

That was a new postgres 12 cluster, not a pg_upgraded one? And you just
did a bunch of COPYs? How many?

I'm not clear as to how the cluster got to wraparound if that's the
scenario. We use one xid per transaction, and copy doesn't use multiple
transactions internally.  Any chance you have triggers on these tables
that use savepoints internally?


> postgres --single emet_zhen
> VACUUM FREEZE VERBOSE;

Don't FREEZE in wraparound cases, that just makes it take longer.


> It worked a few hours and when I was thinking it was done as nothing was
> loggin anymore, I made a ctrl-\ and restarted the DB.
> I was still getting wraparound protection messages so I shutdown the DB
> again & redo the vacuum command but it didn't work anymore:

> zhen:semt700 $ postgres --single emet_zhen
> 2019-06-07 17:23:36 UTC  7251 WARNING:  database with OID 16394 must be
> vacuumed within 999995 transactions
> 2019-06-07 17:23:36 UTC  7251 HINT:  To avoid a database shutdown, execute a
> database-wide VACUUM in that database.
>         You might also need to commit or roll back old prepared
> transactions, or drop stale replication slots.
> PostgreSQL stand-alone backend 12beta1
> backend> VACUUM VERBOSE;
> 2019-06-07 17:23:59 UTC  7251 WARNING:  database "emet_zhen" must be
> vacuumed within 999995 transactions
> 2019-06-07 17:23:59 UTC  7251 HINT:  To avoid a database shutdown, execute a
> database-wide VACUUM in that database.
>         You might also need to commit or roll back old prepared
> transactions, or drop stale replication slots.
> 2019-06-07 17:23:59 UTC  7251 LOG:  duration: 2417.639 ms  statement: VACUUM
> VERBOSE;

What do you mean by "didn't work anymore"? As far as I can tell the
VACUUM here succeeded?


> HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that
> database.
> You might also need to commit or roll back old prepared transactions, or
> drop stale replication slots.

Did you check whether any of these are the case?

SELECT * FROM pg_replication_slots;
SELECT * FROM pg_prepared_xacts;

Could you also show

SELECT oid, datname, datfrozenxid, age(datfrozenxid), datminmxid, mxid_age(datminmxid) FROM pg_database ORDER BY
age(datfrozenxid)DESC;
 
SELECT * FROM pg_control_checkpoint();

Greetings,

Andres Freund

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Thierry Husson

Date:

07 June 2019, 19:59:11

Hi Andres,

Thank you for your anwser. Precisions bellow:

Andres Freund <andres@anarazel.de> a écrit :

> Hi,
>
> On 2019-06-07 18:22:20 +0000, PG Bug reporting form wrote:
>> I was doing tables COPY between my old server with PG10.8 and the new one
>> with 12Beta1. After each table is done, I make a vacuum of it.
>> However PG12 has stopped working for wraparound protection. I was doing it
>> on around 10 cpu, 1 table by cpu.
>
> That was a new postgres 12 cluster, not a pg_upgraded one? And you just
> did a bunch of COPYs? How many?
>
> I'm not clear as to how the cluster got to wraparound if that's the
> scenario. We use one xid per transaction, and copy doesn't use multiple
> transactions internally.  Any chance you have triggers on these tables
> that use savepoints internally?

Yes it was a new cluster. Around 30 copy were done.
Yes there is a trigger to manage partitions. Around 1200 tables were  
created. 10 billions records transfered, I need to tranfert 180BR over  
1700 tables.
I just realize I made vacuum on partitions for the first 8BR rows and  
forgot for the last 2BR That would explain the wraparound protection.

>
>
>> postgres --single emet_zhen
>> VACUUM FREEZE VERBOSE;
>
> Don't FREEZE in wraparound cases, that just makes it take longer.
>
>
>> It worked a few hours and when I was thinking it was done as nothing was
>> loggin anymore, I made a ctrl-\ and restarted the DB.
>> I was still getting wraparound protection messages so I shutdown the DB
>> again & redo the vacuum command but it didn't work anymore:
>
>> zhen:semt700 $ postgres --single emet_zhen
>> 2019-06-07 17:23:36 UTC  7251 WARNING:  database with OID 16394 must be
>> vacuumed within 999995 transactions
>> 2019-06-07 17:23:36 UTC  7251 HINT:  To avoid a database shutdown, execute a
>> database-wide VACUUM in that database.
>>         You might also need to commit or roll back old prepared
>> transactions, or drop stale replication slots.
>> PostgreSQL stand-alone backend 12beta1
>> backend> VACUUM VERBOSE;
>> 2019-06-07 17:23:59 UTC  7251 WARNING:  database "emet_zhen" must be
>> vacuumed within 999995 transactions
>> 2019-06-07 17:23:59 UTC  7251 HINT:  To avoid a database shutdown, execute a
>> database-wide VACUUM in that database.
>>         You might also need to commit or roll back old prepared
>> transactions, or drop stale replication slots.
>> 2019-06-07 17:23:59 UTC  7251 LOG:  duration: 2417.639 ms  statement: VACUUM
>> VERBOSE;
>
> What do you mean by "didn't work anymore"? As far as I can tell the
> VACUUM here succeeded?
>
>
>> HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that
>> database.
>> You might also need to commit or roll back old prepared transactions, or
>> drop stale replication slots.
>
> Did you check whether any of these are the case?
>
> SELECT * FROM pg_replication_slots;
> SELECT * FROM pg_prepared_xacts;
These are empty.

emet_zhen=# SELECT max(age(pg_database.datfrozenxid)) / 2147483648.0 *  
100.0 AS "Percentage of transaction ID's used" FROM pg_database;
  Percentage of transaction ID's used
-------------------------------------
             99.953434057533740997000

>
> Could you also show
>
> SELECT oid, datname, datfrozenxid, age(datfrozenxid), datminmxid,  
> mxid_age(datminmxid) FROM pg_database ORDER BY age(datfrozenxid) DESC;
   oid  |  datname  | datfrozenxid |    age     | datminmxid | mxid_age
-------+-----------+--------------+------------+------------+----------
  16394 | emet_zhen |        36464 | 2146483652 |          1 |        0
  12672 | template0 |    504982897 | 1641537219 |          1 |        0
  12673 | postgres  |   2096520116 |   50000000 |          1 |        0
      1 | template1 |   2096520116 |   50000000 |          1 |        0

> SELECT * FROM pg_control_checkpoint();
  checkpoint_lsn |   redo_lsn   |      redo_wal_file       |  
timeline_id | prev_timeline_id | full_page_writes |   next_xid   |  
next_oid | next_multixact_id | next_multi_offset | oldest_xid |  
oldest_xid_dbid | oldest_active_xid | oldest_multi_xid |  
oldest_multi_dbid | oldest_commit_ts_xid | newest_commit_ts_xid |     
checkpoint_time
  32D/54074EC0   | 32D/54074E88 | 000000010000032D00000054 |            
1 |                1 | t                | 0:2146520116 |   475782 |     
              1 |                 0 |      36464 |           16394 |    
      2146520116 |                1 |             16394 |               
       0 |                    0 | 2019-06-07 18:11:39+00
(1 row)

Could it be that PG12 considers "vacuum" as a transaction and trigger  
wraparound protection against it?


>
> Greetings,
>
> Andres Freund

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Andres Freund

Date:

07 June 2019, 20:10:02

Hi,

On 2019-06-07 14:59:11 -0500, Thierry Husson wrote:
> Thank you for your anwser. Precisions bellow:
> Andres Freund <andres@anarazel.de> a écrit :
> > On 2019-06-07 18:22:20 +0000, PG Bug reporting form wrote:
> > > I was doing tables COPY between my old server with PG10.8 and the new one
> > > with 12Beta1. After each table is done, I make a vacuum of it.
> > > However PG12 has stopped working for wraparound protection. I was doing it
> > > on around 10 cpu, 1 table by cpu.
> > 
> > That was a new postgres 12 cluster, not a pg_upgraded one? And you just
> > did a bunch of COPYs? How many?
> > 
> > I'm not clear as to how the cluster got to wraparound if that's the
> > scenario. We use one xid per transaction, and copy doesn't use multiple
> > transactions internally.  Any chance you have triggers on these tables
> > that use savepoints internally?
> 
> Yes it was a new cluster. Around 30 copy were done.
> Yes there is a trigger to manage partitions. Around 1200 tables were
> created. 10 billions records transfered, I need to tranfert 180BR over 1700
> tables.
> I just realize I made vacuum on partitions for the first 8BR rows and forgot
> for the last 2BR That would explain the wraparound protection.

Do those triggers use savepoints / EXCEPTION handling?

Might be worthwhile to check - independent of this issue - if you still
need the partition handling via trigger, now that pg's builtin
partitioning can handle COPY (and likely *much* faster).


> > Could you also show
> > 
> > SELECT oid, datname, datfrozenxid, age(datfrozenxid), datminmxid,
> > mxid_age(datminmxid) FROM pg_database ORDER BY age(datfrozenxid) DESC;
>   oid  |  datname  | datfrozenxid |    age     | datminmxid | mxid_age
> -------+-----------+--------------+------------+------------+----------
>  16394 | emet_zhen |        36464 | 2146483652 |          1 |        0

Ok, so it's xids, and clearly not multixids.  Could you connect to
emet_zhen and show the output of:

SELECT oid, oid::regclass, relkind, relfrozenxid, age(relfrozenxid) FROM pg_class WHERE relfrozenxid <> 0 AND
age(relfrozenxid)> 1800000000 ORDER BY age(relfrozenxid) DESC;
 

that will tell us which relations need to be vacuumed, and then we can
see why that doesn't work.


> Could it be that PG12 considers "vacuum" as a transaction and trigger
> wraparound protection against it?

I'm still somewhat confused - the output you showed didn't include
vacuum failing, as far as I can tell?

- Andres

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Thierry Husson

Date:

07 June 2019, 21:40:27

Thanks again Andres,

Andres Freund <andres@anarazel.de> a écrit :

> Hi,
>
> On 2019-06-07 14:59:11 -0500, Thierry Husson wrote:
>> Thank you for your anwser. Precisions bellow:
>> Andres Freund <andres@anarazel.de> a écrit :
>> > On 2019-06-07 18:22:20 +0000, PG Bug reporting form wrote:
>> > > I was doing tables COPY between my old server with PG10.8 and  
>> the new one
>> > > with 12Beta1. After each table is done, I make a vacuum of it.
>> > > However PG12 has stopped working for wraparound protection. I  
>> was doing it
>> > > on around 10 cpu, 1 table by cpu.
>> >
>> > That was a new postgres 12 cluster, not a pg_upgraded one? And you just
>> > did a bunch of COPYs? How many?
>> >
>> > I'm not clear as to how the cluster got to wraparound if that's the
>> > scenario. We use one xid per transaction, and copy doesn't use multiple
>> > transactions internally.  Any chance you have triggers on these tables
>> > that use savepoints internally?
>>
>> Yes it was a new cluster. Around 30 copy were done.
>> Yes there is a trigger to manage partitions. Around 1200 tables were
>> created. 10 billions records transfered, I need to tranfert 180BR over 1700
>> tables.
>> I just realize I made vacuum on partitions for the first 8BR rows and forgot
>> for the last 2BR That would explain the wraparound protection.
>
> Do those triggers use savepoints / EXCEPTION handling?
>
> Might be worthwhile to check - independent of this issue - if you still
> need the partition handling via trigger, now that pg's builtin
> partitioning can handle COPY (and likely *much* faster).

Yes, those triggers use exception handling (if partition doesn't  
exist, create it) but no savepoint.
Thanks for the suggestion, I take that in note!

>> > Could you also show
>> >
>> > SELECT oid, datname, datfrozenxid, age(datfrozenxid), datminmxid,
>> > mxid_age(datminmxid) FROM pg_database ORDER BY age(datfrozenxid) DESC;
>>   oid  |  datname  | datfrozenxid |    age     | datminmxid | mxid_age
>> -------+-----------+--------------+------------+------------+----------
>>  16394 | emet_zhen |        36464 | 2146483652 |          1 |        0
>
> Ok, so it's xids, and clearly not multixids.  Could you connect to
> emet_zhen and show the output of:
>
> SELECT oid, oid::regclass, relkind, relfrozenxid, age(relfrozenxid)  
> FROM pg_class WHERE relfrozenxid <> 0 AND age(relfrozenxid) >  
> 1800000000 ORDER BY age(relfrozenxid) DESC;
> that will tell us which relations need to be vacuumed, and then we can
> see why that doesn't work.
>> Could it be that PG12 considers "vacuum" as a transaction and trigger
>> wraparound protection against it?
>
> I'm still somewhat confused - the output you showed didn't include
> vacuum failing, as far as I can tell?
>
> - Andres

   oid   |                 oid                  | relkind |  
relfrozenxid |    age
--------+--------------------------------------+---------+--------------+------------
  460564 | pg_temp_3.cur_semt700_progsync_4996  | r       |         
36464 | 2146483652
  460764 | pg_temp_8.cur_semt700_progsync_5568  | r       |      
19836544 | 2126683572
  460718 | pg_temp_4.cur_semt700_progsync_5564  | r       |      
19836544 | 2126683572
  460721 | pg_temp_5.cur_semt700_progsync_5565  | r       |      
19836544 | 2126683572
  461068 | pg_temp_22.cur_semt700_progsync_5581 | r       |      
19836544 | 2126683572

These are temporary tables to manage concurrency & server load. It  
seems the sudden disconnection due to wraparound protection didn't get  
them removed. I removed them manually under single mode and there is  
no more warning now, vacuum command included. Your command is very  
interesting to know.

It annoying PG create a xId for empty temporary tables. You can't  
clear it with a vacuum as there is no record. I have to terminate  
connexions of my deamon processes daily to avoid wraparound  
protection. Is there a way to tell PG to forget these tables on its  
age estimation?

Thank you so much Andres! You saved me!

Thierry

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Andres Freund

Date:

07 June 2019, 21:47:47

Hi,

On 2019-06-07 16:40:27 -0500, Thierry Husson wrote:
>   oid   |                 oid                  | relkind | relfrozenxid |
> age
> --------+--------------------------------------+---------+--------------+------------
>  460564 | pg_temp_3.cur_semt700_progsync_4996  | r       |        36464 |
> 2146483652
>  460764 | pg_temp_8.cur_semt700_progsync_5568  | r       |     19836544 |
> 2126683572
>  460718 | pg_temp_4.cur_semt700_progsync_5564  | r       |     19836544 |
> 2126683572
>  460721 | pg_temp_5.cur_semt700_progsync_5565  | r       |     19836544 |
> 2126683572
>  461068 | pg_temp_22.cur_semt700_progsync_5581 | r       |     19836544 |
> 2126683572
> 
> These are temporary tables to manage concurrency & server load. It seems the
> sudden disconnection due to wraparound protection didn't get them removed. I
> removed them manually under single mode and there is no more warning now,
> vacuum command included. Your command is very interesting to know.

Hm. But you do have autovacuum enabled, is that right?  If enabled, have
you tuned it at all? It seems quite possible that given your load (10
parallel loads), the default settings werent aggressive enough.


> It annoying PG create a xId for empty temporary tables. You can't clear it
> with a vacuum as there is no record. I have to terminate connexions of my
> deamon processes daily to avoid wraparound protection. Is there a way to
> tell PG to forget these tables on its age estimation?

Normally postgres would drop such "orphaned" temp tables on its own, in
autovacuum (triggering it when close to a wraparound, even if
disabled). But if it can't keep up for some reason, then that's not
necessarily good enough with very rapid xid usage as you seem to have.

I'll start a thread about this subtopic on -hackers.

Greetings,

Andres Freund

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Thierry Husson

Date:

07 June 2019, 22:49:52

Hi Andres,

Andres Freund <andres@anarazel.de> a écrit :

> Hi,
>
> On 2019-06-07 16:40:27 -0500, Thierry Husson wrote:
>>   oid   |                 oid                  | relkind | relfrozenxid |
>> age
>> --------+--------------------------------------+---------+--------------+------------
>>  460564 | pg_temp_3.cur_semt700_progsync_4996  | r       |        36464 |
>> 2146483652
>>  460764 | pg_temp_8.cur_semt700_progsync_5568  | r       |     19836544 |
>> 2126683572
>>  460718 | pg_temp_4.cur_semt700_progsync_5564  | r       |     19836544 |
>> 2126683572
>>  460721 | pg_temp_5.cur_semt700_progsync_5565  | r       |     19836544 |
>> 2126683572
>>  461068 | pg_temp_22.cur_semt700_progsync_5581 | r       |     19836544 |
>> 2126683572
>>
>> These are temporary tables to manage concurrency & server load. It seems the
>> sudden disconnection due to wraparound protection didn't get them removed. I
>> removed them manually under single mode and there is no more warning now,
>> vacuum command included. Your command is very interesting to know.
>
> Hm. But you do have autovacuum enabled, is that right?  If enabled, have
> you tuned it at all? It seems quite possible that given your load (10
> parallel loads), the default settings werent aggressive enough.

Yes autovacuum is enabled. Aggressiveness was effectively a recent  
problem I had and putting its max_worker to 8 wasn't a solution, there  
were all busy 24/7 and I had to do a daily script to help it. The  
solution was to push vacuum_cost_limit to 2000, since then it works  
like a charm. Another issue was autovaccuums were taking the lock over  
my running vacuums, making them waiting for 5 days instead of taking  
around 1 hour. I could do another post on that but it's not PG12  
specific, I have it with 10.x

>> It annoying PG create a xId for empty temporary tables. You can't clear it
>> with a vacuum as there is no record. I have to terminate connexions of my
>> deamon processes daily to avoid wraparound protection. Is there a way to
>> tell PG to forget these tables on its age estimation?
>
> Normally postgres would drop such "orphaned" temp tables on its own, in
> autovacuum (triggering it when close to a wraparound, even if
> disabled). But if it can't keep up for some reason, then that's not
> necessarily good enough with very rapid xid usage as you seem to have.
>
> I'll start a thread about this subtopic on -hackers.
> Greetings,
>
> Andres Freund

What is the link to this forum? I'm very very interested to follow  
that subtopic & I could make some tests if necessary.

Have a great weekend & thanks for your time :)

Thierry

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Andres Freund

Date:

07 June 2019, 23:01:03

Hi,

On 2019-06-07 17:49:52 -0500, Thierry Husson wrote:
> Andres Freund <andres@anarazel.de> a écrit :
> > I'll start a thread about this subtopic on -hackers.

> What is the link to this forum? I'm very very interested to follow that
> subtopic & I could make some tests if necessary.

It's now (was interrupted by something else) at:
https://postgr.es/m/20190607225843.z73jqqyy6hhc6qnp%40alap3.anarazel.de

and you're CCed in the discussion.

Have a nice weekend as well!

Andres

Re: BUG #15840: Vacuum does not work after database stopped forwraparound protection. Database seems unrepearable.

From

Alvaro Herrera

Date:

08 June 2019, 00:25:42

On 2019-Jun-07, Thierry Husson wrote:

> Yes autovacuum is enabled. Aggressiveness was effectively a recent problem I
> had and putting its max_worker to 8 wasn't a solution, there were all busy
> 24/7 and I had to do a daily script to help it. The solution was to push
> vacuum_cost_limit to 2000, since then it works like a charm.

Note the I/O cost balancing thing, which seems to bite many people: if
you raise max_workers without changing cost_delay or cost_limit, it
doesn't have much of an effect, because each worker goes slower to
accomodate.  Raising the cost limit (or lowering the cost delay) does
have a useful impact.  In pg12 we changed the default cost_delay to 2ms
(from 20ms).

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services