Thread: thoughts on "prevent wraparound" vacuum
Hello.
Currently I am working a lot with a cluster that consists of a few big tables, about 2-3 TB. These tables are heavily updated; some rows are removed, new rows are inserted... a fairly typical OLTP workload.
The physical table size stays mostly stable while regular VACUUM is running; it is fast enough to reclaim the space left by removed rows.
But from time to time a "to prevent wraparound" vacuum kicks in, and it runs for something like 8-9 days. During that time the relation size starts to grow quickly: freezing all blocks of such a table takes a long time, and in the meantime bloat accumulates much more quickly.
Of course, after the aggressive vacuum finishes, the table does not shrink back (some kind of repacking is required). And even after repacking, the relation truncation makes the whole cluster stall for some time (due to exclusive locking, see (1)).
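For reference, the anti-wraparound vacuum is forced once age(relfrozenxid) of a table exceeds autovacuum_freeze_max_age. A minimal query sketch (names and the limit are purely illustrative) to watch how close the biggest tables are to that threshold:

    -- Sketch only: which tables are closest to forcing an anti-wraparound vacuum.
    SELECT c.oid::regclass                                      AS table_name,
           age(c.relfrozenxid)                                  AS xid_age,
           current_setting('autovacuum_freeze_max_age')::bigint AS freeze_max_age
    FROM pg_class c
    WHERE c.relkind = 'r'
    ORDER BY age(c.relfrozenxid) DESC
    LIMIT 10;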
So, I was thinking about it and I see two possible solutions:
1. Track two block pointers during aggressive vacuum: one to freeze all blocks, and another to perform a regular vacuum pass over the non-all-visible blocks. The second pointer is circular (it could process the table multiple times while the first one moves from the start to the end of the table). Some parameter to split resources between the two pointers would also be needed.
2. Separate the "to prevent wraparound" vacuum from the regular vacuum to allow them to run concurrently. But that seems to require much more work.
Could you please share some thoughts on this? Is it worth implementing?
Thanks.
Hi,

On 2019-07-20 15:35:57 +0300, Michail Nikolaev wrote:
> Currently I am working a lot with a cluster that consists of a few big
> tables, about 2-3 TB. These tables are heavily updated; some rows are
> removed, new rows are inserted... a fairly typical OLTP workload.
>
> The physical table size stays mostly stable while regular VACUUM is
> running; it is fast enough to reclaim the space left by removed rows.
>
> But from time to time a "to prevent wraparound" vacuum kicks in, and it
> runs for something like 8-9 days. During that time the relation size
> starts to grow quickly: freezing all blocks of such a table takes a long
> time, and in the meantime bloat accumulates much more quickly.

Several questions:

- Which version of postgres is this? Newer versions avoid scanning
  unchanged parts of the heap even for freezing (9.6+, with additional
  smaller improvements in 11).

- have you increased the vacuum cost limits? Before PG 12 they're so low
  they're entirely unsuitable for larger databases, and even in 12 you
  should likely increase them for a multi-TB database

Unfortunately even if those are fixed the indexes are still likely going
to be scanned in their entirety - but most of the time not modified much,
so that's not as bad.

Greetings,

Andres Freund
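A minimal sketch of how the cost limits Andres mentions could be raised online (the values are illustrative assumptions, not a recommendation from this thread):

    -- Sketch only: illustrative values, tune for the actual hardware and workload.
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
    ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
    SELECT pg_reload_conf();  -- these autovacuum settings take effect on reload, no restart needed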
Hello.
>- Which version of postgres is this? Newer versions avoid scanning
> unchanged parts of the heap even for freezing (9.6+, with additional
> smaller improvements in 11).
Oh, totally forgot about version and settings...
server_version 10.9 (Ubuntu 10.9-103)
So, "don't vacuum all-frozen pages" included.
> - have you increased the vacuum cost limits? Before PG 12 they're so low
> they're entirely unsuitable for larger databases, and even in 12 you
> should likely increase them for a multi-TB database
Current settings are:
autovacuum_max_workers 8
autovacuum_vacuum_cost_delay 5ms
autovacuum_vacuum_cost_limit 400
autovacuum_work_mem -1
vacuum_cost_page_dirty 40
vacuum_cost_page_hit 1
vacuum_cost_page_miss 10
"autovacuum_max_workers" set to 8 because server needs to process a lot of changing relations.
Settings were more aggressive previously (autovacuum_vacuum_cost_limit was 2800) but it leads to very high IO load causing issues with application performance and stability (even on SSD).
"vacuum_cost_page_dirty" was set to 40 few month ago. High IO write peaks were causing application requests to stuck into WALWriteLock.
After some investigations we found it was caused by WAL-logging peaks.
Such WAL-peaks are mostly consist of such records:
Type                    N      (%)    Record size      (%)     FPI size      (%)    Combined size      (%)
----                    -      ---    -----------      ---     --------      ---    -------------      ---
Heap2/CLEAN         10520   ( 0.86)        623660   ( 0.21)     5317532   ( 0.53)        5941192   ( 0.46)
Heap2/FREEZE_PAGE  113419   ( 9.29)       6673877   ( 2.26)   635354048   (63.12)      642027925   (49.31)

Another example:

Type                    N      (%)    Record size      (%)     FPI size      (%)    Combined size      (%)
----                    -      ---    -----------      ---     --------      ---    -------------      ---
Heap2/CLEAN        196707   ( 6.96)      12116527   ( 1.56)   292317231   (37.77)      304433758   (19.64)
Heap2/FREEZE_PAGE    1819   ( 0.06)        104012   ( 0.01)    13324269   ( 1.72)       13428281   ( 0.87)
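For completeness, one way such WALWriteLock waits can be observed (a sketch; on 10 the LWLock wait event is reported as WALWriteLock, renamed to WALWrite in later releases):

    -- Sketch only: backends currently waiting on the WAL write lock.
    SELECT pid, wait_event_type, wait_event, state, query
    FROM pg_stat_activity
    WHERE wait_event_type = 'LWLock'
      AND wait_event = 'WALWriteLock';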
Thanks,
Michail.