Resumable vacuum proposal and design overview - Mailing list pgsql-hackers
From | Galy Lee |
---|---|
Subject | Resumable vacuum proposal and design overview |
Date | |
Msg-id | 45E2A6AE.1080805@oss.ntt.co.jp Whole thread Raw |
Responses |
Re: Resumable vacuum proposal and design overview
Re: Resumable vacuum proposal and design overview |
List | pgsql-hackers |
Hi We are developing a new feature for vacuum, here is a brief overview about it. Introduction ------------ A) What is it? This feature enables vacuum has resumable capability. Vacuum can remembers the point it stops, then resumes interrupted vacuum operation from the point next time. The SQL syntaxes for this feature has the format similar to: # VACUUM ALL --> It has the same functionality with current vacuum; # VACUUM --> It uses the information saved byprevious interrupted vacuum to continue vacuum processing. B) Why do we need it? For large table, although it can be vacuum by enabling vacuum cost-based delay, but the processing may last for several days. It definitely has negative affect on system performance. So if systems which has maintenance time, it is preferred to vacuum in maintenance window. Vacuum task can be split into small subtasks, and they can be scheduled into maintenance window time slot. This can reduce the impact of vacuum to system service. But currently vacuum task can not be split: if an interrupt or error occurs during vacuum processing, vacuum totally forgets what it has done and terminates itself. Following vacuum on the same table has to scan from the beginning of the heap block. This proposal enable vacuum has capability to stop and resume. C) How can we use it? This feature can enable autovacuum or cron-vacuum-scheduler to develop more sophisticated vacuum schedule schemes combined with *maintenance window*. For example, if the system has two hour maintenance window every day, vacuum can be interrupted by this way: SET statement_timeout TO 2*3600*1000; # two hours VACUUM freeze talname; SET statement_timeout TO DEFAULT; Or it can be interrupted by SIGINT directly. Autovacuum or manual vacuum scheduler can split a large vacuum task into small subtasks by this feature. Subtasks can be scheduled according to system load. Design Overview --------------- A) Vacuum internal overview Concurrent vacuum mainly has the following steps to vacuum a table: 1. scan heap to collect dead tuple list 2. (if the table has indexes) scan and sweep indexes 3. sweep dead tuples collectedin step 1 4. perform additional index cleanup operation 5. (if a certain of free space found) truncate table 6. register free space to FSM 7. update statistics of the table If maintenance memory is not sufficient to contain all of dead tuples, step 1-3 are repeated. C) Where to stop The first option is that it can accept stop request at boundary of each steps, and it resumes from the unfinished step. But some steps takes a short time, some take a long time. For example, step 1 takes more than 40% of total time, but step 5 and 6 take less than 1% of total time. This granularity is too larger. The second option is to accept stop request before vacuum begin to process one of the blocks in step 1-4. In current vacuum implementation, vacuum_delay_point is also placed in such locations. This option has a much good granularity than option 1. This implementation accepts stop request at *blocks level* in step 1-4. D) How to stop and resume - stop: When vacuum stop in step 1-4, vacuum perform following things: vacuum saves dead tuple list, the heap block number onwhich it stop, unswept index relations, unswept dead tuple and FreezeLimit into a disk file. Free space information collected in step 1-3 can be registered to FSM when vacuum is interrupted. - resume: When vacuum is resuming, it reads out saved information and skip the finished operations, then continue to finish remainingoperations. There are two additional issues which need to be discussed here: *) FreezeLimit. There are two options to selectFreezeLimit for a resuming a vacuum:(a) FreezeLimit of interrupted vacuum, (b) FreezeLimit of resumingvacuum. FreezeLimit-(b) is safe. But for the heap blocks are not full scanned, so when FreezeLimit-(b) isused , the relfrozenxid should be updated with FreezeLimit-(a) at the end of vacuum, and CLOG can only be truncatedby FreezeLimit-(a). *) Resuming index operation. There are two possible resuming levels when vacuum is interrupted in step 2:(a)skip the *index relations* which have been swept completely (b) skip the *index blocks* which have been swept. Level (a) is safe and simple to be implemented; level (b) need to consider the scenarios that leaf pageis split; further investigation is needed to clarify if it is safe or not. This implementation adopts *level (a) resuming*. 3. Implementation Plan ---------------------- We are working on the patch now; I will send the WIP patch to the list later. I am sorry this late proposal, but I hope it can go into 8.3. Welcome your comments and ideas. Best Regards Galy Lee (lee.galy@oss.ntt.co.jp) NTT Open Source Software Center
pgsql-hackers by date: