VACUUM Improvements - WIP Patch - Mailing list pgsql-patches
From | Pavan Deolasee |
---|---|
Subject | VACUUM Improvements - WIP Patch |
Date | |
Msg-id | 2e78013d0806092232h6ca15ffejcbcd24e88401308f@mail.gmail.com Whole thread Raw |
Responses |
Re: VACUUM Improvements - WIP Patch
Re: VACUUM Improvements - WIP Patch |
List | pgsql-patches |
Here is a WIP patch based on the discussions here: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00863.php The attached WIP patch improves the LAZY VACUUM by limiting or avoiding the second heap scan. This not only saves considerable time in VACUUM, but also reduces the double-writes of vacuumed blocks. If the second heap scan is considerably limited, that should also save CPU usage and reduce WAL log writing. With HOT, the first heap scan prunes and defrags every page in the heap. That truncates all the dead tuples to their DEAD line pointers and releases all the free space in the page. The second scan only removes these DEAD line pointers and records the free space in the FSM. The free space in fact does not change from the first pass. But to do so, it again calls RepairPageFragmentation on each page, dirties the page and calls log_heap_clean() again on the page. This clearly looks like too much work for a small gain. As this patch stands, the first phase of vacuum prunes the heap pages as usual. But it marks the DEAD line pointers as DEAD_RECLAIMED to signal that the index pointers to these line pointers are being removed, if certain conditions are satisfied. Other backend when prunes a page, also reclaims DEAD_RECLAIMED line pointers by marking them UNUSED. We need some additional logic to do this in a safe way: - An additional boolean pg_class attribute (relvac_inprogress) is used to track the status of vacuum on a relation. If the attribute is true, either vacuum is in progress on the relation or the last vacuum did not complete successfully. When VACUUM starts, it sets relvac_inprogress to true. The transaction is committed and a new transaction is started so that all other backends can see the change. We also note down the transactions which may already have the table open. VACUUM then starts the first heap scan. It prunes the page, but it can start marking the DEAD line pointers as DEAD_RECLAIMED only after it knows that all other backends can see that VACUUM is in progress on the target relation. Otherwise there is a danger that backends might reclaim DEAD line pointers before their index pointers are removed and that would lead to index corruption. We do that by periodic conditional waits on the noted transactions ids. Once all old transactions are gone, VACUUM sets the second scan limit to the current block number and starts marking subsequent DEAD line pointers as DEAD_RECLAIMED. In most of the cases where the old transactions quickly go away, and for large tables, the second scan will be very limited. In the worst case, we might incur the overhead of conditional waits without any success. TODO: - We can potentially update FSM at the end of first pass. This is not a significant issue if the second scan is very limited. But if we do this, we need to handle the truncate case properly. - As the patch stands, we check of old transactions at every block iteration. This might not be acceptable for the cases where there are long running transactions. We probably need some exponential gap here. - As the patch stands, the heap_page_prune handles reclaiming the DEAD_RECLAIMED line pointers since it already has ability to WAL log similar changes. We don't do any extra work to trigger pruning though (except than setting page_prune_xid). May be we should trigger pruning if we got a line pointer bloat in a page too. Please let me know comments/suggestions and any other improvements. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Attachment
pgsql-patches by date: