Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [HACKERS] Block level parallel vacuum |
Date | |
Msg-id | CAD21AoAax9_xdTspYhYwCDJMJj_fMHuAQ1rwAM7P-bfyBB40Ng@mail.gmail.com |
In response to | Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: [HACKERS] Block level parallel vacuum
Re: [HACKERS] Block level parallel vacuum |
List | pgsql-hackers |
On Tue, Aug 14, 2018 at 9:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Nov 30, 2017 at 11:09 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >> Yeah, I was thinking the commit is relevant to this issue, but as
> >> Amit mentioned, this error is emitted by DROP SCHEMA CASCADE.
> >> I haven't found the cause of this issue yet. With the previous
> >> version of the patch, autovacuum workers were working with one parallel
> >> worker but it never drops relations. So it's possible that the error
> >> might not be relevant to the patch, but anyway I'll continue to work
> >> on that.
> >
> > This depends on the extension lock patch from
> > https://www.postgresql.org/message-id/flat/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com/
> > if I am following correctly. So I propose to mark this patch as
> > returned with feedback for now, and come back to it once the root
> > problems are addressed. Feel free to correct me if you think that's
> > not adapted.
>
> I've re-designed the parallel vacuum patch. Attached is the latest
> version of the patch. As discussed so far, this patch depends on the
> extension lock patch[1]. However, I think we can discuss the design
> of parallel vacuum independently from that patch. That's why I'm
> proposing the new patch. In this patch, I restructured and refined
> lazy_scan_heap() because it's a single big function and not suitable
> for parallelization.
>
> The parallel vacuum worker processes keep waiting for commands from
> the parallel vacuum leader process. Before entering each phase of lazy
> vacuum, such as scanning the heap, vacuuming indexes and vacuuming the
> heap, the leader process changes the state of all workers to the next
> state. Vacuum worker processes do the job according to their state and
> wait for the next command after finishing.
> Also, before entering the next phase,
> the leader process does some preparation work while the vacuum workers
> are sleeping; for example, clearing the shared dead tuple space before
> entering the 'scanning heap' phase. The status of the vacuum workers is
> stored in a DSM area pointed to by WorkerState variables, and
> controlled by the leader process. For the basic design and performance
> improvements, please refer to my presentation at PGCon 2018[2].
>
> The number of parallel vacuum workers is determined according to
> either the table size or the PARALLEL option of the VACUUM command. The
> maximum number of parallel workers is max_parallel_maintenance_workers.
>
> I've separated the code for the vacuum worker process into
> backends/commands/vacuumworker.c, and created the
> includes/commands/vacuum_internal.h file to declare the definitions
> for lazy vacuum.
>
> For autovacuum, this patch allows an autovacuum worker process to use
> the parallel option according to the relation size or the reloption.
> But regarding autovacuum delay, since there are no slots for parallel
> workers of autovacuum in AutoVacuumShmem, this patch doesn't support
> changing the autovacuum delay configuration while running.
>
> Attached rebased version patch to the current HEAD.
> Please apply this patch with the extension lock patch[1] when testing
> as this patch can try to extend visibility map pages concurrently.
>

Because the patch leads to performance degradation when bulk-loading
into a partitioned table, I think that the original proposal, which
makes group locking conflict on relation extension locks, is the more
realistic approach. So I worked on this with the simple patch instead
of [1]. Attached are three patches:

* 0001 patch publishes some static functions such as
  heap_parallelscan_startblock_init so that the parallel vacuum code
  can use them.
* 0002 patch makes the group locking conflict on relation extension
  locks.
* 0003 patch adds the parallel option to lazy vacuum.

Please review them.
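
To make the leader-driven phase switching described in the quoted design
notes easier to follow, here is a minimal single-process sketch in C. It
is not code from the attached patches: the phase names, the SharedVac
struct (a stand-in for the DSM area) and the functions worker_run,
leader_set_phase and leader_run_cycle are all hypothetical, and the
workers are simulated by plain function calls rather than real
background worker processes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phase names; the patch's real identifiers may differ. */
typedef enum {
    PHASE_IDLE = 0,       /* worker is waiting for the next command */
    PHASE_SCAN_HEAP,
    PHASE_VACUUM_INDEX,
    PHASE_VACUUM_HEAP,
    PHASE_COUNT
} VacPhase;

#define MAX_WORKERS      8
#define MAX_DEAD_TUPLES 64

/* Stand-in for the shared (DSM) area that the leader controls. */
typedef struct {
    VacPhase worker_state[MAX_WORKERS]; /* per-worker state set by the leader */
    int      nworkers;
    int      dead[MAX_DEAD_TUPLES];     /* shared dead tuple space */
    int      ndead;
    int      jobs_done[PHASE_COUNT];    /* worker jobs run, per phase */
} SharedVac;

/* A worker does the job that matches its current state, then returns
 * (in this model, returning stands for "wait for the next command"). */
static void worker_run(SharedVac *s, int id)
{
    VacPhase phase = s->worker_state[id];

    switch (phase) {
    case PHASE_SCAN_HEAP:
        /* Pretend each worker finds one dead tuple and records it. */
        if (s->ndead < MAX_DEAD_TUPLES)
            s->dead[s->ndead++] = id;
        break;
    case PHASE_VACUUM_INDEX:
    case PHASE_VACUUM_HEAP:
        /* Real work omitted; only the dispatch is modelled. */
        break;
    default:
        return;                 /* idle: nothing to do */
    }
    s->jobs_done[phase]++;
}

/* The leader prepares for the next phase and then moves all workers
 * to it. Clearing the dead tuple space happens before 'scan heap'. */
static void leader_set_phase(SharedVac *s, VacPhase next)
{
    if (next == PHASE_SCAN_HEAP)
        s->ndead = 0;
    for (int i = 0; i < s->nworkers; i++)
        s->worker_state[i] = next;
}

/* One full lazy vacuum cycle: scan heap, vacuum index, vacuum heap. */
static void leader_run_cycle(SharedVac *s, int nworkers)
{
    static const VacPhase order[] = {
        PHASE_SCAN_HEAP, PHASE_VACUUM_INDEX, PHASE_VACUUM_HEAP
    };

    assert(nworkers >= 0 && nworkers <= MAX_WORKERS);
    s->nworkers = nworkers;

    for (size_t p = 0; p < sizeof order / sizeof order[0]; p++) {
        leader_set_phase(s, order[p]);
        for (int i = 0; i < nworkers; i++)
            worker_run(s, i);   /* simulated: real workers run concurrently */
    }
    leader_set_phase(s, PHASE_IDLE);
}
```

Calling leader_run_cycle() twice on a zero-initialized SharedVac shows
the preparation step: the dead tuple count after the second cycle
reflects only that cycle, because the leader cleared the space before
re-entering the scan phase.
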

[1] https://www.postgresql.org/message-id/CAD21AoBn8WbOt21MFfj1mQmL2ZD8KVgMHYrOe1F5ozsQC4Z_hw%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center