Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Mahendra Singh |
---|---|
Subject | Re: [HACKERS] Block level parallel vacuum |
Date | |
Msg-id | CAKYtNAqGeA5sT-_2U5+osza41Co=rzNY_ejNacVmZDMd7cQLqg@mail.gmail.com |
In response to | Re: [HACKERS] Block level parallel vacuum (Mahendra Singh <mahi6run@gmail.com>) |
Responses | Re: [HACKERS] Block level parallel vacuum |
List | pgsql-hackers |
On Tue, 10 Dec 2019 at 00:30, Mahendra Singh <mahi6run@gmail.com> wrote:
>
> On Fri, 6 Dec 2019 at 10:50, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > I think it might be a good idea to change what we expect index AMs to do rather than trying to make anything that they happen to be doing right now work, no matter how crazy. In particular, suppose we say that you CAN'T add data on to the end of IndexBulkDeleteResult any more, and that instead the extra data is passed through a separate parameter. And then you add an estimate method that gives the size of the space provided by that parameter (and if the estimate method isn't defined then the extra parameter is passed as NULL) and document that the data stored there might get flat-copied.
> >
> > I think this is a good idea and serves the purpose we are trying to achieve currently. However, any index AM that is using the current way to pass stats with additional information would need to change even if it doesn't want to use the parallel vacuum functionality (say because its indexes are too small or for whatever other reason). I think this is a reasonable trade-off and the changes on their end won't be that big. So, we should do this.
> >
> > > Now, you've taken the onus off of parallel vacuum to cope with any crazy thing a hypothetical AM might be doing, and instead you've defined the behavior of that hypothetical AM as wrong. If somebody really needs that, it's now their job to modify the index AM machinery further instead of your job to somehow cope.
> >
> > Makes sense.
> >
> > > > Here, we have a need to reduce the number of workers. Index vacuum has two different phases (index vacuum and index cleanup) which use the same parallel context/DSM, but both could have different requirements for workers. The second phase (cleanup) would normally need fewer workers: if the work is done in the first phase, the second wouldn't need it. But we have exceptions like gin indexes, where we need workers for the second phase as well because it makes a pass over the index again even if we have cleaned the index in the first phase. Now, consider the case where we have 3 btree indexes and 2 gin indexes: we would need 5 workers for the index vacuum phase and 2 workers for the index cleanup phase. There are other cases too.
> > > >
> > > > We also considered having a separate DSM for each phase, but that appeared to have overhead without much benefit.
> > >
> > > How about adding an additional argument to ReinitializeParallelDSM() that allows the number of workers to be reduced? That seems like it would be less confusing than what you have now, and would involve modifying code in a lot fewer places.
> >
> > Yeah, we can do that. We can maintain some information in LVParallelState which indicates whether we need to reinitialize the DSM before launching workers. Sawada-San, do you see any problem with this idea?
> >
> > > > > Is there any legitimate use case for parallel vacuum in combination with vacuum cost delay?
> > > >
> > > > Yeah, we also initially thought that it is not legitimate to use a parallel vacuum with a cost delay. But to get a wider view, we started a separate thread [2] and there we reached the conclusion that we need a solution for throttling [3].
> > >
> > > OK, thanks for the pointer. This doesn't address the other part of my complaint, though, which is that the whole discussion between you and Dilip and Sawada-san presumes that the delays ought to be scattered across the workers roughly in proportion to their share of the I/O, and it seems NOT AT ALL clear that this is actually a desirable property. You're all assuming that, but none of you has justified it, and I think the opposite might be true in some cases.
> >
> > IIUC, your complaint is that in some cases, even if the I/O rate is enough for one worker, we will still launch more workers and throttle them. The point is we can't know in advance how much I/O is required for each index. We can try to estimate that based on index size, but I don't think that will be right, because it is possible that for a bigger index we don't need to dirty the pages and most of the pages are in shared buffers, etc. The current algorithm won't use more I/O than required, it will be good for cases where one or some of the indexes are doing more I/O than the others, and it will also work equally well when the indexes have a similar amount of work. I think we could do better if we could predict how much I/O each index requires before actually scanning the index.
> >
> > I agree with the other points (add a FAST option for parallel vacuum and document that parallel vacuum is still potentially throttled ...) you made in a separate email.
> >
> > > You're adding extra complexity for something that isn't a clear improvement.
> > >
> > > > Your understanding is correct. How about if we modify it to something like: "Note that parallel workers are alive only during index vacuum or index cleanup but the leader process neither exits from the parallel mode nor destroys the parallel context until the entire parallel operation is finished." OR something like "The leader backend holds the parallel context till the index vacuum and cleanup is finished. Both index vacuum and cleanup separately perform the work with parallel workers."
> > >
> > > How about if you just delete it? You don't need a comment explaining that this caller of CreateParallelContext() does something which *every* caller of CreateParallelContext() must do. If you didn't do that, you'd fail assertions and everything would break, so *of course* you are doing it.
> >
> > Fair enough, we can just remove this part of the comment.
>
> Hi All,
> Below is a brief summary of my testing of the v35 patch set.
>
> 1. All the test cases are passing on top of the v35 patch set (make check-world and all contrib test cases).
>
> 2. With PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION enabled, "make check-world" is passing.
>
> 3. After the v35 patch, the vacuum.sql regression test takes too much time due to the large number of inserts, so by reducing the number of tuples we can reduce that time:
> +INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM generate_series(1,100000) i;
> Here, instead of 100000, we can make it 1000 to reduce the time of this test case, because we only want to test the code and functionality.
As we added a check of min_parallel_index_scan_size in the v36 patch set to decide whether an index is vacuumed in parallel, 1000 tuples are not enough to do a parallel vacuum. I can see that we are not launching any workers in the vacuum.sql test case, and hence code coverage has also decreased. I am not sure how to fix this; one possible approach is sketched at the end of this mail, along with a small reproduction of the btree index size observation in point 9 below.

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com

> 4. I tested the functionality of parallel vacuum with different server configuration settings and the behavior is as expected: shared_buffers, max_parallel_workers, max_parallel_maintenance_workers, vacuum_cost_limit, vacuum_cost_delay, maintenance_work_mem, max_worker_processes.
>
> 5. Index and table stats of parallel vacuum match those of normal vacuum.
>
> postgres=# select * from pg_statio_all_tables where relname = 'test';
>  relid | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
> -------+------------+---------+----------------+---------------+---------------+--------------+-----------------+----------------+----------------+---------------
>  16384 | public | test | 399 | 5000 | 3 | 0 | 0 | 0 | 0 | 0
> (1 row)
>
> 6. Vacuum progress reporting is as expected.
>
> postgres=# select * from pg_stat_progress_vacuum;
>  pid | datid | datname | relid | phase | heap_blks_total | heap_blks_scanned | heap_blks_vacuumed | index_vacuum_count | max_dead_tuples | num_dead_tuples
> -------+-------+----------+-------+---------------------+-----------------+-------------------+--------------------+--------------------+-----------------+-----------------
>  44161 | 13577 | postgres | 16384 | cleaning up indexes | 41650 | 41650 | 41650 | 1 | 11184810 | 1000000
> (1 row)
>
> 7. If any worker (or the leader) hits an error, all the workers exit immediately and the action is marked as aborted.
>
> 8. I tested parallel vacuum for all the types of indexes and by varying the size of the indexes; all are working and I didn't get any unexpected behavior.
>
> 9. While testing, I found that if we delete all the tuples from a table, the size of its btree indexes still does not reduce:
> delete all tuples from the table
> before vacuum, total pages in btree index: 8000
> after vacuum (normal/parallel), total pages in btree index: 8000
> but the size of the table does reduce after deleting all the tuples.
> Can we add a check in vacuum to truncate all the pages of btree indexes if there is no tuple in the table?
>
> Please let me know if you have any inputs for more testing.
>
> Thanks and Regards
> Mahendra Thalor
> EnterpriseDB: http://www.enterprisedb.com
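P.S. On the vacuum.sql coverage problem above: one possible way to keep the test fast while still exercising the parallel path might be to lower min_parallel_index_scan_size only for that test, so that even the small indexes of pvactst qualify for parallel vacuum. This is only a sketch, assuming the PARALLEL option syntax added by this patch set:

    SET min_parallel_index_scan_size TO 0;  -- let even tiny indexes qualify for parallel vacuum
    VACUUM (PARALLEL 2) pvactst;            -- should launch workers even with only 1000 tuples
    RESET min_parallel_index_scan_size;

That way the reduced tuple count from point 3 could be kept without losing worker-launch coverage.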
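And a minimal reproduction of the point 9 observation (the table name here is made up for illustration): after deleting every row, vacuum truncates the now-empty heap, but the btree index keeps its pages, since vacuum only recycles btree pages for reuse within the index rather than returning them to the operating system:

    CREATE TABLE btree_shrink_demo (a int PRIMARY KEY);
    INSERT INTO btree_shrink_demo SELECT generate_series(1, 1000000);
    DELETE FROM btree_shrink_demo;
    VACUUM btree_shrink_demo;
    -- heap_bytes drops to (nearly) zero, index_bytes stays at its pre-delete size
    SELECT pg_relation_size('btree_shrink_demo') AS heap_bytes,
           pg_relation_size('btree_shrink_demo_pkey') AS index_bytes;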