Re: [HACKERS] Parallel Index Scans - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] Parallel Index Scans |
Date | |
Msg-id | CAA4eK1+TnM4pXQbvn7OXqam+k_HZqb0ROZUMxOiL6DWJYCyYow@mail.gmail.com |
In response to | Re: [HACKERS] Parallel Index Scans (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: [HACKERS] Parallel Index Scans
Re: [HACKERS] Parallel Index Scans |
List | pgsql-hackers |
On Sat, Feb 4, 2017 at 7:14 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Feb 4, 2017 at 5:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Feb 1, 2017 at 12:58 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>> On balance, I'm somewhat inclined to think that we ought to base
>> everything on heap pages, so that we're always measuring in the same
>> units. That's what Dilip's patch for parallel bitmap heap scan does,
>> and I think it's a reasonable choice. However, for parallel index
>> scan, we might want to also cap the number of workers to, say,
>> index_pages/10, just so we don't pick an index scan that's going to
>> result in a very lopsided work distribution.
>>
>
> I guess in the above context you mean heap_pages or index_pages that
> are expected to be *fetched* during index scan.
>
> Yet another thought is that for parallel index scan we use
> index_pages_fetched, but use either a different GUC
> (min_parallel_index_rel_size) with a relatively lower default value
> (say equal to min_parallel_relation_size/4 = 2MB) or directly use
> min_parallel_relation_size/4 for parallel index scans.
>

I had some off-list discussion with Robert about the above point, and
we feel that basing the parallel computation only on heap pages might
not be future-proof, because parallel index-only scans might not touch
any heap pages. So it is better to use a separate GUC for parallel
index (only) scans. We can have two GUCs, min_parallel_table_scan_size
(8MB) and min_parallel_index_scan_size (512kB), for computing parallel
scans. Parallel sequential scans and parallel bitmap heap scans can
use min_parallel_table_scan_size as the threshold for computing
parallel workers, as we are doing now. For parallel index scans, both
min_parallel_table_scan_size and min_parallel_index_scan_size can be
used as thresholds: we can compute parallel workers based on both the
heap_pages to be scanned and the index_pages to be scanned, and then
keep the minimum of those.
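The two-threshold worker computation described above could look roughly like the following. This is only an illustrative sketch, not the code in the attached patches. It assumes 8kB pages (so 8MB is 1024 pages and 512kB is 64 pages), assumes one extra worker each time the page count triples past the threshold, and assumes the scan is skipped only when both page counts fall below their thresholds. The names workers_for_pages and sketch_index_scan_workers are hypothetical.

```c
#include <limits.h>

/* Assumed defaults, expressed in pages; an 8kB block size is an assumption. */
#define MIN_PARALLEL_TABLE_SCAN_PAGES 1024	/* 8MB / 8kB */
#define MIN_PARALLEL_INDEX_SCAN_PAGES 64	/* 512kB / 8kB */

/*
 * Hypothetical helper: start with one worker and add one more each time
 * the page count reaches three times the running threshold.
 */
int
workers_for_pages(double pages, int threshold)
{
	int			workers = 1;

	if (threshold < 1)
		threshold = 1;

	while (pages >= (double) threshold * 3)
	{
		workers++;
		threshold *= 3;
		if (threshold > INT_MAX / 3)
			break;				/* avoid integer overflow */
	}
	return workers;
}

/*
 * Hypothetical sketch for a parallel index scan: compute a worker count
 * from the heap pages and another from the index pages, keep the minimum,
 * and cap the result at max_workers.
 */
int
sketch_index_scan_workers(double heap_pages, double index_pages,
						  int max_workers)
{
	int			heap_workers;
	int			index_workers;
	int			workers;

	/* Assumption: not worth parallelizing if both inputs are small. */
	if (heap_pages < MIN_PARALLEL_TABLE_SCAN_PAGES &&
		index_pages < MIN_PARALLEL_INDEX_SCAN_PAGES)
		return 0;

	heap_workers = workers_for_pages(heap_pages,
									 MIN_PARALLEL_TABLE_SCAN_PAGES);
	index_workers = workers_for_pages(index_pages,
									  MIN_PARALLEL_INDEX_SCAN_PAGES);

	/* The index-based count acts as a cap on the heap-based count. */
	workers = (heap_workers < index_workers) ? heap_workers : index_workers;

	return (workers < max_workers) ? workers : max_workers;
}
```

Under these assumptions, a scan expected to fetch 100000 heap pages but only 10 index pages still gets one worker, while the same heap page count with 2000 index pages is limited to four workers by the index-based count.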
This will help us to engage parallel index scans when the index pages
are below the threshold but there are many heap pages to be scanned,
and will also allow keeping a maximum cap on the number of workers
based on the index scan size.

guc_parallel_index_scan_v1.patch - Renames the existing
min_parallel_relation_size to min_parallel_table_scan_size and adds a
new GUC, min_parallel_index_scan_size, with a default value of 512kB.
This patch also adjusts the computation in compute_parallel_worker
based on the two GUCs.

compute_index_pages_v2.patch - This patch extracts the computation of
index pages to be scanned into a separate function and uses it in the
existing code. You will notice that I have pulled up the logic for
converting clauses to indexquals from create_index_path to
build_index_paths, as that is required to compute the number of index
and heap pages to be scanned in patch
parallel_index_opt_exec_support_v8.patch. This doesn't impact any
existing functionality.

parallel_index_scan_v7 - Patch to parallelize btree scans; nothing is
changed from the previous version (just rebased on latest head).

parallel_index_opt_exec_support_v8.patch - This contains changes to
compute parallel workers using both the heap and index pages that need
to be scanned.

Patches guc_parallel_index_scan_v1.patch and
compute_index_pages_v2.patch are independent of each other. Both are
required by the parallel index scan patches.

The current set of patches handles all the reported comments.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com