Parallel Index Scans - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Parallel Index Scans |
Date | |
Msg-id | CAA4eK1J9GQPN0UaXKX8XnW4M-OeUFja4+ySt4=7FfRJBUxZ9vQ@mail.gmail.com Whole thread Raw |
Responses |
Re: Parallel Index Scans
|
List | pgsql-hackers |
As of now, the driving table for parallel query is accessed by parallel sequential scan which limits its usage to a certain degree. Parallelising index scans would further increase the usage of parallel query in many more cases. This patch enables the parallelism for the btree scans. Supporting parallel index scan for other index types like hash, gist, spgist can be done as separate patches. The basic idea is quite similar to parallel heap scans which is that each worker (including leader whenever possible) will scan a block and then get the next block that is required to be scan. The parallelism in implemented at the leaf level of a btree. The first worker to start a btree scan will scan till leaf and others will wait till the first worker has reached till leaf. The first worker after reading the leaf block will set the next block to be read and wake the first worker waiting to scan the next block and proceed with scanning tuples from the block it has read, similarly each worker after reading the block, sets the next block to be read and wakes up the first waiting worker. This is achieved by using the condition variable patch [1] proposed by Robert. Parallelism is supported for both forward and backward scans. The optimizer will choose the parallelism based on number of pages in index relation and cpu cost for evaluating the rows is divided equally among workers. Index Scan node is made parallel aware and can be used beneath Gather as shown below: Current Plan for Index Scans ---------------------------------------- Index Scan using idx2 on test (cost=0.42..7378.96 rows=2433 width=29) Index Cond: (c < 10) Parallel version of plan ---------------------------------- Gather (cost=1000.42..1243.40 rows=2433 width=29) Workers Planned: 1 -> Parallel Index Scan using idx2 on test (cost=0.42..0.10 rows=1431 width=29) Index Cond: (c < 10) The Parallel index scans can be used in parallelising aggregate queries as well. For example, given a query like: select count(*) from t1 where c1 > 1000 and c1 < 1100 and c2='aaa' Group By c2; below form of parallel plans are possible: Finalize HashAggregate Group Key: c2 -> Gather Workers Planned: 1 -> Partial HashAggregate Group Key: c2 -> Parallel Index Scan using idx_t1_partial on t1 Index Cond: ((c1 > 1000) AND (c1 < 1100)) Filter: (c2 = 'aaa'::bpchar) OR Finalize GroupAggregate Group Key: c2 -> Sort -> Gather Workers Planned: 1 -> Partial GroupAggregate Group Key: c2 -> Parallel Index Scan using idx_t1_partial on t1 Index Cond: ((c1 > 1000) AND (c1 < 1100)) Filter: (c2 = 'aaa'::bpchar) In the second plan (GroupAggregate), the Sort + Gather step would be replaced with GatherMerge, once we have a GatherMerge node as proposed by Rushabh [2]. Note, that above examples are just taken to explain the usage of parallel index scan, actual plans will be selected based on cost. Performance tests ---------------------------- This test has been performed on community m/c (hydra, POWER-7). Initialize pgbench with 3000 scale factor (./pgbench -i -s 3000 postgres) Count the rows in pgbench_accounts based on values of aid and bid Serial plan ------------------ set max_parallel_workers_per_gather=0; postgres=# explain analyze select count(aid) from pgbench_accounts where aid > 1000 and aid < 90000000 and bid > 800 and bid < 900; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---- Aggregate (cost=4714590.52..4714590.53 rows=1 width=8) (actual time=35684.425..35684.425 rows=1 loops=1) -> Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.57..4707458.12 rows=2852961 width=4) (actual time=29210.743..34385.271 rows=9900000 loops =1) Index Cond: ((aid > 1000) AND (aid < 90000000)) Filter: ((bid > 800) AND (bid < 900)) Rows Removed by Filter: 80098999 Planning time: 0.183 ms Execution time: 35684.459 ms (7 rows) Parallel Plan ------------------- set max_parallel_workers_per_gather=2; postgres=# explain analyze select count(aid) from pgbench_accounts where aid > 1000 and aid < 90000000 and bid > 800 and bid < 900; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------------------------- Finalize Aggregate (cost=3924773.13..3924773.14 rows=1 width=8) (actual time=15033.105..15033.105 rows=1 loops=1) -> Gather (cost=3924772.92..3924773.12 rows=2 width=8) (actual time=15032.986..15033.093 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=3923772.92..3923772.92 rows=1 width=8) (actual time=15030.354..15030.354 rows=1 loops=3) -> Parallel Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.57..3920801.08 rows=1188734 width=4) (actual time=12476.068..14600.410 rows=3300000 loops=3) Index Cond: ((aid > 1000) AND (aid < 90000000)) Filter: ((bid > 800) AND (bid < 900)) Rows Removed by Filter: 26699666 Planning time: 0.244 ms Execution time: 15036.081 ms (11 rows) The above is a median of 3 runs, all the runs gave almost same execution time. Here, we can notice that execution time is reduced by more than half with two workers and I have tested with four workers where time is reduced to one-fourth (9128.420 ms) of serial plan. I think these results are quite similar to what we got for parallel sequential scans. Another thing to note is that parallelising index scans are more beneficial if there is a Filter which removes many rows fetched from Index Scan or if the Filter is costly (example - filter contains costly function execution). This observation is also quite similar to what we have observed with Parallel Sequential Scans. I think we can parallelise Index Only Scans as well, but I have not evaluated the same and certainly it can be done as a separate patch in future. Contributions -------------------- First patch (parallel_index_scan_v1.patch) implements parallelism at IndexAM level - Rahila Syed and Amit Kapila based on design inputs and suggestions by Robert Haas Second patch (parallel_index_opt_exec_support_v1.patch) provides optimizer and executor support for parallel index scans - Amit Kapila The order to use these patches is first apply condition variable patch [1] then parallel_index_scan_v1.patch and then parallel_index_opt_exec_support_v1.patch Thoughts? [1] - https://www.postgresql.org/message-id/CAEepm%3D0zshYwB6wDeJCkrRJeoBM%3DjPYBe%2B-k_VtKRU_8zMLEfA%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAGPqQf09oPX-cQRpBKS0Gq49Z%2Bm6KBxgxd_p9gX8CKk_d75HoQ%40mail.gmail.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: