Re: Parallel bitmap heap scan - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Parallel bitmap heap scan |
Date | |
Msg-id | CAA4eK1+04U34haLjpaPkEpV2nPmj0NPmswYjZVXY0vo6=GoGrg@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel bitmap heap scan (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Parallel bitmap heap scan
|
List | pgsql-hackers |
On Wed, Oct 19, 2016 at 9:23 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote: > On Wed, Oct 19, 2016 at 12:39 PM, Andres Freund <andres@anarazel.de> wrote: >> Try measuring with something more heavy on bitmap scan time >> itself. E.g. >> SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date); >> or similar. The tpch queries don't actually spend that much time in the >> bitmapscan itself - the parallization of the rest of the query is what >> matters... > > Yeah, I agree. > > I have tested with this query, with exact filter condition it was > taking parallel sequence scan, so I have modified the filter a bit and > tested. > > Tested with all default configuration in my local machine. I think I > will generate more such test cases and do detail testing in my > performance machine. > > > Explain Analyze results: > --------------------------------- > On Head: > ------------ > postgres=# explain analyze SELECT SUM(l_extendedprice) FROM lineitem > WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= > '1996-03-31'::date); > > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=848805.90..848805.91 rows=1 width=32) (actual > time=12440.165..12440.166 rows=1 loops=1) > -> Bitmap Heap Scan on lineitem (cost=143372.40..834833.25 > rows=5589057 width=8) (actual time=1106.217..11183.722 rows=5678841 > loops=1) > Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND > (l_shipdate <= '1996-03-31'::date)) > Rows Removed by Index Recheck: 20678739 > Heap Blocks: exact=51196 lossy=528664 > -> Bitmap Index Scan on idx_lineitem_shipdate > (cost=0.00..141975.13 rows=5589057 width=0) (actual > time=1093.376..1093.376 rows=5678841 loops=1) > Index Cond: ((l_shipdate >= '1995-01-01'::date) AND > (l_shipdate <= '1996-03-31'::date)) > Planning time: 0.185 ms > Execution time: 12440.819 ms > (9 rows) > > After Patch: > --------------- > postgres=# explain analyze SELECT SUM(l_extendedprice) FROM lineitem > WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= > '1996-03-31'::date); > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------------------------- > --------- > Finalize Aggregate (cost=792751.16..792751.17 rows=1 width=32) > (actual time=6660.157..6660.157 rows=1 loops=1) > -> Gather (cost=792750.94..792751.15 rows=2 width=32) (actual > time=6659.378..6660.117 rows=3 loops=1) > Workers Planned: 2 > Workers Launched: 2 > -> Partial Aggregate (cost=791750.94..791750.95 rows=1 > width=32) (actual time=6655.941..6655.941 rows=1 loops=3) > -> Parallel Bitmap Heap Scan on lineitem > (cost=143372.40..785929.00 rows=2328774 width=8) (actual > time=1980.797..6204.232 rows=1892947 loops= > 3) > Recheck Cond: ((l_shipdate >= '1995-01-01'::date) > AND (l_shipdate <= '1996-03-31'::date)) > Rows Removed by Index Recheck: 6930269 > Heap Blocks: exact=17090 lossy=176443 > -> Bitmap Index Scan on idx_lineitem_shipdate > (cost=0.00..141975.13 rows=5589057 width=0) (actual > time=1933.454..1933.454 rows=5678841 > loops=1) > Index Cond: ((l_shipdate >= > '1995-01-01'::date) AND (l_shipdate <= '1996-03-31'::date)) > Planning time: 0.207 ms > Execution time: 6669.195 ms > (13 rows) > > > Summary: > -> With patch overall execution is 2 time faster compared to head. > -> Bitmap creation with patch is bit slower compared to head and thats > because of DHT vs efficient hash table. > I think here the impact of slowness due to Bitmap Index Scan is not much visible, as the time it takes as compare to overall time is less. However, I think there is an advantage of using DHT as that will allow us to build the hash table by multiple workers using parallel index scan in future. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: