Re: Parallel Aggregate - Mailing list pgsql-hackers
From | Haribabu Kommi |
---|---|
Subject | Re: Parallel Aggregate |
Date | |
Msg-id | CAJrrPGcGn0HUfuqQ6JRe321kBEdz0dTjF9Fatro2X_kJgRc5cA@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Aggregate (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: Parallel Aggregate
Re: Parallel Aggregate |
List | pgsql-hackers |
On Fri, Jan 22, 2016 at 7:44 AM, David Rowley <david.rowley@2ndquadrant.com> wrote: > On 21 January 2016 at 18:26, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: >> Here I attached update parallel aggregate patch on top of recent commits >> of combine aggregate and parallel join patch. It still lacks of cost comparison >> code to compare both parallel and normal aggregates. > > Thanks for the updated patch. > > I'm just starting to look over this now. > > # create table t1 as select x from generate_series(1,1000000) x(x); > # vacuum ANALYZE t1; > # set max_parallel_degree =8; > # explain select sum(x) from t1; > QUERY PLAN > ------------------------------------------------------------------------- > Aggregate (cost=9633.33..9633.34 rows=1 width=4) > -> Parallel Seq Scan on t1 (cost=0.00..8591.67 rows=416667 width=4) > (2 rows) > > I'm not quite sure what's happening here yet as I've not ran it > through my debugger, but how can we have a Parallel Seq Scan without a > Gather node? It appears to give correct results, so I can only assume > it's not actually a parallel scan at all. > > Let's check: > > # select relname,seq_scan from pg_stat_user_tables where relname ='t1'; > relname | seq_scan > ---------+---------- > t1 | 0 > (1 row) > > # explain analyze select sum(x) from t1; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=9633.33..9633.34 rows=1 width=4) (actual > time=161.820..161.821 rows=1 loops=1) > -> Parallel Seq Scan on t1 (cost=0.00..8591.67 rows=416667 > width=4) (actual time=0.051..85.348 rows=1000000 loops=1) > Planning time: 0.040 ms > Execution time: 161.861 ms > (4 rows) > > # select relname,seq_scan from pg_stat_user_tables where relname ='t1'; > relname | seq_scan > ---------+---------- > t1 | 1 > (1 row) > > Only 1 scan. > > > # explain analyze select * from t1 where x=1; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------- > Gather (cost=1000.00..10633.43 rows=1 width=4) (actual > time=0.231..49.105 rows=1 loops=1) > Number of Workers: 2 > -> Parallel Seq Scan on t1 (cost=0.00..9633.33 rows=0 width=4) > (actual time=29.060..45.302 rows=0 loops=3) > Filter: (x = 1) > Rows Removed by Filter: 333333 > Planning time: 0.049 ms > Execution time: 51.438 ms > (7 rows) > > # select relname,seq_scan from pg_stat_user_tables where relname ='t1'; > relname | seq_scan > ---------+---------- > t1 | 4 > (1 row) > > 3 more scans. This one seems to actually be parallel, and makes sense > based on "Number of Workers: 2" The problem was the gather path that is generated on partial path list is not getting added to path list, because of which, there is a mismatch in sorted path and cheapest_path, so it leads to a wrong plan. For temporarily, I marked the sorted_path and cheapest_path as same and it works fine. > Also looking at the patch: > > +bool > +aggregates_allow_partial(Node *clause) > +{ > > In the latest patch that I sent on the combine aggregates thread: > http://www.postgresql.org/message-id/CAKJS1f_in9J_ru4gPfygCQLUeB3=RzQ3Kg6RnPN-fzzhdDiyvg@mail.gmail.com > I made it so there's 3 possible return values from this function. As > your patch stands now, if I create an aggregate function with an > INTERNAL state with a combine function set, then this patch might try > to parallel aggregate that and pass around the pointer to the internal > state in the Tuple going from the worker to the main process, when the > main process dereferences this pointer we'll get a segmentation > violation. So I'd say you should maybe use a modified version of my > latest aggregates_allow_partial() and check for PAT_ANY, and only > parallelise the aggregate if you get that value. If the use of > partial aggregate was within a single process then you could be quite > content with PAT_INTERNAL_ONLY. You'll just need to pull out the logic > that checks for serial and deserial functions, since that's not in > yet, and just have it return PAT_INTERNAL_ONLY if INTERNAL aggregates > are found which have combine functions set. > I took the suggested code changes from combine aggregate patch and changed accordingly. Along with these changes, I added a float8 combine function to see how it works under parallel aggregate, it is working fine for float4, but giving small data mismatch with float8 data type. postgres=# select avg(f3), avg(f4) from tbl; avg | avg ------------------+------------------ 1.10000002384186 | 100.123449999879 (1 row) postgres=# set enable_parallelagg = true; SET postgres=# select avg(f3), avg(f4) from tbl; avg | avg ------------------+------------------ 1.10000002384186 | 100.123449999918 (1 row) Column - f3 - float4 Column - f4 - float8 similar problem for all float8 var_pop, var_samp, stddev_pop and stddev_samp aggregates. Any special care is needed for float8 datatype? Regards, Hari Babu Fujitsu Australia
Attachment
pgsql-hackers by date: