Re: BUG #16627: union all with partioned table yields random aggregate results - Mailing list pgsql-bugs
From: David Rowley
Subject: Re: BUG #16627: union all with partioned table yields random aggregate results
Date:
Msg-id: CAApHDvpm+YbPqCFQESE9h7Jx8-XX0vrJtjEySE8t-D8HHZW11A@mail.gmail.com
In response to: RE: BUG #16627: union all with partioned table yields random aggregate results (Brian Kanaga <kanaga@consumeracquisition.com>)
Responses: Re: BUG #16627: union all with partioned table yields random aggregate results
List: pgsql-bugs
Hi Brian,

On Thu, 24 Sep 2020 at 01:50, Brian Kanaga <kanaga@consumeracquisition.com> wrote:
> Attached is a much-simplified version of the problem query along with screen
> shots of plans and what tweaks to the query produce changes to the plan.

Thanks for sending those. (For the future, attaching a text file with the queries and EXPLAIN output would be much easier to work with. I personally tend to compare these sorts of things in a text compare tool; playing spot-the-difference with images is more tricky.)

Comparing the images you attached, it does look like the index scan on fb_add_daily_archive_2019_pkey found more rows in the parallel version of the scan: 262 * 4 = 1048, but only 826 in the "noissue.png" plan.

In the cases you've shown that were incorrect, the aggregated value is larger. So, assuming you're always aggregating positive values, the incorrect result does hint that something is reading more rows than it should. The row counts I see indicate that's the case, with ~1048 in the error case and only 826 in the correct-result case. It would be good to get the full text of the EXPLAIN ANALYZE to confirm those predicates match properly; that part was chopped off the screenshot.

I noticed that the "issue.png" plan has a nested Parallel Append, the outer of which has a mix of parallel and parallel-safe paths. I'm not sure how relevant that is, but having nested Parallel Appends is probably not that common. I played around with the following, trying to produce a similar plan with a nested Parallel Append containing a mix of parallel and parallel-safe paths. Trying this on 11.4, I didn't see any executions with an incorrect tuple count.
drop table t;
drop table pt;
create table pt (a int) partition by range(a);
create table pt1 partition of pt for values from (0) to (3000000);
create index on pt1 (a);
create table t (a int primary key);
insert into t select x from generate_series(1,2000000) x;
insert into pt select x from generate_series(1,2000000) x;
alter table t set (parallel_workers=0);
set enable_bitmapscan=0;
set enable_indexonlyscan=0;
set work_mem = '200MB';
select count(*) from (
  select * from t where a between 100000 and 200000
  union all
  select * from t where a between 200000 and 300000
  union all
  select * from pt where a between 900000 and 999999) t;

> I have tried to recreate this for you in a dump file but I could not get it
> to happen without including gobs of data. Even tweaking the plan I could
> not get the filter part evaluating to match the offending plan.

How large is the dataset? And if the data was properly anonymised, and the size wasn't too insane, would you be allowed to share it? Privately would be an option.

David
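For completeness, one way to capture the full plan text (rather than a screenshot) is to run the query under EXPLAIN with ANALYZE and redirect psql's output to a file. This is only a sketch against the repro tables above; the VERBOSE and BUFFERS options are optional extras I've added, not something the thread specifically asks for:

```sql
-- In psql, send output to a file so nothing gets cut off:
--   \o plan.txt
-- Capture the complete plan, including the Filter / Index Cond lines:
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
select count(*) from (
  select * from t where a between 100000 and 200000
  union all
  select * from t where a between 200000 and 300000
  union all
  select * from pt where a between 900000 and 999999) t;
-- Stop redirecting:
--   \o
```

The resulting text file can then be attached to the thread and diffed directly against the plan from the non-parallel run.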