Re: Statistical aggregate functions are not working with PARTIALaggregation - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Statistical aggregate functions are not working with PARTIALaggregation |
Date | |
Msg-id | 20190517030404.73izvqxgs3amfwd3@alap3.anarazel.de Whole thread Raw |
In response to | Re: Statistical aggregate functions are not working with PARTIALaggregation (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: Statistical aggregate functions are not working with PARTIALaggregation
Re: Statistical aggregate functions are not working with PARTIAL aggregation |
List | pgsql-hackers |
Hi, On 2019-05-08 13:06:36 +0900, Kyotaro HORIGUCHI wrote: > This behavior is introduced by 69c3936a14 (in v11). At that time > FunctionCallInfoData is pallioc0'ed and has fixed length members > arg[6] and argnull[7]. So nulls[1] is always false even if nargs > = 1 so the issue had not been revealed. > After introducing a9c35cf85c (in v12) the same check is done on > FunctionCallInfoData that has NullableDatum args[] of required > number of elements. In that case args[1] is out of palloc'ed > memory so this issue has been revealed. > In a second look, I seems to me that the right thing to do here > is setting numInputs instaed of numArguments to numTransInputs in > combining step. Yea, to me this just seems a consequence of the wrong numTransInputs. Arguably this is a bug going back to 9.6, where combining aggregates where introduced. It's just that numTransInputs isn't used anywhere for combining aggregates, before 11. It's documentation says: /* * Number of aggregated input columns to pass to the transfn. This * includes the ORDER BY columns for ordered-set aggs, but not for plain * aggs. (This doesn't count the transition state value!) */ int numTransInputs; which IMO is violated by having it set to the plain aggregate's value, rather than the combine func. While I agree that fixing numTransInputs is the right way, I'm not convinced the way you did it is the right approach. I'm somewhat inclined to think that it's wrong that ExecInitAgg() calls build_pertrans_for_aggref() with a numArguments that's not actually right? Alternatively I think we should just move the numTransInputs computation into the existing branch around DO_AGGSPLIT_COMBINE. It seems pretty clear that this needs to be fixed for v11, it seems too fragile to rely on trans_fcinfo->argnull[2] being zero initialized. I'm less sure about fixing it for 9.6/10. There's no use of numTransInputs for combining back then. David, I assume you didn't adjust numTransInput plainly because it wasn't needed / you didn't notice? Do you have a preference for a fix? Independent of these changes, some of the code around partial, ordered set and polymorphic aggregates really make it hard to understand things: /* Detect how many arguments to pass to the finalfn */ if (aggform->aggfinalextra) peragg->numFinalArgs = numArguments + 1; else peragg->numFinalArgs = numDirectArgs + 1; What on earth is that supposed to mean? Sure, the +1 is obvious, but why the different sources for arguments are needed isn't - especially because numArguments was just calculated with the actual aggregate inputs. Nor is aggfinalextra's documentation particularly elucidating: /* true to pass extra dummy arguments to aggfinalfn */ bool aggfinalextra BKI_DEFAULT(f); especially not why aggfinalextra means we have to ignore direct args. Presumably because aggfinalextra just emulates what direct args does for ordered set args, but we allow both to be set. Similarly /* Detect how many arguments to pass to the transfn */ if (AGGKIND_IS_ORDERED_SET(aggref->aggkind)) pertrans->numTransInputs = numInputs; else pertrans->numTransInputs = numArguments; is hard to understand, without additional comments. One can, looking around, infer that it's because ordered set aggs need sort columns included. But that should just have been mentioned. And to make sense of build_aggregate_transfn_expr()'s treatment of direct args, one has to know that direct args are only possible for ordered set aggregates. Which IMO is not obvious in nodeAgg.c. ... I feel this code has become quite creaky in the last few years. Greetings, Andres Freund
pgsql-hackers by date: