Re: [PATCH] Negative Transition Aggregate Functions (WIP) - Mailing list pgsql-hackers
From | Florian Pflug |
---|---|
Subject | Re: [PATCH] Negative Transition Aggregate Functions (WIP) |
Date | |
Msg-id | 62144D8A-FB0D-420B-ABDC-C478C6A4DA79@phlo.org Whole thread Raw |
In response to | Re: [PATCH] Negative Transition Aggregate Functions (WIP) (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Responses |
Re: [PATCH] Negative Transition Aggregate Functions (WIP)
|
List | pgsql-hackers |
On Jan27, 2014, at 23:28 , Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > strict transfn vs non-strict inv_transfn > ---------------------------------------- > > This case is explicitly forbidden, both in CREATE AGGREGATE and in the > executor. To me, that seems overly restrictive --- if transfn is > strict, then you know for sure that no NULL values are added to the > aggregate state, so surely it doesn't matter whether or not > inv_transfn is strict. It will never be asked to remove NULL values. > > I think there are definite use-cases where a user might want to use a > pre-existing function as the inverse transition function, so it seems > harsh to force them to wrap it in a strict function in this case. > > AFAICS, the code in advance/retreat_windowaggregate should just work > if those checks are removed. True. It didn't use to in earlier version of the patch because the advance and retreat functions looked at the strict settings directly, but now that there's an ignore_nulls flag in the executor node, the only requirement should be that ignore_nulls is set if either of the transition functions is strict. I'm not sure that the likelihood of someone wanting to combine a strict forward with a non-strict inverse function is very hight, but neither > non-strict transfn vs strict inv_transfn > ---------------------------------------- > > At first this seems as though it must be an error --- the forwards > transition function allows NULL values into the aggregate state, and > the inverse transition function is strict, so it cannot remove them. > > But actually what the patch is doing in this case is treating the > forwards transition function as partially strict --- it won't be > passed NULL values (at least in a window context), but it may be > passed a NULL state in order to build the initial state when the first > non-NULL value arrives. Exactly. > It looks like this behaviour is intended to support aggregates that > ignore NULL values, but cannot actually be made to have a strict > transition function. I think typically this is because the aggregate's > initial state is NULL and it's state type differs from the type of the > values being aggregated, and so can't be automatically created from > the first non-NULL value. Yes. I added this because the alternative would haven been to count non-NULL values in most of the existing SUM() aggregates. > That all seems quite ugly though, because now you have a transition > function that is not strict, which is passed NULL values when used in > a normal aggregate context, and not passed NULL values when used in a > window context (whether sliding or not), except for the NULL state for > the first non-NULL value. Ugh, true. Clearly, nodeAgg.c needs to follow what nodeWindowAgg.c does and skip NULL inputs if the aggregate has a strict inverse transition function. That fact that we never actually *call* the inverse doesn't matter. Will fix that once we decided on the various strictness issues. > I'm not sure if there is a better way to do it though. If we disallow > this case, these aggregates would have to use non-strict functions for > both the forward and inverse transition functions, which means they > would have to implement their own non-NULL value counting. > Alternatively, allowing strict transition functions for these > aggregates would require that we provide some other way to initialise > the state from the first non-NULL input, such as a new initfn. I'm not sure an initfn would really help. It would allow us to return to the initial requirement that the strict settings match - but you already deem the check that the forward transition function can only be strict if the inverse is also overly harsh, so would that really be an improvement? It's also cost us an additional pg_proc entry per aggregate. Another idea would be to have an explicit nulls=ignore|process option for CREATE AGGREGATE. If nulls=process and either of the transition functions are strict, we could either error out, or simply do what normal functions calls do and pretend they return NULL for NULL inputs. Not sure how the rule that forward transition functions may not return NULL if there's an inverse transition function would fit in if we do the latter, though. The question is - is it worth it the effort to add that flag? best regards, Florian Pflug
pgsql-hackers by date: