Re: array_accum aggregate - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: array_accum aggregate |
Date | |
Msg-id | 20061006220520.GK24675@kenobi.snowman.net Whole thread Raw |
In response to | Re: array_accum aggregate (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: array_accum aggregate
|
List | pgsql-hackers |
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Long story short, I set out to build a faster array_accum. Much to my > > suprise and delight, we already *had* one. accumArrayResult() and > > makeArrayResult()/construct_md_array() appear to do a fantastic job. > > I've created a couple of 'glue' functions to expose these functions so > > they can be used in an aggregate. I'm sure they could be improved > > upon and possibly made even smaller than they already are (90 lines > > total for both) but I'd like to throw out the idea of including them > > in core. The aggregate created with them could also be considered for > > inclusion though I'm less concerned with that. > > Since you've set up the functions to only be usable inside an aggregate, > I don't see much of a reason why we wouldn't provide the aggregate too. Sure, I don't see a reason these functions would be useful outside of an aggregate in the form they need to be in for the aggregate, either.. > It looks like it should work to have just one polymorphic aggregate > definition, eg, array_accum(anyelement) returns anyarray. I was hoping to do that, but since it's an aggregate the ffunc format is pre-defined to require accepting the 'internal state' and nothing else, and to return 'anyelement' or 'anyarray' one of the inputs must be an 'anyelement' or 'anyarray', aiui. That also implied to me that the type passed in was expected to be the type passed out, which wouldn't necessairly be the case here as the internal state variable is a bytea. It might be possible to do away with the internal state variable entirely though and make it an anyelement instead, if we can find somewhere else to put the pointer to the ArrayBuildState. > As far as coding style goes, you're supposed to use makeArrayResult() > with accumArrayResult(), not call construct_md_array() directly. And > copying the ArrayBuildState around like that is just plain bizarre... I had attempted to use makeArrayResult() originally and ran into some trouble with the MemoryContextDelete() which is done during it. The context used was the AggState context and therefore was deleted elsewhere. That might have been a misunderstanding on my part though since I recall fixing at least one or two bugs afterwards, so it may be possible to go back and change to using makeArrayResult(). That'd certainly make the ffunc smaller. As for ArrayBuildState, I'm not actually copying the structure around, just the pointer is being copied around inside of the state transistion variable (which is a bytea). I'm not sure where else to store the pointer to the ArrayBuildState though, since multiple could be in play at a given time during an aggregation, aiui. > Does the thing work without memory leakage (for a pass-by-ref datatype) > in a GROUP BY situation? I expected that using the AggState context would handle free'ing the memory at the end of the aggregation as I believe that's the context used for the state variable itself as well. Honestly, I wasn't entirely sure how to create my own context or if that was the correct thing to do in this case. I'll see about changing it around to do that if it's acceptable to have a context created for each instance of a group-by aggregate, and I can figure out how. :) I'm not sure about memory leakage during a Sort+GroupAgg.. I don't know how that's done currently either, honestly. It seems like memory could be free'd during that once the ffunc is called, and an extra memory context could do that, is that what you're referring to? Thanks! Stephen
pgsql-hackers by date: