Re: Combining Aggregates - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Combining Aggregates |
Date | |
Msg-id | 569DC800.5010805@2ndquadrant.com Whole thread Raw |
In response to | Re: Combining Aggregates (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Combining Aggregates
|
List | pgsql-hackers |
On 01/19/2016 05:14 AM, Robert Haas wrote: > On Mon, Jan 18, 2016 at 12:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Jan 18, 2016 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> Yeah. The API contract for an expanded object took me quite a while >>>> to puzzle out, but it seems to be this: if somebody hands you an R/W >>>> pointer to an expanded object, you're entitled to assume that you can >>>> "take over" that object and mutate it however you like. But the >>>> object might be in some other memory context, so you have to move it >>>> into your own memory context. >>> >>> Only if you intend to keep it --- for example, a function that is >>> mutating and returning an object isn't required to move it >>> somewhere else, if the input is R/W, and I think it generally >>> shouldn't. >>> >>> In the context here, I'd think it is the responsibility of >>> nodeAgg.c not individual datatype functions to make sure that >>> expanded objects live where it wants them to. >> >> That's how I did it in my prototype, but the problem with that is that >> spinning up a memory context for every group sucks when there are many >> groups with only a small number of elements each - hence the 50% >> regression that David Rowley observed. If we're going to use expanded >> objects here, which seems like a good idea in principle, that's going >> to have to be fixed somehow. We're flogging the heck out of malloc by >> repeatedly creating a context, doing one or two allocations in it, and >> then destroying the context. >> >> I think that, in general, the memory context machinery wasn't really >> designed to manage lots of small contexts. The overhead of spinning >> up a new context for just a few allocations is substantial. That >> matters in some other situations too, I think - I've commonly seen >> AllocSetContextCreate taking several percent of runtime in profiles. >> But here it's much exacerbated. I'm not sure whether it's better to >> attack that problem at the root and try to make AllocSetContextCreate >> cheaper, or whether we should try to figure out some change to the >> expanded-object machinery to address the issue. > > Here is a patch that helps a good deal. I changed things so that when > we create a context, we always allocate at least 1kB. If that's more > than we need for the node itself and the name, then we use the rest of > the space as a sort of keeper block. I think there's more that can be > done to improve this, but I'm wimping out for now because it's late > here. > > I suspect something like this is a good idea even if we don't end up > using it for aggregate transition functions. I dare to claim that if expanded objects require a separate memory context per aggregate state (I don't see why would be the case), it's a dead end. Not so long ago we've fixed exactly this issue in array_agg(), which addressed a bunch of reported OOM issues and improved array_agg() performance quite a bit. It'd be rather crazy introducing the same problem to all aggregate functions. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: