Re: Eager aggregation, take 3 - Mailing list pgsql-hackers
From | Richard Guo |
---|---|
Subject | Re: Eager aggregation, take 3 |
Date | |
Msg-id | CAMbWs484dnecwXT2WzWFvzEmWPzC3U9F8SRDXg-SEegTYUFyXQ@mail.gmail.com Whole thread Raw |
In response to | Re: Eager aggregation, take 3 (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Sat, Sep 13, 2025 at 3:48 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Sep 12, 2025 at 5:34 AM Richard Guo <guofenglinux@gmail.com> wrote: > > I really like this idea. Currently, aggtransspace represents an > > estimate of the transition state size provided by the aggregate > > definition. If it's set to zero, a default estimate based on the > > state data type is used. Negative values currently have no defined > > meaning. I think it makes perfect sense to reuse this field so that > > a negative value indicates that the transition state data can grow > > unboundedly in size. > > > > Attached 0002 implements this idea. It requires fewer code changes > > than I expected. This is mainly because that our current code uses > > aggtransspace in such a way that if it's a positive value, that value > > is used as it's provided by the aggregate definition; otherwise, some > > heuristics are applied to estimate the size. For the aggregates that > > accumulate input rows (e.g., array_agg, string_agg), I don't currently > > have a better heuristic for estimating their size, so I've chosen to > > keep the current logic. This won't regress anything in estimating > > transition state data size. > This might be OK, but it's not what I was suggesting: I was suggesting > trying to do a calculation like space_used = -aggtransspace * > rowcount, not just using a <0 value as a sentinel. I've considered your suggestion, but I'm not sure I'll adopt it in the end. Here's why: 1) At the point where we check whether any aggregates might pose a risk of excessive memory usage during partial aggregation, row count information is not yet available. You could argue that we could reorganize the logic to perform this check after we've had the row count, but that seems quite tricky. If I understand correctly, the "rowcount" in this context actually means the number of rows within one partial group. That would require us to first decide on the grouping expressions for the partial aggregation, then compute the group row counts, then estimate space usage, and only then decide whether memory usage is excessive and fall back. This would come quite late in planning and adds nontrivial overhead, compared to the current approach which checks at the very beginning. 2) Even if we were able to estimate space usage based on the number of rows per partial group and determined that memory usage seems acceptable, we still couldn't guarantee that the transition state data won't grow excessively after further joins. Joins can multiply partial aggregates, potentially causing a blowup in memory usage even if the initial estimate seemed safe. 3) I don't think "-aggtransspace * rowcount" reflects the true memory footprint for aggregates that accumulate input rows. For example, what if we have an aggregate like string_agg(somecolumn, 'a very long delimiter')? 4) AFAICS, the main downside of the current approach compared to yours is that it avoids pushing down aggregates like string_agg() that accumulate input rows, whereas your suggestion might allow pushing them down in some cases where we *think* it wouldn't blow up memory. You might argue that the current implementation is over-conservative. But I prefer to start safe. That said, I appreciate you proposing the idea of reusing aggtransspace, although I ended up using it in a different way than you suggested. - Richard
pgsql-hackers by date: