Re: [HACKERS] PoC: Grouped base relation - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [HACKERS] PoC: Grouped base relation |
Date | |
Msg-id | cb2065d8-ae20-4fc9-f0bb-9aab47725d84@2ndquadrant.com Whole thread Raw |
In response to | Re: [HACKERS] PoC: Grouped base relation (Antonin Houska <ah@cybertec.at>) |
Responses |
Re: [HACKERS] PoC: Grouped base relation
|
List | pgsql-hackers |
On 01/17/2017 08:05 PM, Antonin Houska wrote: > [ Trying to respond to both Tomas and David. I'll check tomorrow if anything > else of the thread needs my comment. ] > > Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > >> On 01/17/2017 12:42 AM, David Rowley wrote: >>> On 10 January 2017 at 06:56, Antonin Houska <ah@cybertec.at> wrote: > >>> I've been thinking about this aggtransmultifn and I'm not sure if it's >>> really needed. Adding a whole series of new transition functions is >>> quite a pain. At least I think so, and I have a feeling Robert might >>> agree with me. >>> >>> Let's imagine some worst case (and somewhat silly) aggregate query: >>> >>> SELECT count(*) >>> FROM million_row_table >>> CROSS JOIN another_million_row_table; >>> >>> Today that's going to cause 1 TRILLION transitions! Performance will >>> be terrible. >>> >>> If we pushed the aggregate down into one of those tables and performed >>> a Partial Aggregate on that, then a Finalize Aggregate on that single >>> row result (after the join), then that's 1 million transfn calls, and >>> 1 million combinefn calls, one for each row produced by the join. >>> >>> If we did it your way (providing I understand your proposal correctly) >>> there's 1 million transfn calls on one relation, then 1 million on the >>> other and then 1 multiplyfn call. which does 1000000 * 1000000 >>> >>> What did we save vs. using the existing aggcombinefn infrastructure >>> which went into 9.6? Using this actually costs us 1 extra function >>> call, right? I'd imagine the size of the patch to use aggcombinefn >>> instead would be a fraction of the size of the one which included all >>> the new aggmultiplyfns and pg_aggregate.h changes. >>> > >> I think the patch relies on the assumption that the grouping reduces >> cardinality, > > Yes. > >> so a CROSS JOIN without a GROUP BY clause may not be the best >> counterexample. > > Yet it tells me that my approach is not ideal in some cases ... > >>> It sounds like a much more manageable project by using aggcombinefn >>> instead. Then maybe one day when we can detect if a join did not cause >>> any result duplication (i.e Unique Joins), we could finalise the >>> aggregates on the first call, and completely skip the combine state >>> altogether. >>> > >> I don't quite see how the patch could use aggcombinefn without sacrificing a >> lot of the benefits. Sure, it's possible to run the aggcombinefn in a loop >> (with number of iterations determined by the group size on the other side of >> the join), but that sounds pretty expensive and eliminates the reduction of >> transition function calls. The join cardinality would still be reduced, >> though. > > That's what I think. The generic case is that neither side of the join is > unique. If it appears that both relations should be aggregated below the join, > aggcombinefn would have to be called multiple times on each output row of the > join to achieve the same result as the calling aggmultiplyfn. > >> I do have other question about the patch, however. It seems to rely on the >> fact that the grouping and joins both reference the same columns. I wonder how >> uncommon such queries are. >> >> To give a reasonable example, imagine the typical start schema, which is >> pretty standard for large analytical databases. A dimension table is >> "products" and the fact table is "sales", and the schema might look like this: >> >> CREATE TABLE products ( >> id SERIAL PRIMARY KEY, >> name TEXT, >> category_id INT, >> producer_id INT >> ); >> >> CREATE TABLE sales ( >> product_id REFERENCES products (id), >> nitems INT, >> price NUMERIC >> ); >> >> A typical query then looks like this: >> >> SELECT category_id, SUM(nitems), SUM(price) >> FROM products p JOIN sales s ON (p.id = s.product_id) >> GROUP BY p.category_id; >> >> which obviously uses different columns for the grouping and join, and so the >> patch won't help with that. Of course, a query grouping by product_id would >> allow the patch to work > > Right, the current version does not handle this. Thanks for suggestion.> So you're saying it's merely a limitation of the initial patch version and not an inherent limitation? > >> Another thing is that in my experience most queries do joins on foreign keys >> (so the PK side is unique by definition), so the benefit on practical examples >> is likely much smaller. > > ok. So in some cases the David's approach might be better. > In which cases would David's approach be more efficient? But even if there are such cases, I assume we could generate both paths and decide based on cost, just like with all other alternative paths. > > However I think the ability to join 2 grouped (originally non-unique) > relations is still important. Consider a query involving "sales" as well as > another table which also has many-to-one relationship to "products". > Well, can you give a practical example? What you describe seems like a combination of two fact tables + a dimension, something like this: CREATE TABLE products ( id SERIAL PRIMARY KEY, name TEXT, category_id INT, producer_id INT ); CREATE TABLE sales ( product_id REFERENCES products (id), nitems INT, price NUMERIC ); CREATE TABLE reviews ( product_id REFERENCES products (id), stars INT ); But how exactly do you join that together? Because SELECT * FROM products p JOIN sales s ON (p.id = s.product_id) JOIN reviews r ON (p.id =r.product_id) is clearly wrong - it's essentially M:N join between the two fact tables, increasing the number of rows. It'd helpful to have an example of a practical query optimized by the patch. I'm not claiming it does not exist, but I've been unable to come up with something reasonable at the moment. >> But I guess my main question is if there are actual examples of queries the >> patch is trying to improve, or whether the general benefit is allowing >> parallel plans for queries where it would not be possible otherwise. > > In fact I did all this with postgres_fdw in mind. > I assume there's not much difference between pushing down aggregates to local workers and to remote nodes. There'll be costing differences, but are there any other differences? > > From this perspective, David's approach can be slightly more efficient if all > the tables are local, but aggregation of multiple base relations below the > join can save a lot of effort if the tables are remote (as it reduces the > amount of data transferred over network). > > I'm not terribly happy about changing the system catalog, but adding something > like pg_aggregate(aggtransmultifn) is currently the best idea I have. > I personally don't see that as a major problem, my impression it can be mostly copied from the partial aggregate patch - it's not trivial, but manageable. Propagating it to the optimizer will be less trivial, but well, if it's necessary ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: