Home > mailing lists
Re: [HACKERS] PoC: Grouped base relation - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: [HACKERS] PoC: Grouped base relation
Date	January 18, 2017 03:11:55
Msg-id	cb2065d8-ae20-4fc9-f0bb-9aab47725d84@2ndquadrant.com Whole thread Raw
In response to	Re: [HACKERS] PoC: Grouped base relation (Antonin Houska <ah@cybertec.at>)
Responses	Re: [HACKERS] PoC: Grouped base relation
List	pgsql-hackers
Tree view
On 01/17/2017 08:05 PM, Antonin Houska wrote:
> [ Trying to respond to both Tomas and David. I'll check tomorrow if anything
> else of the thread needs my comment. ]
>
> Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
>> On 01/17/2017 12:42 AM, David Rowley wrote:
>>> On 10 January 2017 at 06:56, Antonin Houska <ah@cybertec.at> wrote:
>
>>> I've been thinking about this aggtransmultifn and I'm not sure if it's
>>> really needed. Adding a whole series of new transition functions is
>>> quite a pain. At least I think so, and I have a feeling Robert might
>>> agree with me.
>>>
>>> Let's imagine some worst case (and somewhat silly) aggregate query:
>>>
>>> SELECT count(*)
>>> FROM million_row_table
>>> CROSS JOIN another_million_row_table;
>>>
>>> Today that's going to cause 1 TRILLION transitions! Performance will
>>> be terrible.
>>>
>>> If we pushed the aggregate down into one of those tables and performed
>>> a Partial Aggregate on that, then a Finalize Aggregate on that single
>>> row result (after the join), then that's 1 million transfn calls, and
>>> 1 million combinefn calls, one for each row produced by the join.
>>>
>>> If we did it your way (providing I understand your proposal correctly)
>>> there's 1 million transfn calls on one relation, then 1 million on the
>>> other and then 1 multiplyfn call. which does 1000000 * 1000000
>>>
>>> What did we save vs. using the existing aggcombinefn infrastructure
>>> which went into 9.6? Using this actually costs us 1 extra function
>>> call, right? I'd imagine the size of the patch to use aggcombinefn
>>> instead would be a fraction of the size of the one which included all
>>> the new aggmultiplyfns and pg_aggregate.h changes.
>>>
>
>> I think the patch relies on the assumption that the grouping reduces
>> cardinality,
>
> Yes.
>
>> so a CROSS JOIN without a GROUP BY clause may not be the best
>> counterexample.
>
> Yet it tells me that my approach is not ideal in some cases ...
>
>>> It sounds like a much more manageable project by using aggcombinefn
>>> instead. Then maybe one day when we can detect if a join did not cause
>>> any result duplication (i.e Unique Joins), we could finalise the
>>> aggregates on the first call, and completely skip the combine state
>>> altogether.
>>>
>
>> I don't quite see how the patch could use aggcombinefn without sacrificing a
>> lot of the benefits. Sure, it's possible to run the aggcombinefn in a loop
>> (with number of iterations determined by the group size on the other side of
>> the join), but that sounds pretty expensive and eliminates the reduction of
>> transition function calls. The join cardinality would still be reduced,
>> though.
>
> That's what I think. The generic case is that neither side of the join is
> unique. If it appears that both relations should be aggregated below the join,
> aggcombinefn would have to be called multiple times on each output row of the
> join to achieve the same result as the calling aggmultiplyfn.
>
>> I do have other question about the patch, however. It seems to rely on the
>> fact that the grouping and joins both reference the same columns. I wonder how
>> uncommon such queries are.
>>
>> To give a reasonable example, imagine the typical start schema, which is
>> pretty standard for large analytical databases. A dimension table is
>> "products" and the fact table is "sales", and the schema might look like this:
>>
>> CREATE TABLE products (
>>     id            SERIAL PRIMARY KEY,
>>     name          TEXT,
>>     category_id   INT,
>>     producer_id   INT
>> );
>>
>> CREATE TABLE sales (
>>     product_id    REFERENCES products (id),
>>     nitems        INT,
>>     price         NUMERIC
>> );
>>
>> A typical query then looks like this:
>>
>>     SELECT category_id, SUM(nitems), SUM(price)
>>     FROM products p JOIN sales s ON (p.id = s.product_id)
>>     GROUP BY p.category_id;
>>
>> which obviously uses different columns for the grouping and join, and so the
>> patch won't help with that. Of course, a query grouping by product_id would
>> allow the patch to work
>
> Right, the current version does not handle this. Thanks for suggestion.>

So you're saying it's merely a limitation of the initial patch version 
and not an inherent limitation?

>
>> Another thing is that in my experience most queries do joins on foreign keys
>> (so the PK side is unique by definition), so the benefit on practical examples
>> is likely much smaller.
>
> ok. So in some cases the David's approach might be better.
>

In which cases would David's approach be more efficient? But even if 
there are such cases, I assume we could generate both paths and decide 
based on cost, just like with all other alternative paths.
>
> However I think the ability to join 2 grouped (originally non-unique)
> relations is still important. Consider a query involving "sales" as well as
> another table which also has many-to-one relationship to "products".
>

Well, can you give a practical example? What you describe seems like a 
combination of two fact tables + a dimension, something like this:

CREATE TABLE products (    id            SERIAL PRIMARY KEY,    name          TEXT,    category_id   INT,
producer_id  INT
 
);

CREATE TABLE sales (    product_id    REFERENCES products (id),    nitems        INT,    price         NUMERIC
);

CREATE TABLE reviews (    product_id    REFERENCES products (id),    stars         INT
);

But how exactly do you join that together? Because
    SELECT * FROM products p JOIN sales s ON (p.id = s.product_id)                             JOIN reviews r ON (p.id
=r.product_id)
 

is clearly wrong - it's essentially M:N join between the two fact 
tables, increasing the number of rows.

It'd helpful to have an example of a practical query optimized by the 
patch. I'm not claiming it does not exist, but I've been unable to come 
up with something reasonable at the moment.

>> But I guess my main question is if there are actual examples of queries the
>> patch is trying to improve, or whether the general benefit is allowing
>> parallel plans for queries where it would not be possible otherwise.
>
> In fact I did all this with postgres_fdw in mind.
>

I assume there's not much difference between pushing down aggregates to 
local workers and to remote nodes. There'll be costing differences, but 
are there any other differences?
>
> From this perspective, David's approach can be slightly more efficient if all
> the tables are local, but aggregation of multiple base relations below the
> join can save a lot of effort if the tables are remote (as it reduces the
> amount of data transferred over network).
>
> I'm not terribly happy about changing the system catalog, but adding something
> like pg_aggregate(aggtransmultifn) is currently the best idea I have.
>

I personally don't see that as a major problem, my impression it can be 
mostly copied from the partial aggregate patch - it's not trivial, but 
manageable. Propagating it to the optimizer will be less trivial, but 
well, if it's necessary ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date:
From: Tom Lane
Date: 18 January 2017, 01:53:36
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Generate fmgr prototypes automatically
From: Haribabu Kommi
Date: 18 January 2017, 03:55:47
Subject: Re: [HACKERS] Parallel Index Scans
Re: [HACKERS] PoC: Grouped base relation - Mailing list pgsql-hackers

Previous

Next