Home > mailing lists

Re: Combining Aggregates - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Combining Aggregates
Date	March 17, 2016 01:25:17
Msg-id	00ab52f0-5cd9-a667-5c3d-b0917aef7677@2ndquadrant.com Whole thread Raw
In response to	Re: Combining Aggregates (David Rowley <david.rowley@2ndquadrant.com>)
Responses	Re: Combining Aggregates
List	pgsql-hackers

Tree view

Hi,

On 03/16/2016 12:03 PM, David Rowley wrote:
> On 16 March 2016 at 10:34, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> If Haribabu's patch does all that's required for the numerical
>> aggregates for floating point types then the status of covered
>> aggregates is (in order of pg_aggregate.h):
>>
>> * AVG() complete coverage
>> * SUM() complete coverage
>> * MAX() complete coverage
>> * MIN() complete coverage
>> * COUNT() complete coverage
>> * STDDEV + friends complete coverage
>> * regr_*,covar_pop,covar_samp,corr not touched these.
>> * bool*() complete coverage
>> * bitwise aggs. complete coverage
>> * Remaining are not touched. I see diminishing returns with making
>> these parallel for now. I think I might not be worth pushing myself
>> any harder to make these ones work.
>>
>> Does what I have done + floating point aggs from Haribabu seem
>> reasonable for 9.6?
>
> I've attached a series of patches.

Thanks. Haven't really done much testing of this, aside from reading 
through the patches and "it compiles".
>
> Patch 1:
> This is the parallel aggregate patch, not intended for review here.
> However, all further patches are based on this, and this adds the
> required planner changes to make it possible to test patches 2 and 3.
>
> Patch 2:
> This adds the serial/deserial aggregate infrastructure, pg_dump
> support, CREATE AGGREGATE changes, and nodeAgg.c changes to have it
> serialise and deserialise aggregate states when instructed to do so.
>
> Patch 3:
> This adds a boat load of serial/deserial functions, and combine
> functions for most of the built-in numerical aggregate functions. It
> also contains some regression tests which should really be in patch 2,
> but I with patch 2 there's no suitable serialisation or
> de-serialisation functions to test CREATE AGGREGATE with. I think
> having them here is ok, as patch 2 is quite useless without patch 3
> anyway.

I don't see how you could move the tests into #2 when the functions are 
defined in #3? IMHO this is the right place for the regression tests.

>
> Another thing to note about this patch is that I've gone and created
> serial/de-serial functions for when PolyNumAggState both require
> sumX2, and don't require sumX2. I had thought about perhaps putting an
> extra byte in the serial format to indicate if a sumX2 is included,
> but I ended up not doing it this way. I don't really want these serial
> formats getting too complex as we might like to do fun things like
> pass them along to sharded servers one day, so it might be nice to
> keep them simple.

Hmmm, I've noticed that while eyeballing the diffs, and I'm not sure if 
it's worth the additional complexity at this  point. I mean, one byte is 
hardly going to make a measurable difference - we're probably wasting 
more than that due to alignment, for example.

As you've mentioned sharded servers - how stable should the serialized 
format be? I've assumed it to be transient, i.e. in the extreme case it 
might change after restarting a database - for the parallel aggregate 
that's clearly sufficient.

But if we expect this to eventually work in a sharded environment, 
that's going to be way more complicated. I guess in these cases we could 
rely on implicit format versioning, via the major the version (doubt 
we'd tweak the format in a minor version anyway).

I'm not sure the sharding is particularly useful argument at this point, 
considering we don't really know if the current format is actually a 
good match for that.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-hackers by date:

From: Haribabu Kommi
Date: 17 March 2016, 01:13:16
Subject: Re: pg_hba_lookup function to get all matching pg_hba.conf entries

From: Tatsuo Ishii
Date: 17 March 2016, 01:38:37
Subject: Re: [BUGS] pgbench -C -M prepared gives an error

Re: Combining Aggregates - Mailing list pgsql-hackers

Previous

Next