Thread: About aggregates...

About aggregates...

From
Michael Giannakopoulos
Date:
Hello guys,

I would like to ask if there is any way to make an aggregate function take a set of tuples as its input. I know that an aggregate function receives one tuple at a time and processes it on the fly. However, I want to store tuples incrementally so that I can process them as a batch in the aggregate's final function. Think, for example, of implementing logistic regression (which is an OLAP query by its nature). I want to support it with the features PostgreSQL currently provides, of which the closest is an aggregate. However, an aggregate function feeds me one tuple per call, and I would like to have access to a batch of tuples per function call. Is there any way to do something like this?

Thank you very much for your time,
Michael 

Re: About aggregates...

From
Ondrej Ivanič
Date:
Hi,

On 30 November 2012 08:06, Michael Giannakopoulos <miccagiann@gmail.com> wrote:
> However an aggregate function
> feeds me one tuple per call, but I would like to have access to a
> batch of tuples per function call. Is there any possible way to perform
> something like this?

Yes, this might work for you:

WINDOW
WINDOW indicates that the function is a window function rather than a
plain function. This is currently only useful for functions written in
C. The WINDOW attribute cannot be changed when replacing an existing
function definition.
http://www.postgresql.org/docs/9.1/static/sql-createfunction.html

Apart from C, you can also write window functions in PL/R:
http://www.joeconway.com/plr/doc/plr-window-funcs.html


--
Ondrej


Re: About aggregates...

From
"David Johnston"
Date:
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Michael
Giannakopoulos
Sent: Thursday, November 29, 2012 4:07 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] About aggregates...


=====================================

Not sure how the system would decide between (1-at-a-time) and
(everything-at-once).

The only approach I can think of would be to build out an array of "tuples"
and then have the aggregate process a single array value each time.
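
To make the array idea concrete, here is a minimal sketch in plain Python that only models the calling pattern; it is not PostgreSQL code. The names `batches`, `transition`, `final` and `aggregate`, the `(feature, label)` row shape, and the running-mean computation are all assumptions chosen for illustration. The point is that the transition step receives one array (a batch of rows) per call instead of a single row.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Sequence, Tuple

# Hypothetical row shape for illustration: (feature, label).
Row = Tuple[float, int]
State = Tuple[int, float]  # (row count, sum of features)


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Group a stream of rows into lists ("arrays") of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def transition(state: State, batch: Sequence[Row]) -> State:
    """Aggregate step that sees a whole batch per call, not a single row."""
    count, total = state
    return count + len(batch), total + sum(x for x, _ in batch)


def final(state: State) -> Optional[float]:
    """Final step: turn the accumulated state into the result (a mean here)."""
    count, total = state
    return total / count if count else None


def aggregate(rows: Iterable[Row], size: int) -> Optional[float]:
    """Drive the model: one transition call per batch, then the final step."""
    state: State = (0, 0.0)
    for batch in batches(rows, size):
        state = transition(state, batch)
    return final(state)
```

In this model the batch size is fixed by whoever builds the arrays, which is one possible answer to the question of how to choose between one-at-a-time and everything-at-once processing.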

As Ondrej suggests in a parallel reply, you can also try window functions
(probably with a frame definition).

Hopefully this helps but I am not familiar enough with the use-case to be
more specific.

David J.




Re: About aggregates...

From
"Albe Laurenz"
Date:
Michael Giannakopoulos wrote:
> I would like to ask if there is any way to make an aggregate function
> take a set of tuples as an input variable. I know that an actual
> aggregate function receives each tuple one at a time and processes it
> on the fly. However I want to store tuples in an incremental fashion so
> as to process them in a batch approach in the finalaggr function. Think
> for example implementing logistic regression (which is an OLAP query by
> its nature). I want to support it with the current features that
> PostgreSQL provides from which the closest feature is an aggregate.
> However an aggregate function feeds me one tuple for each call, but I
> would like to have access to a batch of tuples per function call. Is
> there any possible way to perform something like this?

If you write in C, nothing keeps you from storing all the incoming
rows in memory allocated in a suitable MemoryContext and processing
them all at the end.

You might run out of memory though.
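
The accumulate-then-process pattern can be sketched as follows. This is a plain Python model, not the C implementation described above: a Python list stands in for memory allocated in a MemoryContext, and the names `accumulate` and `fit_logistic`, the single-feature row shape, and the gradient-descent settings are assumptions made for illustration. The step function only stores each row; all the real work happens once, in the final function.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical row shape for illustration: (feature, label in {0, 1}).
Row = Tuple[float, int]


def accumulate(state: Optional[List[Row]], row: Row) -> List[Row]:
    """Per-row step: do no real work, just keep the row in the state."""
    if state is None:
        state = []
    state.append(row)  # memory grows with every input row
    return state


def fit_logistic(state: Optional[List[Row]],
                 steps: int = 2000,
                 lr: float = 0.5) -> Tuple[float, float]:
    """Final step: fit p(y=1|x) = sigmoid(w*x + b) on all stored rows."""
    w, b = 0.0, 0.0
    if not state:
        return w, b
    n = len(state)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in state:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        # Plain batch gradient descent on the mean log-loss.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b
```

Because the state holds every row until the final step runs, its size is proportional to the number of rows in the group, which is where the memory concern comes from.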

Yours,
Laurenz Albe