Re: Window functions patch v04 for the September commit fest - Mailing list pgsql-hackers
From | David Fetter |
---|---|
Subject | Re: Window functions patch v04 for the September commit fest |
Date | |
Msg-id | 20080901203903.GP3717@fetter.org Whole thread Raw |
In response to | Re: Window functions patch v04 for the September commit fest (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Window functions patch v04 for the September commit
fest
|
List | pgsql-hackers |
On Mon, Sep 01, 2008 at 09:00:47PM +0300, Heikki Linnakangas wrote: > Hitoshi Harada wrote: >> 2008/8/30 Hitoshi Harada <umi.tanuki@gmail.com>: >>> Here's the latest window functions patch against HEAD. It seems to be >>> ready for the September commit fest, as added documents, WINDOW clause >>> feature and misc tests. I guess this would be the window functions >>> feature freeze for 8.4. The remaining feature will be implemented for >>> the later release. >>> >>> This patch consists of: >>> - windowed aggregates >>> - cooperate with GROUP BY aggregates >>> - some optimizations with multiple windows >>> - ranking functions >>> - WINDOW clause >>> - windowing SQL regression tests >>> - sgml documents >>> >>> Looking up the total road map, the dropped features are: >>> >>> - sliding window (window framing) >>> - lead(), lag(), etc. that reach for random rows >>> - user defined window functions > > Ok, I'm starting to read up on SQL2003 window functions, Maybe it would be better to read the SQL2008 standard <http://wiscorp.com/sql200n.zip> :) > and to review this patch. Here's a long brain-dump of my first > thoughts on the design. > > Let's list a couple of requirements first: > > 1. It's important that what gets committed now can be extended to > handle all of the window function stuff in SQL2003 in the future, > as well as user-defined-window-functions in the spirit of > PostgreSQL extensibility. Even if we don't implement all of it in > this release. > 2. Performance needs to be reasonable, in the face a large number of > input rows. These are typically used in OLAP queries that process > gazillions of rows. Some window functions need access to all rows in the > window frame or partition, so you can't reasonably expect great > performance with those if the working set is large, but those that > don't, should work with finite memory requirements, and reasonably fast. > > Because of 1., let's focus on the interface for > user-defined-window-functions first. Ideally, the interface is such > that all the window functions and aggregates defined in SQL2003, > and others we might want to have in core or as pgfoundry projects, > can be implemented using it. SQL2008 defines the following: <window function> ::= <window function type> OVER <window name or specification> <window function type> ::= <rank function type> <left paren> <right paren> | ROW_NUMBER <left paren> <right paren> | <aggregate function> | <ntile function> | <lead or lag function> | <first or last value function> | <nthvalue function> <rank function type> ::= RANK | DENSE_RANK | PERCENT_RANK | CUME_DIST <ntile function> ::= NTILE <left paren> <number of tiles> <right paren> <number of tiles> ::= <simple value specification> | <dynamic parameter specification> <lead or lag function> ::= <lead or lag> <left paren> <lead or lag extent> [ <comma> <offset> [ <comma> <defaultexpression> ] ] <right paren> [ <null treatment> ] <lead or lag> ::= LEAD | LAG <lead or lag extent> ::= <value expression> <offset> ::= <exact numeric literal> <default expression> ::= <value expression> <null treatment> ::= RESPECT NULLS | IGNORE NULLS <first or last value function> ::= <first or last value> <left paren> <value expression> <right paren> [ <null treatment>] <first or last value> ::= FIRST_VALUE | LAST_VALUE <nth value function> ::= NTH_VALUE <left paren> <value expression> <comma> <nth row> <right paren> [ <from firstor last> ] [ <null treatment> ] <nth row> ::= <simple value specification> | <dynamic parameter specification> <from first or last> ::= FROM FIRST | FROM LAST <window name or specification> ::= <window name> | <in-line window specification> <in-line window specification> ::= <window specification> > For functions like LEAD or CUME_DIST that don't immediately start > returning values, the executor will need a tuplestore to buffer input > rows until it gets their values from the window function. For these aggregates, should there be an API that signals that a tuplestore will be needed? > The syntax for creating a ranking aggregate might look something like this: > > CREATE RANKING AGGREGATE name ( > SFUNC = sfunc, > STYPE = state_data_type, > OUTFUNC = outfunc > ) > > This is very similar to Simon's proposal, and to current CREATE > AGGREGATE, but tweaked a bit to fix the problems Hitoshi pointed > out. > > sfunc is called once for each input row. Unlike with normal > aggregates, sfunc is passed the whole input row, so that e.g RANK > can compare it against the previous row, or LEAD can buffer it. > > outfunc is a set-returning function, and it is called until it > returns no more rows, after each sfunc invocation. > Window aggregates > ----------------- > > Like normal aggregates, window aggregates process N input values, and > return one output value. In normal GROUP BY expressions, the aggregate > function is called once each group, but in a window aggregate expression > with a window frame (aka sliding window), there is as many "groups" as > there is rows. > > In fact, any normal aggregate function can also be used as a window > aggregate and vice versa. The trivial implementation is to have a buffer > to hold all values currently in the window frame, and for each input > row, invoke the aggregate function over all the rows currently in the > frame. I think we'll need support that, to allow using any built-in or > user-defined aggregate as a window aggregate. > > However, we also want to provide optimized versions of the common > aggregates. Aggregates like COUNT, SUM, and AVG can easily be > implemented more efficiently, without rescanning all the tuples in the > group. > > I propose an extension to the current CREATE AGGREGATE syntax to allow > for optimized versions. Currently, the syntax is like: > > CREATE AGGREGATE name ( input_data_type [ , ... ] ) ( > SFUNC = sfunc, > STYPE = state_data_type > [ , FINALFUNC = ffunc ] > [ , INITCOND = initial_condition ] > [ , SORTOP = sort_operator ] > ) > > Let's add a function to allow removing tuples from the current window frame: > > CREATE AGGREGATE name ( input_data_type [ , ... ] ) ( > SFUNC = sfunc, > STYPE = state_data_type > [ , FINALFUNC = ffunc ] > [ , INITCOND = initial_condition ] > [ , SORTOP = sort_operator ] > [ , SUBTRACTFUNC = subfunc, > ) > > subfunc is like sfunc, except that the input value is "subtracted" from > the current value. On each input row, the executor calls sfunc for all > new tuples that enter the window frame, and subfunc for all tuples that > exit it. Another difference is that finalfunc can be called many times > during the lifecycle of an aggregate invocation. For example, the > subfunc for COUNT would decrement the row count by one, and for SUM, > subfunc would subtract the removed value from the current sum. Perhaps > we should also support an "opportunistic subfunc", that could either > perform the subtraction, or return a special value indicating that it > can't be done, and we need to calculate the aggregate from scratch as if > subfunc wasn't defined. That would be useful for MIN/MAX, which can be > implemented by just keeping track of the current MIN/MAX value, and > "subtracting" any other value than the current MIN/MAX is a no-op, but > "subtracting" the current MIN/MAX value requires scanning all the rest > of the tuples from scratch. > > PS. Have you looked at the "hypothetical set functions" in SQL2003? They're kinda neat :) > PPS. I was just about to send this, when I noticed that LEAD/LAG are in > fact not in the SQL2003 draft I have at hand. In fact, none of the > window functions defined there, RANK, DENSE_RANK, PERCENT_RANK, > CUME_DIST and ROW_NUMBER, require random access to the tuples. Still > seems like a good idea to be have the flexibility for them. They're in SQL:2008. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
pgsql-hackers by date: