Re: TODO items for window functions - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: TODO items for window functions |
Date | |
Msg-id | 9399.1230572147@sss.pgh.pa.us Whole thread Raw |
In response to | TODO items for window functions (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: TODO items for window functions
|
List | pgsql-hackers |
I wrote: > * Investigate whether we should prohibit window functions in recursive > terms; check whether any of the committed prohibitions are unnecessary. I looked into these questions a bit. As for the first, there doesn't appear to be a compelling implementation reason to forbid it, and I can't find anything in the spec that says to disallow it. SQL:2008's prohibition on aggregates in recursive terms is in 7.13 <query expression> syntax rule 2) g) iii) 4), and there's nothing about window functions in the vicinity. The primary reason to forbid aggregates, so far as I can divine the intent of the SQL committee, is that incremental evaluation of an aggregate would give implementation-dependent results, ie you'd get a different aggregate result depending on how many and which rows the implementation chose to push through the recursion at a time. It seems like the same charge could be leveled against window functions. On the other hand it's at least possible to construct recursive queries in which all the rows of a given window partition should get pushed through together, so that you should get consistent answers despite the overall incremental evaluation. So I can't tell for sure if the committee thought about that and intentionally decided to allow window functions in recursive terms, or if their failure to forbid it was an oversight. (My confidence in the unerring accuracy of the spec is not high at the moment ;-).) Nonetheless, the spec does not forbid it, so I feel we shouldn't either. As for the second point, I looked at every place that the committed patch throws an error for queries or expressions containing window functions. Most are demonstrably per spec, or are necessary implementation restrictions arising from the fact that we know an expression isn't going to get fed through the full planner/executor machinery (an example of the latter is ALTER COLUMN TYPE USING). The only case that I think is debatable is that we are throwing error for window functions used in a SELECT FOR UPDATE/FOR SHARE query. The corresponding error for aggregate functions is necessary because the executor top level doesn't "see" the individual rows that went into the aggregate, so there's no way to lock them. In the case of window functions no aggregation occurs, and so in principle we could lock the rows. However, consider something like this: select x, lead(x) over() from table for update limit 1; Because of the LIMIT, we'd only lock the first-returned row ... but the values returned would also depend on the second row of the table, which wouldn't get locked. In general the results could depend on any or all rows of the table but we might lock only some. This seems to me to be at variance with how you'd expect SELECT FOR UPDATE to behave, so I'm inclined to leave the prohibition in there --- at least until someone comes up with a convincing use-case for SELECT FOR UPDATE together with a window function, and explains why he doesn't care about relevant rows possibly not getting locked. Comments? regards, tom lane
pgsql-hackers by date: