Re: percentile value check can be slow - Mailing list pgsql-hackers
From | David Fetter |
---|---|
Subject | Re: percentile value check can be slow |
Date | |
Msg-id | 20171119021009.GJ4411@fetter.org Whole thread Raw |
In response to | Re: percentile value check can be slow (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: percentile value check can be slow
|
List | pgsql-hackers |
On Sat, Nov 18, 2017 at 11:05:47PM +0100, Tomas Vondra wrote: > Hi, > > On 11/18/2017 10:30 PM, David Fetter wrote: > > On Sat, Nov 18, 2017 at 10:44:36AM -0500, Tom Lane wrote: > >> jotpe <jotpe@posteo.de> writes: > >>> I tried to enter invalid percentile fractions, and was astonished > >>> that it seems to be checked after many work is done? > >> > >> IIRC, only the aggregate's final-function is concerned with direct > >> arguments, so it's not all that astonishing. > > > > It may not be surprising from the point of view of a systems > > programmer, but it's pretty surprising that this check is deferred to > > many seconds in when the system has all the information it need in > > order to establish this before execution begins. > > > > I'm not sure I see an easy way to do this check early, but it's worth > > trying on grounds of POLA violation. I have a couple of ideas on how > > to do this, one less invasive but hinky, the other a lot more invasive > > but better overall. > > > > Ugly Hack With Ugly Performance Consequences: > > Inject a subtransaction at the start of execution that supplies an > > empty input to the final function with the supplied direct > > arguments. > > I'm pretty sure you realize this is quite unlikely to get accepted. I should hope not, but I mentioned it because I'd like to get it on the record as rejected. > > Bigger Lift: > > Require a separate recognizer function direct arguments and fire > > it during post-parse analysis. Perhaps this could be called > > recognizer along with the corresponding mrecognizer. It's not > > clear that it's sane to have separate ones, but I thought I'd > > bring it up for completeness. > > > > Is 'recognizer' an established definition I should know? Is it the same > as 'validator' or is it something new/different? I borrowed it from http://langsec.org/ I'm not entirely sure what you mean by a validator, but a recognizer is something that gives a quick and sure read as to whether the input is well-formed. In general, it's along the lines of a tokenizer, a parser, and something that does very light post-parse analysis for correctness of form. For the case that started the thread, a recognizer would check something along the lines of CHECK('[0,1]' @> ALL(input_array)) > > Way Bigger Lift, As Far As I Can Tell, But More Fun For Users: > > Allow optional CHECK constraints in CREATE AGGREGATE for direct > > arguments. > > > > How will any of the approaches deal with something like > > select percentile_cont((select array_agg(v) from p)) > within group (order by a) from t; > > In this case the the values are unknown after the parse analysis, so I > guess it does not really address that. It doesn't. Does it make sense to do a one-shot execution for cases like that? It might well be worth it to do the aggregate once in advance as a throw-away if the query execution time is already going to take awhile. Of course, you can break that one by making p a JOIN to yet another thing... > FWIW while I won't stand in the way of improving this, I wonder if this > is really worth the additional complexity. If you get errors like this > with a static list of values, you will fix the list and you're done. If > the list is dynamic (computed in the query itself), you'll still get the > error much later during query execution. > > So if you're getting many failures like this for the "delayed error > reporting" to be an issue, perhaps there's something wrong in you stack > and you should address that instead? I'd like to think that getting something to fail quickly and cheaply when it can will give our end users a better experience. Here, "cheaply" refers to their computing resources and time. Clearly, not having this happen in this case bothered Johannes enough to wade in here. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
pgsql-hackers by date: