Re: Infinite Interval - Mailing list pgsql-hackers
From | Dean Rasheed |
---|---|
Subject | Re: Infinite Interval |
Date | |
Msg-id | CAEZATCVor1DP=YfhphuDo66Ncn4OA_aBN0vcXOqJP49jYMW+Kw@mail.gmail.com Whole thread Raw |
In response to | Re: Infinite Interval (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Responses |
Re: Infinite Interval
|
List | pgsql-hackers |
On Wed, 4 Oct 2023 at 14:29, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > I think we should introduce interval_out_of_range_error() function on > the lines of float_overflow_error(). Later patches introduce more > places where we raise that error. We can introduce the function as > part of those patches. > I'm not convinced that it is really worth it. Note also that even with this patch, there are still more places that throw "timestamp out of range" errors than "interval out of range" errors. > > 4. I tested pg_upgrade on a table with an interval with INT_MAX > > months, and it was silently converted to infinity. I think that's > > probably the best outcome (better than failing). However, this means > > that we really should require all 3 fields of an interval to be > > INT_MIN/MAX for it to be considered infinite, otherwise it would be > > possible to have multiple internal representations of infinity that do > > not compare as equal. > > > My first patch was comparing all the three fields to determine whether > a given Interval value represents infinity. [3] changed that to use > only the month field. I guess that was based on the discussion at [4]. > You may want to review that discussion if not already done. I am fine > either way. We should be able to change the comparison code later if > we see performance getting impacted. > Before looking at the details more closely, I might have agreed with that earlier discussion. However, given that things like pg_upgrade have the possibility of turning formerly allowed, finite intervals into infinity, we really need to ensure that there is only one value equal to infinity, otherwise the results are likely to be very confusing and counter-intuitive. That means that we have to continue to regard intervals like INT32_MAX months + 10 days as finite. While I haven't done any performance testing, I wouldn't expect this to have much impact. In a 64-bit build, this actually generates 2 comparisons rather than 3 -- one comparing the combined month and day fields against a 64-bit value containing 2 copies of INT32_MAX, and one testing the time field. In practice, only the first test will be executed in the vast majority of cases. Something that perhaps does need discussing is the fact that '2147483647 months 2147483647 days 9223372036854775807 usecs' is now accepted by interval_in() and gives infinity. That's a bit ugly, but I think it's defensible as a measure to prevent dump/restore errors from older databases, and in any case, such an interval is outside the documented range of supported intervals, and is a highly contrived example, vanishingly improbable in practice. Alternatively, we could have interval_in() reject this, which would open up the possibility of dump/restore errors. It could be argued that that's OK, for similar reasons -- the failing value is highly unlikely/contrived, and out of the documented range. I don't like that though. I don't think dump/restore should fail under any circumstances, however unlikely. Another alternative is to accept this input, but emit a WARNING. I don't particularly like that either, since it's forcing a check on every input value, just to cater for this one specific highly unlikely input. In fact, both these alternative approaches (rejecting the value, or emitting a warning), would impose a small performance penalty on every interval input, which I don't think is really worth it. So overall, my preference is to just accept it. Anything else is more work, for no practical benefit. Regards, Dean
pgsql-hackers by date: