Re: FWD: overlaps() bug? - Mailing list pgsql-hackers
From | PATTERSON,JEFF (A-Sonoma,ex1) |
---|---|
Subject | Re: FWD: overlaps() bug? |
Date | |
Msg-id | 08E9E30FCA2CD5119BEA0090274066DF01D31306@axcs16.cos.agilent.com Whole thread Raw |
List | pgsql-hackers |
Subject: Re: FWD: overlaps() bug? >(back on list; it is an interesting discussion imho) >> What I don't expect is for a built-in >> Boolean function to lie to me when used according to the published API! >> Violating a specification's underlying assumption is the same as violating >> the specification. One should either re-write the overlap function to >> properly handle time/timetz data points or eliminate the overlap function >> for the time data altogether. As it stands, it is broken and dangerous. >Sorry, I haven't yet made the leap from taking the spec literally (as I >think we have done) to somehow violating the spec's underlying >assumption. Clearly the spec puts TIME and TIME WITH TIME ZONE into the >same "datetime data type" category discussed in the OVERLAPS definition. I have to disagree. The datetime data points form a non-periodic, Euclidean space, extending forward and backward to what passes for forever. This is _not_ the case with time/timetz data points. They form a periodic, wrapped space which require different operators, much the same way that trigonometric functions differ from their Euclidean counterparts. >What "underlying assumption" are you referring to? ..the assumption of a Euclidean space. It is not specifically spelled out in the specification but the logic (swap inputs if end_point < start_point)is only valid for a non-wrapping space. As mentioned above, TIME and TIME WITH TIME ZONE data points are periodic and form a cylindrical, wrapped space. >I *know* that this >particular case seems to lead to non-intuitive behavior, You mean non-intuitive as in incorrect?? >and I've made >the argument before that we should violate a spec if it is sufficiently >damaged, IMHO the spec is not damaged. It just doesn't cover the type of data we are attempting to apply it to in this case. >but I'm not sure that we should make that leap here. I'm not >actually arguing against it, other than we should be inclined by default >to follow the spec. >Comments? > - Thomas Specs are a good thing and should be adhered to. We should not however blindly follow them off a cliff. If a function can not be implemented that both follows the spec and gives the right answer then IMHO the function should not be implemented. At least this way the user knows he/she has to implement thier own. The way it stands the result is the programmers worst enemy, the silent error. However, by my reading of the spec, it is silent on the correct implementation of overlap for TIME data and therefore we should be free to do the right thing. Jeff > > Note the third row in the query result below is in error. The four hour > > interval (2300UTC - 0300UTC) does not overlap the interval > 1530UTC-1627UTC). > > Is this a bug? > > No. It conforms to (my reading of) the SQL99 spec. So it is a feature, > even if I misread the spec. Which I think I didn't ;) But if I did, then > we can change the implementation of course. > > I've included the relevant part of the spec below. It seems clause (3) > requires that we reorder the arguments to OVERLAPS, though perhaps > someone would like to research whether TIME is allowed to be used with > OVERLAPS at all (if not, then we could make up the rules ourselves). > > > It would be cool if timetz (or time) datatypes were to wrap properly > > across day boundaries (i.e. if start time < stop time then assume start > time > > is day before) but at the very least, the overlaps functions should not > lie > > to you! > > Some parts of the spec aren't cool, or interfer with coolness. This may > be one of them. If everything conforms to the standard, then we can > start discussing whether that part of the standard is so brain-dead as > to be useless or likely to directly cause damage. > > But in your case, choosing to record only times but then expecting the > code to respect a day boundary seems to be an assumption which could > bite you in other ways later. What happens when an interval happens to > be longer than a day?? > > hth > > - Thomas > > (omit some text defining the input as "(D1, E1) OVERLAPS (D2, E2)" as > the input to the OVERLAPS operator) > > 3) If D1 is the null value or if E1 < D1, then let S1 = E1 and let > T1 = D1. Otherwise, let S1 = D1 and let T1 = E1. > 4) Case: > a) If the most specific type of the second field of <row value > expression 2> is a datetime data type, then let E2 be the > value of the second field of <row value expression 2>. > b) If the most specific type of the second field of <row value > expression 2> is INTERVAL, then let I2 be the value of the > second field of <row value expression 2>. Let E2 = D2 + I2. > 5) If D2 is the null value or if E2 < D2, then let S2 = E2 and let > T2 = D2. Otherwise, let S2 = D2 and let T2 = E2. > 6) The result of the <overlaps predicate> is the result of the > following expression: > ( S1 > S2 AND NOT ( S1 >= T2 AND T1 >= T2 ) ) > OR > ( S2 > S1 AND NOT ( S2 >= T1 AND T2 >= T1 ) ) > OR > > ( S1 = S2 AND ( T1 <> T2 OR T1 = T2 ) )
Attachment
pgsql-hackers by date: