Thread: challenging constraint situation - how do I make it real in SQL?

challenging constraint situation - how do I make it real in SQL?

From

"Harald Armin Massa"

Date:

24 May 2006, 04:37:10

I have a business rule which gives me rahter big challenges to "melt in SQL":

At one "point in time" only one version may be active.

Simplified:

CREATE TABLE sample
(
  id_field int4,
  value text,
  validfrom timestamp,
  validto timestamp
)
within one timespan, every id_field has to be unique.

so, this is a set of legal data:

1, 'fire', -infinity                              , 2005-09-01 21:02:15.078
1, 'water', 2005-09-01 21:02:15.078, infinity


this is an illegal set of data:

1, 'fire', -infinity                              , 2005-09-01 21:02:15.078
1, 'water', 2005-05-01 12:15:15.078, infinity

because between 2005-05-01 12:15:15.078 and 2005-09-01 21:02:15.078
there are 2 values for id_field 1

 As a first measure I have made a unique key on (id_field ,
validfrom), because the "timespans"  are created by update rules of a
view, so that I am quite sure that there are no overlapping ... as
long as nobody touches the table by himself.

But it is not really fitting; and manual editing of the table can
disturb it. So, ist there some constraint creatable to make sure that
"in each point of time" each id only exists once?

Harald


--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
PostgreSQL - supported by a community that does not put you on hold

Re: challenging constraint situation - how do I make it

From

Alban Hertroys

Date:

24 May 2006, 05:53:00

Harald Armin Massa wrote:
> I have a business rule which gives me rahter big challenges to "melt in
> SQL":

> But it is not really fitting; and manual editing of the table can
> disturb it. So, ist there some constraint creatable to make sure that
> "in each point of time" each id only exists once?

You probably need to write a constraint trigger, using OVERLAPS and
EXISTS operators. Writing one is rather trivial, but depends on your
exact business rules.

 From your example it is not clear to me if you can have, for example,
two records with different id's within the same or overlapping timespans.

When encountering this problem I usually wonder why there isn't a data
type that can store a timestamp and can be used to create a UNIQUE INDEX
over it's values. That'd be wonderful.
Well, maybe one day I'll actually have time to create one...

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: challenging constraint situation - how do I make it

From

Kenneth Downs

Date:

24 May 2006, 08:00:25

Alban Hertroys wrote:

>
> When encountering this problem I usually wonder why there isn't a data
> type that can store a timestamp and can be used to create a UNIQUE
> INDEX over it's values. That'd be wonderful.
> Well, maybe one day I'll actually have time to create one...
>
I tried this at trigger level.  The real bear is in the fact that there
are two columns, not one.  It is trivial to write an exclusion
constraint that disallows overlapping (including nested) values.  What
was hard was determining the meta-data structure, how do you have two
columns that are sometimes treated as one and sometimes as two?

Possible, but devilish in the details.

Attachment

ken.vcf

Re: challenging constraint situation - how do I make it

From

Alban Hertroys

Date:

24 May 2006, 09:22:48

Kenneth Downs wrote:
> Alban Hertroys wrote:
>
>> When encountering this problem I usually wonder why there isn't a data
>> type that can store a timestamp and can be used to create a UNIQUE
>> INDEX over it's values. That'd be wonderful.
>> Well, maybe one day I'll actually have time to create one...
>>
> I tried this at trigger level.  The real bear is in the fact that there
> are two columns, not one.  It is trivial to write an exclusion
> constraint that disallows overlapping (including nested) values.  What
> was hard was determining the meta-data structure, how do you have two
> columns that are sometimes treated as one and sometimes as two?

Are you refering to a 'timespan' data type that can be determined to be
unique?

I can see some problems there, as both value and range matter; it'd be
similar to determining the uniqueness of an area in a rectangle (though
1 dimensional only, of course).

I've never really dug into this, so I don't know what possibilities
PostgreSQL offers in this field. Basing this on faith :)

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: challenging constraint situation - how do I make it

From

Kenneth Downs

Date:

24 May 2006, 09:35:39

Alban Hertroys wrote:

> Kenneth Downs wrote:
>
>> Alban Hertroys wrote:
>>
>>> When encountering this problem I usually wonder why there isn't a
>>> data type that can store a timestamp and can be used to create a
>>> UNIQUE INDEX over it's values. That'd be wonderful.
>>> Well, maybe one day I'll actually have time to create one...
>>>
>> I tried this at trigger level.  The real bear is in the fact that
>> there are two columns, not one.  It is trivial to write an exclusion
>> constraint that disallows overlapping (including nested) values.
>> What was hard was determining the meta-data structure, how do you
>> have two columns that are sometimes treated as one and sometimes as two?
>
>
> Are you refering to a 'timespan' data type that can be determined to
> be unique?
>
> I can see some problems there, as both value and range matter; it'd be
> similar to determining the uniqueness of an area in a rectangle
> (though 1 dimensional only, of course).
>
> I've never really dug into this, so I don't know what possibilities
> PostgreSQL offers in this field. Basing this on faith :)
>
Yes.

I use a heavily dictionary-based toolset.  I write out database specs in
CSS-like syntax and it diff's and builds the databases and writes all
triggers, indexes and so forth.

The approach I tried was to have a "range" or "interval" type.  You
place a column into a table named "resv_date" or whatever and it would
expand the definition into two columns, you'd get resv_date_beg and
resv_date_end.  If you declared the "resv_date" column a primary key
column, it would build trigger code to detect overlaps and nesting and
reject those.

As I said, defining behavior and implementing it was not hard.  I even
had foreign keys into ranges that were "smart".  If the foreign key was
a single column instead of two, it would satisfy RI if the single value
was between the interval values in the parent table.

The problem comes from the split-personality of the "resv_date" column.
Sometimes its one column, sometimes its two.  This made writing the
tools nasty and difficult, and I scratched it and (gasp!) did some
validation in client code.

I have it in mind to restore the feature, but in a different way.  The
two columns should be defined separately, not as one, and then the
second of the two gets a flag setting, like:

column range_beg { primary_key: Y; }
column range_end { primary_key: Y; range_from: range_beg; }

The "range_from" setting ties one column to the other and should give me
all the behavior I had without all of the confusion.  It would have
three effects:

1)  Force range_end >= range_beg
2)  Convert the primary key into overlap/nest exclusion
3)  Allow a single column foreign key in another table to "know" that it
should do a within match instead of an equality match

Attachment

ken.vcf

Re: challenging constraint situation - how do I make it real in SQL?

From

"John D. Burger"

Date:

24 May 2006, 10:29:35

> I have a business rule which gives me rahter big challenges to "melt
> in SQL":
>
> At one "point in time" only one version may be active.

I have no specific solution, but I highly recommend Richard Snodgrass's
book, Developing Time-Oriented Database Applications in SQL, which
discusses this kind of situation, and many other similar ones.  It's
out of print, but available from his homepage as a PDF:

   http://www.cs.arizona.edu/~rts/

- John D. Burger
   MITRE

Re: challenging constraint situation - how do I make it

From

"Florian G. Pflug"

Date:

24 May 2006, 10:47:28

Kenneth Downs wrote:
> Alban Hertroys wrote:
>
>> Kenneth Downs wrote:
>>
>>> Alban Hertroys wrote:
>>>
>>>> When encountering this problem I usually wonder why there isn't a
>>>> data type that can store a timestamp and can be used to create a
>>>> UNIQUE INDEX over it's values. That'd be wonderful.
>>>> Well, maybe one day I'll actually have time to create one...
>>>>
>>> I tried this at trigger level.  The real bear is in the fact that
>>> there are two columns, not one.  It is trivial to write an exclusion
>>> constraint that disallows overlapping (including nested) values.
>>> What was hard was determining the meta-data structure, how do you
>>> have two columns that are sometimes treated as one and sometimes as two?
>>
>>
>> Are you refering to a 'timespan' data type that can be determined to
>> be unique?
>>
>> I can see some problems there, as both value and range matter; it'd be
>> similar to determining the uniqueness of an area in a rectangle
>> (though 1 dimensional only, of course).
>>
>> I've never really dug into this, so I don't know what possibilities
>> PostgreSQL offers in this field. Basing this on faith :)
>>
> Yes.
> I use a heavily dictionary-based toolset.  I write out database specs in
> CSS-like syntax and it diff's and builds the databases and writes all
> triggers, indexes and so forth.
>
> The approach I tried was to have a "range" or "interval" type.  You
> place a column into a table named "resv_date" or whatever and it would
> expand the definition into two columns, you'd get resv_date_beg and
> resv_date_end.  If you declared the "resv_date" column a primary key
> column, it would build trigger code to detect overlaps and nesting and
> reject those.
>
> As I said, defining behavior and implementing it was not hard.  I even
> had foreign keys into ranges that were "smart".  If the foreign key was
> a single column instead of two, it would satisfy RI if the single value
> was between the interval values in the parent table.
>
> The problem comes from the split-personality of the "resv_date" column.
> Sometimes its one column, sometimes its two.  This made writing the
> tools nasty and difficult, and I scratched it and (gasp!) did some
> validation in client code.
You could define a new datatype containing two timestamps. From 8.1
onwards you can use such a composite type as a field, I believe (but I
haven't checked it).

You can use pgsql functions to define operators for your type - but
you'll have to use the default input/output functions for composite
types I guess, because those can only be codec in C AFAIK.

greetings, Florian Pflug

Re: challenging constraint situation - how do I make it

From

Alban Hertroys

Date:

24 May 2006, 11:51:05

Florian G. Pflug wrote:
> You could define a new datatype containing two timestamps. From 8.1
> onwards you can use such a composite type as a field, I believe (but I
> haven't checked it).
>
> You can use pgsql functions to define operators for your type - but
> you'll have to use the default input/output functions for composite
> types I guess, because those can only be codec in C AFAIK.

That's what I was refering to. Apparently I didn't make myself very clear.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: challenging constraint situation - how do I make it

From

Alban Hertroys

Date:

24 May 2006, 11:55:56

Kenneth Downs wrote:
> Alban Hertroys wrote:
> The approach I tried was to have a "range" or "interval" type.  You
> place a column into a table named "resv_date" or whatever and it would
> expand the definition into two columns, you'd get resv_date_beg and
> resv_date_end.  If you declared the "resv_date" column a primary key
> column, it would build trigger code to detect overlaps and nesting and
> reject those.
>
> As I said, defining behavior and implementing it was not hard.  I even
> had foreign keys into ranges that were "smart".  If the foreign key was
> a single column instead of two, it would satisfy RI if the single value
> was between the interval values in the parent table.

Been there, done that ;)

> The problem comes from the split-personality of the "resv_date" column.
> Sometimes its one column, sometimes its two.  This made writing the
> tools nasty and difficult, and I scratched it and (gasp!) did some
> validation in client code.

What's the benefit of allowing it to be only one column?

> I have it in mind to restore the feature, but in a different way.  The
> two columns should be defined separately, not as one, and then the
> second of the two gets a flag setting, like:
>
> column range_beg { primary_key: Y; }
> column range_end { primary_key: Y; range_from: range_beg; }
>
> The "range_from" setting ties one column to the other and should give me
> all the behavior I had without all of the confusion.  It would have
> three effects:
>
> 1)  Force range_end >= range_beg
> 2)  Convert the primary key into overlap/nest exclusion
> 3)  Allow a single column foreign key in another table to "know" that it
> should do a within match instead of an equality match

And making that one column of a composite type would be just the thing,
I thought somewhere at the start of this thread (Thanks for mentioning
"composite types", Florian, couldn't remember what they're called).

Cheers,

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: challenging constraint situation - how do I make it

From

Tom Lane

Date:

24 May 2006, 15:55:01

Alban Hertroys <alban@magproductions.nl> writes:
> Are you refering to a 'timespan' data type that can be determined to be
> unique?

> I can see some problems there, as both value and range matter; it'd be
> similar to determining the uniqueness of an area in a rectangle (though
> 1 dimensional only, of course).

Except that what you want is to forbid overlap, not forbid equality.
This is not possible with btree-based unique indexes, because btree
will not like an opclass whose "equality" member is not transitive.

If you look in the archives you'll find previous discussion of this
point, including some blue-sky speculation about whether GiST indexes
could be made to serve the purpose.  That's a long way from being a
solution you could use today, though.

            regards, tom lane

Vector type (Re: challenging constraint situation - how do I make it)

From

Alban Hertroys

Date:

26 May 2006, 05:55:11

Tom Lane wrote:
> Alban Hertroys <alban@magproductions.nl> writes:
>>I can see some problems there, as both value and range matter; it'd be
>>similar to determining the uniqueness of an area in a rectangle (though
>>1 dimensional only, of course).
>
> Except that what you want is to forbid overlap, not forbid equality.
> This is not possible with btree-based unique indexes, because btree
> will not like an opclass whose "equality" member is not transitive.

With what I have in mind, both overlap and equality would violate the
unique constraint. I don't quite see why someone'd want to forbid
overlap but to allow equality; isn't not allowing equality the whole
point of a unique constraint?

Yesterday being a holiday, I had some time to look into this. I'm now in
the design phase for a (more general) vector type.
A timespan type is just a special case for timestamps, the vector type
could expand that to integers, reals, points, dates, etc. Anything
scalar, basically, with some distinctions in operators depending on the
type being discrete or not (discrete types could have adjacency
operators, for example).

I haven't looked into what types of index would be suitable, I suppose
that could cause difficulties. OTOH, if neither overlap nor equality are
allowed on a unique index, I'm under the impression that a btree should
just work.

Knowing that I rarely have time to spend on things like this, is here an
appropriate place to discuss design issues or recruiting people for help?

> If you look in the archives you'll find previous discussion of this
> point, including some blue-sky speculation about whether GiST indexes
> could be made to serve the purpose.  That's a long way from being a
> solution you could use today, though.

Looks like a good starting place, I should have a look. Any pointers as
to what to look for?

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Vector type (Re: challenging constraint situation - how do I make it)

From

Tom Lane

Date:

26 May 2006, 10:21:02

Alban Hertroys <alban@magproductions.nl> writes:
> Tom Lane wrote:
>> Except that what you want is to forbid overlap, not forbid equality.
>> This is not possible with btree-based unique indexes, because btree
>> will not like an opclass whose "equality" member is not transitive.

> With what I have in mind, both overlap and equality would violate the
> unique constraint. I don't quite see why someone'd want to forbid
> overlap but to allow equality; isn't not allowing equality the whole
> point of a unique constraint?

You're missing the point.  Letting "~" represent the operator that
tests for interval-overlap, we can have
    A    --------------
    B           ------------------
    C                       ----------------
so that A ~ B and B ~ C but not A ~ C.  This is too much unlike normal
equality for a btree to work with "~" as the "equality" operator.

            regards, tom lane

Re: Vector type (Re: challenging constraint situation -

From

Alban Hertroys

Date:

26 May 2006, 11:04:37

Tom Lane wrote:
> Alban Hertroys <alban@magproductions.nl> writes:
>>With what I have in mind, both overlap and equality would violate the
>>unique constraint. I don't quite see why someone'd want to forbid
>>overlap but to allow equality; isn't not allowing equality the whole
>>point of a unique constraint?
>
> You're missing the point.  Letting "~" represent the operator that
> tests for interval-overlap, we can have
>     A    --------------
>     B           ------------------

I'd say "unique constraint violation" right here (provided there's a
unique constraint on this column, of course). The order in which these
are inserted/updated doesn't seem to matter either. I'm afraid I'm still
missing the point... or maybe I'm not wrong???

>     C                       ----------------
> so that A ~ B and B ~ C but not A ~ C.  This is too much unlike normal
> equality for a btree to work with "~" as the "equality" operator.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Vector type (Re: challenging constraint situation - how do I make it)

From

Tom Lane

Date:

26 May 2006, 11:08:10

Alban Hertroys <alban@magproductions.nl> writes:
> I'd say "unique constraint violation" right here (provided there's a
> unique constraint on this column, of course). The order in which these
> are inserted/updated doesn't seem to matter either. I'm afraid I'm still
> missing the point...

The point is that btree depends on a number of assumptions about the
behavior of the operators in an operator class, and one of those
assumptions is that "equality" is transitive.

            regards, tom lane

Re: Vector type (Re: challenging constraint situation -

From

Alban Hertroys

Date:

26 May 2006, 13:00:04

Tom Lane wrote:
> Alban Hertroys <alban@magproductions.nl> writes:
>
>>I'd say "unique constraint violation" right here (provided there's a
>>unique constraint on this column, of course). The order in which these
>>are inserted/updated doesn't seem to matter either. I'm afraid I'm still
>>missing the point...
>
> The point is that btree depends on a number of assumptions about the
> behavior of the operators in an operator class, and one of those
> assumptions is that "equality" is transitive.

So, referring back to your example you mean that though B overlaps A,
and C overlaps B, C doesn't overlap A and thus they're not transitive.
I see.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //