Thread: BUG #13073: Uniqueness constraint incorrectly reports constraint violations

BUG #13073: Uniqueness constraint incorrectly reports constraint violations

From

dportas@acm.org

Date:

16 April 2015, 20:15:23

The following bug has been logged on the website:

Bug reference:      13073
Logged by:          David Portas
Email address:      dportas@acm.org
PostgreSQL version: 9.1.13
Operating system:   Debian Linux
Description:

Repro script:
CREATE TABLE tbl1 (x INT NOT NULL PRIMARY KEY);
INSERT INTO tbl1 VALUES (1),(2);
UPDATE tbl1 SET x = x +1;

Result:

ERROR:  duplicate key value violates unique constraint "tbl1_pkey"
DETAIL:  Key (x)=(2) already exists.

Expected result: UPDATE should succeed because the constraint is not
violated. The constraint should be evaluated against the complete resulting
table as per documentation: "unique with respect to all the rows in the
table"[1].

The expected result can be seen if the insertion order of the INSERTs is
reversed:
CREATE TABLE tbl1 (x INT NOT NULL PRIMARY KEY);
INSERT INTO tbl1 VALUES (2),(1);
UPDATE tbl1 SET x = x +1;

Result: UPDATE succeeds. This is expected but is inconsistent with the
previous result even though the two UPDATEs are logically equivalent.

The same effect is seen if UNIQUE is specified instead of PRIMARY KEY.

[1]http://www.postgresql.org/docs/9.1/static/ddl-constraints.html

Re: BUG #13073: Uniqueness constraint incorrectly reports constraint violations

From

Guillaume Lelarge

Date:

16 April 2015, 20:20:21

Le 16 avr. 2015 10:17 PM, <dportas@acm.org> a =C3=A9crit :
>
> The following bug has been logged on the website:
>
> Bug reference:      13073
> Logged by:          David Portas
> Email address:      dportas@acm.org
> PostgreSQL version: 9.1.13
> Operating system:   Debian Linux
> Description:
>
> Repro script:
> CREATE TABLE tbl1 (x INT NOT NULL PRIMARY KEY);
> INSERT INTO tbl1 VALUES (1),(2);
> UPDATE tbl1 SET x =3D x +1;
>
> Result:
>
> ERROR:  duplicate key value violates unique constraint "tbl1_pkey"
> DETAIL:  Key (x)=3D(2) already exists.
>
> Expected result: UPDATE should succeed because the constraint is not
> violated. The constraint should be evaluated against the complete
resulting
> table as per documentation: "unique with respect to all the rows in the
> table"[1].
>
> The expected result can be seen if the insertion order of the INSERTs is
> reversed:
> CREATE TABLE tbl1 (x INT NOT NULL PRIMARY KEY);
> INSERT INTO tbl1 VALUES (2),(1);
> UPDATE tbl1 SET x =3D x +1;
>
> Result: UPDATE succeeds. This is expected but is inconsistent with the
> previous result even though the two UPDATEs are logically equivalent.
>
> The same effect is seen if UNIQUE is specified instead of PRIMARY KEY.
>
> [1]http://www.postgresql.org/docs/9.1/static/ddl-constraints.html
>

This is expected. You need deferrable constraints to make that work.

Re: BUG #13073: Uniqueness constraint incorrectly reports constraint violations

From

"David G. Johnston"

Date:

16 April 2015, 20:39:29

On Thu, Apr 16, 2015 at 1:03 PM, <dportas@acm.org> wrote:

> The following bug has been logged on the website:
>
> Bug reference:      13073
> Logged by:          David Portas
> Email address:      dportas@acm.org
> PostgreSQL version: 9.1.13
> Operating system:   Debian Linux
> Description:
>
> Repro script:
> CREATE TABLE tbl1 (x INT NOT NULL PRIMARY KEY);
> INSERT INTO tbl1 VALUES (1),(2);
> UPDATE tbl1 SET x =3D x +1;
>
> Result:
>
> ERROR:  duplicate key value violates unique constraint "tbl1_pkey"
> DETAIL:  Key (x)=3D(2) already exists.
>
> Expected result: UPDATE should succeed because the constraint is not
> violated. The constraint should be evaluated against the complete resulti=
ng
> table as per documentation: "
> =E2=80=8B=E2=80=8B
> unique with respect to all the rows in the
> table"[1].
>

=E2=80=8BAnd at the moment you update 1 to become 2 you have two rows in th=
e table
having x=3D2; even if that particular picture of the table is one that no
other statements could ever see.

=E2=80=8BYou are, not unexpectedly, assuming that constraints are evaluated=
 only
after all rows has been processed - i.e., post-statement completion.  While
this is possible (see below) it is not the default behavior.  By default,
as each row is updated all of the relevant constraints are checked to see
if any have been violated.=E2=80=8B

> The expected result can be seen if the insertion order of the INSERTs is
> reversed:
> CREATE TABLE tbl1 (x INT NOT NULL PRIMARY KEY);
> INSERT INTO tbl1 VALUES (2),(1);
> UPDATE tbl1 SET x =3D x +1;
>
> Result: UPDATE succeeds. This is expected but is inconsistent with the
> previous result even though the two UPDATEs are logically equivalent.
>
> The same effect is seen if UNIQUE is specified instead of PRIMARY KEY.
>
> [1]http://www.postgresql.org/docs/9.1/static/ddl-constraints.html

=E2=80=8BLikely the documentation could use improvement here...everything n=
ecessary
to explain this behavior is documented but seemingly inadequately
cross-referenced.

http://www.postgresql.org/docs/devel/static/sql-set-constraints.html

=E2=80=8BAs Guillaume Lelarge notes you have to cause the constraint to be
evaluated in deferred mode  Alternatively you can, I think, use a from
clause sub-select source that is ordered by (x DESC) to ensure that at no
time does the snapshot contain duplicate values for "x".  Your example
proves this works in small circumstances but I'm not positive if the
executor guarantees to update the rows in the same order as the
sub-select.  I am fairly certain that it does.

It is considerably more performant to evaluate constraints immediately -
and need to execute "UPDATE tbl SET x =3D x + 1" is infrequent...and one of
the few circumstances where this (order of row evaluation) problem arises.

David J.

Re: BUG #13073: Uniqueness constraint incorrectly reports constraint violations

From

David Portas

Date:

16 April 2015, 21:16:27

On 16 April 2015 at 21:39, David G. Johnston <david.g.johnston@gmail.com> wrote:
>
> You are, not unexpectedly, assuming that constraints are evaluated only
> after all rows has been processed - i.e., post-statement completion.  While
> this is possible (see below) it is not the default behavior.  By default, as
> each row is updated all of the relevant constraints are checked to see if
> any have been violated.
>

Thanks. It's interesting that the default behaviour is to compromise
ACID compliance with a result that is, logically speaking,
non-deterministic. This appears to be inconsistent with the ISO SQL
standard [1] and with other SQL DBMSs.

David

[1] I only have the SQL 1999 and 2003 documentation to hand. In both
cases Section 10 of the Foundation document specifies that immediate
constraint checking (whether deferrable or not) occurs "on completion
of any SQL-statement".

Re: BUG #13073: Uniqueness constraint incorrectly reports constraint violations

From

Tomas Vondra

Date:

16 April 2015, 22:37:03

On 04/16/15 23:16, David Portas wrote:
> On 16 April 2015 at 21:39, David G. Johnston <david.g.johnston@gmail.com> wrote:
>>
>> You are, not unexpectedly, assuming that constraints are evaluated only
>> after all rows has been processed - i.e., post-statement completion.  While
>> this is possible (see below) it is not the default behavior.  By default, as
>> each row is updated all of the relevant constraints are checked to see if
>> any have been violated.
>>
>
> Thanks. It's interesting that the default behaviour is to compromise
> ACID compliance with a result that is, logically speaking,
> non-deterministic. This appears to be inconsistent with the ISO SQL
> standard [1] and with other SQL DBMSs.

I don't see how this compromises ACID compliance. If anything, it makes
the consistency checks more strict (not allowing violated constraint
mid-transaction).

As for the SQL standard compliance, the documentation [1] says this:

     When a UNIQUE or PRIMARY KEY constraint is not deferrable,
     PostgreSQL checks for uniqueness immediately whenever a row is
     inserted or modified. The SQL standard says that uniqueness should
     be enforced only at the end of the statement; this makes a
     difference when, for example, a single command updates multiple key
     values. To obtain standard-compliant behavior, declare the
     constraint as DEFERRABLE but not deferred (i.e., INITIALLY
     IMMEDIATE). Be aware that this can be significantly slower than
     immediate uniqueness checking.

In other words, this is a known difference, this default behavior was
chosen because

  (a) it has performance benefits
  (b) is more appropriate for most cases
  (c) does *not* compromise any consistency guarantees (but may cause
      false positives), and
  (d) there's a way to make it standard-compliant behavior by setting
      the constraint DEFERRABLE.


[1] http://www.postgresql.org/docs/9.1/static/sql-createtable.html

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BUG #13073: Uniqueness constraint incorrectly reports constraint violations

From

"David G. Johnston"

Date:

16 April 2015, 22:46:29

On Thu, Apr 16, 2015 at 2:16 PM, David Portas <dportas@acm.org> wrote:

> On 16 April 2015 at 21:39, David G. Johnston <david.g.johnston@gmail.com>
> wrote:
> >
> > You are, not unexpectedly, assuming that constraints are evaluated only
> > after all rows has been processed - i.e., post-statement completion.
> While
> > this is possible (see below) it is not the default behavior.  By
> default, as
> > each row is updated all of the relevant constraints are checked to see =
if
> > any have been violated.
> >
>
> Thanks. It's interesting that the default behaviour is to compromise
> ACID compliance with a result that is, logically speaking,
> non-deterministic. This appears to be inconsistent with the ISO SQL
> standard [1] and with other SQL DBMSs.
>
> David
>
> [1] I only have the SQL 1999 and 2003 documentation to hand. In both
> cases Section 10 of the Foundation document specifies that immediate
> constraint checking (whether deferrable or not) occurs "on completion
> of any SQL-statement".
>

=E2=80=8BThis needs to be corrected in the documentation:

http://www.postgresql.org/docs/9.4/static/sql-set-constraints.html

beginning of page: IMMEDIATE constraints are checked at the end of each
statement.
[...]
end of page: =E2=80=8BAlso,PostgreSQL checks non-deferrable uniqueness cons=
traints
immediately, not at end of statement as the standard would suggest.

As is the case with transaction isolation a table summarizing the possible
combinations and resultant check timing would probably be quite useful.  I
am unsure whether deferrable, but not deferred, immediate checks are done
are statement end or for each record - the qualification at the end only
speaks to "non-deferrable" ones.  Regardless, the cavet seems important
enough to make in the main body and not leave solely relegated to a
compatibility note.

You are correct as to the standard non-conformance.  My understanding is
that the performance gains outweighed the conformity loss - and/or that
changing it hasn't met the level of need necessary to introduce a
regression in existing code.

However, it does not compromise ACID compliance.  It is simply not as
lenient as it could be.  If the statement executes to completion it will
have all ACID properties otherwise it will fail and the previously ACID
compliant result will remain.=E2=80=8B

David J.