WIP: Deferrable unique constraints - Mailing list pgsql-hackers
From | Dean Rasheed |
---|---|
Subject | WIP: Deferrable unique constraints |
Date | |
Msg-id | 8e2dbb700907071138y4ebe75cw81879aa513cf82a3@mail.gmail.com |
Responses | Re: WIP: Deferrable unique constraints |
List | pgsql-hackers |
This is a feature I would find useful, and I wanted to learn more about the database internals, so I thought I would give it a go as an exercise.

I added two new index insert modes to support deferrable unique constraints. During the initial index insert, if the constraint is deferrable, a "partial" check of uniqueness is done. This is a non-blocking check which allows duplicates in, but can distinguish between definitely unique and potentially non-unique. For each potential uniqueness violation, a deferred trigger is queued to do a full check at the end of the statement or transaction, or when SET CONSTRAINTS is called. The trigger then replays the insert in a "fake insert" mode, which doesn't actually insert anything - it just checks that what is already there is unique, waiting for other transactions if necessary. (A short sketch of the intended end-user behaviour follows the timings below.)

This approach works well if the number of potential conflicts is small. For example, if uniqueness is never actually violated, there is no significant penalty for the deferred check, since no triggers are queued:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_a_key" for table "foo"
CREATE TABLE
Time: 79.131 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11403.906 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.145 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21258.245 ms
pgdevel=# COMMIT;
COMMIT
Time: 83.267 ms

compared with:

pgdevel=# CREATE TABLE foo(a int UNIQUE);
NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_a_key" for table "foo"
CREATE TABLE
Time: 110.277 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11196.600 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21054.430 ms
pgdevel=# COMMIT;
COMMIT
Time: 186.174 ms

However, if there are lots of temporarily non-unique values, the performance impact is much greater, since the unique check is repeated for each flagged row at commit time:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_a_key" for table "foo"
CREATE TABLE
Time: 64.244 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11621.438 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+2; -- Uniqueness temporarily violated
UPDATE 1000000
Time: 21807.128 ms
pgdevel=# COMMIT;
COMMIT
Time: 14974.916 ms

And similarly for an "immediate" check:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY IMMEDIATE);
NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_a_key" for table "foo"
CREATE TABLE
Time: 44.477 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 12083.089 ms
pgdevel=# UPDATE foo SET a=a+2;
UPDATE 1000000
Time: 37173.210 ms

Also, as with FKs and other queued triggers, this doesn't scale, because the trigger queue is held in memory. Curing that scalability problem by spooling the queue to disk shouldn't be too hard to do, but it doesn't address the other problem: if a significant proportion of the rows in the table need to be checked, it is far quicker to scan the whole index once than to check row by row.
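As a side note, here is a minimal sketch of the end-user behaviour all of this is aiming at - the standard SQL semantics of a deferred unique constraint. The table name t is just for illustration, and this is a sketch of the intended semantics rather than output captured from the patch:

CREATE TABLE t(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
INSERT INTO t VALUES (1), (2);

BEGIN;
UPDATE t SET a = 3 - a;        -- swap 1 and 2; a duplicate exists mid-statement
SET CONSTRAINTS ALL IMMEDIATE; -- forces the deferred check here instead of at commit
COMMIT;                        -- succeeds, since the end state is unique

With an ordinary non-deferrable UNIQUE constraint the same UPDATE can fail part-way through, depending on the order in which the rows are visited.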
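Returning to the whole-index scan just mentioned: conceptually that check is a single-pass duplicate search, roughly the plain-SQL equivalent of the query below (a sketch of the idea only, not how the patch would implement it), using the foo table from the timings above:

SELECT a FROM foo GROUP BY a HAVING count(*) > 1;
-- any row returned represents a uniqueness violation

The real thing would presumably have to be an index scan that accounts for tuple visibility and in-flight transactions, but the cost comparison is the point: one pass over the index versus one index probe per potentially non-unique row.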
In fact, with this data a single scan takes around 0.5 sec on my machine, so on that basis a full scan would be quicker whenever more than around 3% of the rows are flagged as potentially non-unique - although the planner might be a better judge of that. I can't see a way of coding something that can switch over to a full scan without first recording all the potential violations, and even then I'm not entirely sure how best to do the switch-over logic.

- Dean
Attachment