Re: [HACKERS] CONSTRAINTS... - Mailing list pgsql-hackers
From | Vadim Mikheev |
---|---|
Subject | Re: [HACKERS] CONSTRAINTS... |
Date | |
Msg-id | 369E126B.76560249@krs.ru Whole thread Raw |
In response to | Re: [HACKERS] CONSTRAINTS... (jwieck@debis.com (Jan Wieck)) |
Responses |
Re: [HACKERS] CONSTRAINTS...
|
List | pgsql-hackers |
Jan Wieck wrote: > > Generic triggers in C that are argument driven would be > possible. But the drawback is that those triggers have to be > very smart to use saved SPI plans (one for every different > argument set). And it must be row level triggers, so for an > update to a 2 meg row table they will be fired 2 million > times and run their queries inside - will take some time. > > More painful in the 2 meg row situation is that trigger > invocation has to be delayed until COMMIT if the constraint > is deferred. I think we cannot remember 2 million OLD plus 2 > million NEW tuples if one tuple can have up to 8K (will be > 32GB to remember plus overhead), so we need to remember at > least the CTID's of OLD and NEW and refetch them for the > trigger invocation. OUTCH - the OLD ones are at the head and > all the NEW ones are at the end of the tables file! (Note that now in the case of UPDATE t_ctid of OLD tuples points to TID of NEW tuples.) > > I am basically asking for a reason _not_ to use the rewrite system for > > this. I can't think of one myself. > > It might interfere with the new MVCC code. The rule actions > must see exactly the OLD tuples that where used in the > original statements. Not only those in the updated table > itself, think of an INSERT...SELECT or an UPDATE where the > TLE or qual expressions are values from other tables. Two things define data visibility: SnapShot & CommandId. We would have to save them for deffered rules and restore them before run rule actions. But there is one issue: for what scans old visibility should be used? There are scans from user query and there are scans added by rule action. Ok, let's assume that for added scans current visibility will be used - this is what we need for RI rules (actually, something more - see below). So, first task is enable different scans in (rewritten) query use different visibilities (SnapShot/CommandId pair - "snapshot", in short). We have to add new stuff to Executor and heap scan code and so I propose also new feature addition: 1. add SET SNAPSHOT snapshot_name; statement to let users define some snapshot. 2. extend query syntax to let users specify what snapshot must be used when a query table is scanned: SELECT ... FROM t1 AT SNAPSHOT s1, t2 AT SNAPSHOT s2 ...etc.. Up to now new requirement due to MVCC is taking into account not only CommandId (as already noted in last posting I got from Jan), but SnapShot too. > Not a real reason, just something to have in mind and maybe > switching silently to another MVCC isolation level if > constraint rules get applied, so all tables read from now on > will get a read lock applied and cannot get updated > concurrently until COMMIT. There is no isolevel in MVCC where locking would be used implicitly. We could use LOCK IN SHARE or SELECT FOR UPDATE (FOR SHARE LOCK ?) - using rules for RI is like implementing RI on applic level (!), - but this is bad. Fortunately, there is a way without each row/table locking if scans added by RI rule could see uncommitted chages made by concurrent xactions. Unique btree code already use special SnapshotDirty to see uncommitted changes and avoid long-term row/page locking. With this Snapshot HeapTupleSatisfies returns true if t_xmin committed and (t_xmax is invalid OR is in-progress)OR t_xmin is in-progress and t_xmax is invalid - so, caller can wait (just like the same row writers do - by locking in-progress xaction ID in transaction pseudo-table) for in-progress t_xmin/t_xmax xaction and decide what to do after concurrent xaction COMMITs/ABORTs. But before continuing with this approach I need in answer to one question. Let's consider this case: Xaction T1 with isolevel SERIALIZABLE inserts some row into child table with deffered checking of primary key existance. There were no primary key P for row inserted by T1 at the moment when T1 begun, but before T1 begins constraint checking another concurrent xaction T2 inserts P and commits. After that T1 performs checking and - what? Will be the constraint satisfied? I ask this because of all subsequent selects in T1 will not see P, but will see foreign key inserted - so, from the point of application, child --> parent relationship will be broken... Comments? Could someone run test below in Oracle? 1. In session 1: CREATE TABLE p (x integer PRIMARY KEY); CREATE TABLE c (y integer REFERENCES p); INSERT INTO p VALUES (1); 2. In session 2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; SELECT * FROM p; -- empty INSERT INTO c VALUES (1); -- what? waits or rejects insertion? 3. In session 1: COMMIT; -- what in 1 if it was waiting? 4. In session 2: INSERT INTO c VALUES (1); -- Ok or rejected? SELECT * FROM p; SELECT * FROM c; COMMIT; TIA !!! > And it's a problem I've came across just writing this note > where MVCC already could have broken rewrite rule system > semantics. How? Vadim
pgsql-hackers by date: