Re: incoherent view of serializable transactions - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: incoherent view of serializable transactions |
Date | |
Msg-id | 495A05C1.EE98.0025.0@wicourts.gov Whole thread Raw |
In response to | Re: incoherent view of serializable transactions (Peter Eisentraut <peter_e@gmx.net>) |
Responses |
Re: incoherent view of serializable transactions
|
List | pgsql-hackers |
>>> Peter Eisentraut <peter_e@gmx.net> wrote: > Greg Stark wrote: >> And I don't see why you discard "visibility" as unimportant. All >> the transaction isolations are defined in terms of the results if >> the transactions. Those results include both the database state and >> the data returned by the queries. Otherwise "phantom read" is a >> meaningless concept. > > Basically, if he wants to make a rigid argument that some scenario > violates the serializability promise, then it is necessary to prove: > > (1) There is no serial schedule for the set of transactions that > achieves the same outcome. > > - or - > > (2) A phantom read situation occurs. Agreed, except that (2) is a subset of (1), so (1) alone is sufficient. (How can a read not be repeatable if there are no concurrent transactions?) I feel that I did provide a proof that the transactions couldn't represent any serial execution in my original email. This seems to be disputed with the argument that a SELECT from a database is not required to provide coherent data that represents some point in the serializable stream of transactions. I disagree, but as I pointed out previously, it is trivial to capture the results of any such query into a table and thereby persist the problem within the database. Here we go. I've labeled the transactions consistently with new thread I started trying to characterize the full scope of issues and workarounds. -- setup drop if exists ctl, receipt, receipt_totals; create table ctl (k text not null primary key, deposit_date date not null); insert into ctl values ('receipt', date '2008-12-22'); create table receipt (receipt_no int not null primary key, deposit_date date not null, amount numeric(13,2)); insert into receipt values (1, (select deposit_date from ctl where k = 'receipt'), 1.00); insert into receipt values (2, (select deposit_date from ctl where k = 'receipt'), 2.00); create table receipt_totals (deposit_date date not null primary key, next_date date not null, deposit_total numeric(13,2) not null); -- connection 1 (start of T0) start transaction isolation level serializable; insert into receipt values (3, (select deposit_date from ctl where k = 'receipt'), 4.00); -- connection 2 (T1) start transaction isolation level serializable; update ctl set deposit_date = date '2008-12-23' where k = 'receipt'; commit transaction; -- connection 2 (TN version 1) start transaction isolation level serializable; select * from ctl; -- (deposit_date shows as 2008-12-23) select * from receipt where deposit_date = date '2008-12-22'; -- (Only receipts 1 and 2 show for 2008-12-22.) commit transaction; -- connection 2 (TN version 2) start transaction isolation level serializable; insert into receipt_totals select r.deposit_date, c.deposit_date, sum(amount) from ctl c join receipt r on ( r.deposit_date< c.deposit_date and not exists ( select * from receipt r2 where r2.deposit_date< c.deposit_date and r2.deposit_date > r.deposit_date ) ) group by r.deposit_date,c.deposit_date; commit transaction; -- connection 1 (end of T0) commit transaction; After all this is done, a select from receipt_totals shows:deposit_date | next_date | deposit_total --------------+------------+---------------2008-12-22 | 2008-12-23 | 3.00 (1 row) Here goes the proof, although I'm not going to be overly formal in the language. (1) If these transactions were serialized, T0 must come before T1, because T0 uses the ctl.deposit_date before it is updated by T1. (2) If these transactions were serialized, T1 must come before TN (either version), because the TN transactions see the ctl.deposit_date set by T1. (3) If these transactions were serialized, the TN transactions must come before T0, since they don't see the row inserted by T0. (4) Since serialization requires that T0 < T1 < TN < T0 (comparing time sequence) the transactions cannot be considered serialized. -Kevin
pgsql-hackers by date: