Re: Potential G2-item cycles under serializable isolation - Mailing list pgsql-bugs
From | Kyle Kingsbury |
---|---|
Subject | Re: Potential G2-item cycles under serializable isolation |
Date | |
Msg-id | 3150072e-13b8-7655-211b-17898f894e45@jepsen.io Whole thread Raw |
In response to | Re: Potential G2-item cycles under serializable isolation (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Potential G2-item cycles under serializable isolation
Re: Potential G2-item cycles under serializable isolation Re: Potential G2-item cycles under serializable isolation |
List | pgsql-bugs |
OK! So I've designed a variant of this test which doesn't use ON CONFLICT. Instead, we do a homebrew sort of upsert: we try to update the row in place by primary key; if we see zero records updated, we insert a new row, and if *that* fails due to the primary key conflict, we try the update again, under the theory that since we now know a copy of the row exists, we should be able to update it. https://github.com/jepsen-io/jepsen/blob/f47eb25ab32529a7b66f1dfdd3b5ef2fc84ed778/stolon/src/jepsen/stolon/append.clj#L31-L108 This isn't bulletproof; even under SERIALIZABLE, Postgres will allow a transaction to fail to update any rows, then fail to insert due to a primary key conflict--but because the primary key conflict forces a transaction abort (or rolling back to a savepoint, which negates the conflict), I think it's still serializable. The isolation docs explain this, I think. Unfortunately, I'm still seeing tons of G2-item cycles. Whatever this is, it's not related to ON CONFLICT. As a side note, I can confirm that Postgres repeatable read is definitely weaker than repeatable read--at least as formalized by Adya et al. This makes sense if you understand repeatable read to mean SI (after all, the SI paper says they're incomparable!), but the Postgres docs seem to imply Postgres is strictly *stronger* than the ANSI SQL spec, and I'm not convinced that's entirely true: https://www.postgresql.org/docs/12/transaction-iso.html > The table also shows that PostgreSQL's Repeatable Read implementation does not allow phantom reads. Stricter behavior is permitted by the SQL standard... > This is a stronger guarantee than is required by the SQL standard for this isolation level... I get the sense that the Postgres docs have already diverged from the ANSI SQL standard a bit, since SQL 92 only defines three anomalies (P1, P2, P3), and Postgres defines a fourth: "serialization anomaly". This results in a sort of weird situation: on the one hand, it's true: we don't (I think) observe P1 or P2 under Postgres Repeatable Read. On the other hand, SQL 92 says that the difference between repeatable read and serializable is *exactly* the prohibition of P3 ("phantom"). Even though all our operations are performed by primary key, we can observe a distinct difference between Postgres repeatable read and Postgres serializable. I can see two ways to reconcile this--one being that Postgres chose the anomaly interpretation of the SQL spec, and the result is... maybe internally inconsistent? Or perhaps one of the operations in this workload actually *is* a predicate operation--maybe by dint of relying on a uniqueness constraint? I'm surprised the transaction isolation docs don't say something like "Postgres repeatable read level means snapshot isolation, which is incomparable with SQL repeatable read." Then it'd be obvious that Postgres repeatable read exhibits G2-item! Weirdly, the term "snapshot isolation" doesn't show up on the page at all! --Kyle
pgsql-bugs by date: