Thread: Question About Serializable

Question About Serializable

From
Aaron Carlisle
Date:
The documentation states that "concurrent execution of a set of Serializable transactions is guaranteed to produce the same effect as running them one at a time in some order."

I'm not sure how the following behavior fits that definition. (Note that this is just an experiment, not a use case. Purely academic.) I run these transactions sequentially, and I get a different result than if I run them concurrently.

This is on PostgreSQL 9.3.0.

First, I set up a table.

create table x (value int);

Then I run the following transactions. If I run them sequentially, in either order, I get one row in table x. If I run them concurrently, I get no rows in x.

It seems like one of these should error out and not commit, so I must be missing some stipulation.

Feel free to repeat this result.

=========
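-- Session 1: creates z, then inserts into x only if both y and z exist.
begin;

set transaction isolation level serializable;

create table z ();

-- Pause so the other session can start before this one commits.
select pg_sleep(5);

insert into x (value)
  select 0
  where exists (select relname from pg_class
                where relname = 'y')
    and exists (select relname from pg_class
                where relname = 'z');
commit;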

=========
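-- Session 2: creates y, then inserts into x only if both y and z exist.
begin;

set transaction isolation level serializable;

create table y ();

-- Pause so the other session can start before this one commits.
select pg_sleep(5);

insert into x (value)
  select 0
  where exists (select relname from pg_class
                where relname = 'y')
    and exists (select relname from pg_class
                where relname = 'z');
commit;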
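
The two outcomes described above can be reproduced with a toy model of snapshot visibility. This is only a sketch in Python, not PostgreSQL code: it assumes each transaction sees the tables committed before it started plus its own uncommitted writes, and the function names (run_txn_body, run_sequential, run_concurrent) are made up for illustration.

```python
def run_txn_body(snapshot, creates):
    """Model one transaction: create a table, then insert into x
    only if both y and z are visible. Returns (created table, rows inserted)."""
    # A transaction sees its starting snapshot plus its own writes.
    visible = set(snapshot) | {creates}
    rows = 1 if {"y", "z"} <= visible else 0
    return creates, rows


def run_sequential(order):
    """Run the transactions one at a time; each starts after the previous commit."""
    committed, x_rows = set(), 0
    for table in order:
        created, rows = run_txn_body(committed, table)
        committed.add(created)
        x_rows += rows
    return x_rows


def run_concurrent(tables):
    """Run the transactions so that both start before either one commits."""
    committed, x_rows = set(), 0
    start_snapshot = frozenset(committed)
    results = [run_txn_body(start_snapshot, t) for t in tables]
    for created, rows in results:
        committed.add(created)
        x_rows += rows
    return x_rows


if __name__ == "__main__":
    print(run_sequential(["z", "y"]))  # rows in x after running z-creator first
    print(run_sequential(["y", "z"]))  # rows in x after running y-creator first
    print(run_concurrent(["z", "y"]))  # rows in x when both overlap
```

In this model either sequential order leaves one row in x, while the overlapping schedule leaves none, because neither transaction can see the other's uncommitted table.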


Re: Question About Serializable

From
Kevin Grittner
Date:
Aaron Carlisle <aaron.carlisle@gmail.com> wrote:

> The documentation states that "concurrent execution of a set of
> Serializable transactions is guaranteed to produce the same
> effect as running them one at a time in some order."
>
> I'm not sure how the following behavior fits that definition.
> (Note that this is just an experiment, not a use case. Purely
> academic.) I run these transactions sequentially, and I get a
> different result than if I run them concurrently.

> It seems like one of these should error out and not commit, so I
> must be missing some stipulation.

> [ write skew in pg_class with table creation and check for table
> existence in two concurrent queries ]

You are correct, and not missing anything -- when serializable
behavior was added (and through 9.3) the system catalogs were not
fully transactional using the MVCC semantics.  I'm afraid there
were many strange things that could happen with the system
catalogs, including having an object which was being updated be
missed or seen twice by a concurrent transaction, none of which was
documented.  Since the serializable techniques in PostgreSQL are
based on the MVCC snapshot isolation techniques, it was not
possible to cover DDL in the serializable implementation using
Serializable Snapshot Isolation.

I assume you can't create such problems in a stable schema, but
only when serializable transactions include DDL?
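
A rough way to picture the difference between the two cases is a conflict check that can only act on reads it has recorded. The Python sketch below is a deliberately simplified stand-in, not PostgreSQL's SSI algorithm (which looks for a more general pattern of read/write dependencies). The names Txn, read, write, try_commit and run_pair are hypothetical, and tracked=False is an assumption used here to model catalog reads that the serializable machinery does not cover.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)
    committed: bool = False


def read(txn, key, tracked=True):
    # An untracked read leaves no record for the conflict check to use.
    if tracked:
        txn.reads.add(key)


def write(txn, key):
    txn.writes.add(key)


def try_commit(txn, others):
    """Commit unless txn and an already-committed concurrent transaction
    each read something the other wrote (a two-transaction cycle)."""
    for other in others:
        if (other.committed
                and txn.reads & other.writes
                and other.reads & txn.writes):
            return False  # would be reported as a serialization failure
    txn.committed = True
    return True


def run_pair(tracked):
    """Two overlapping transactions: t1 creates z, t2 creates y,
    and each checks for the existence of both y and z."""
    t1, t2 = Txn("t1"), Txn("t2")
    write(t1, "z")
    read(t1, "y", tracked)
    read(t1, "z", tracked)
    write(t2, "y")
    read(t2, "y", tracked)
    read(t2, "z", tracked)
    return try_commit(t1, [t2]), try_commit(t2, [t1])


if __name__ == "__main__":
    print(run_pair(tracked=True))   # reads recorded: second commit is refused
    print(run_pair(tracked=False))  # reads not recorded: both commit
```

With tracked reads the second transaction to commit is refused, which is the behavior one would expect for ordinary tables in a stable schema. With untracked reads both commit, matching the result reported for the catalog lookups.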

On the 9.4 development branch access to the system catalogs has
recently been made to use MVCC techniques, so it may be possible to
extend serializable behavior to the catalogs in 9.4 or later.

I don't suppose you want to help develop a patch for that?  :-)

Out of curiosity, is there a particular academic effort this
question came out of that you can talk about?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Question About Serializable

From
Aaron Carlisle
Date:
No particular effort; I saw a talk on the topic. I called it academic because I can't think of any real-world example where this would matter (the other definition of the word "academic").

On Thu, Sep 19, 2013 at 12:44 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> [ full quote of the previous message trimmed ]