Re: PostgreSQL XAResource & GlassFish 3.1.2.2 - Mailing list pgsql-jdbc
From | Bryan Varner |
---|---|
Subject | Re: PostgreSQL XAResource & GlassFish 3.1.2.2 |
Date | |
Msg-id | 511A675B.3020207@polarislabs.com |
In response to | Re: PostgreSQL XAResource & GlassFish 3.1.2.2 (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses |
Re: PostgreSQL XAResource & GlassFish 3.1.2.2
Re: PostgreSQL XAResource & GlassFish 3.1.2.2 |
List | pgsql-jdbc |
>> * Race conditions as multiple threads are participating in the same
>> transaction, invoking XAResource methods. Status checks in
>> PGXAConnection.java are throwing exceptions (if state == ACTIVE)
>> throw) by the time in invokes the throw, the state is != ACTIVE)
>> Before you start telling me I shouldn't be using threads in a JEE
>> environment let me remind you that EJBs by default are served out of
>> thread pools, and allow for concurrent threads to participate within a
>> single TX scope. This is outlined as part of the transaction context
>> in the JTS spec (section 2.2 and 2.3) and synchronized thread-safe
>> access to XAResources is described (without being explicitly called
>> out) by the JTA 1.0.1 spec.
>
> We could fairly easily just add "synchronized" to all the public
> methods. I wonder how sane it is for Glassfish to be doing that in the
> first place, though. AFAICS, in any combination of two XAResource
> methods, called concurrently, one of the threads will get an error
> anyway. For example, if two threads try to call start() at the same
> time, one of them has to fail because an XAResource can only be
> associated with a one transaction at a time.

I think there's some confusion between a thread and a logical
transaction (represented by a physical connection to the db), with an
XID managed by a Transaction Manager.

In a JEE container, it's expected that multiple threads will do work on
behalf of a single XAResource, managed by the transaction manager. A
single XID (XAResource) will have multiple threads doing work on its
behalf. This does not necessitate interleaving, but it does mean that
multiple threads can be invoking start() and end() on an XAResource.

>> * It appears that a second thread attempting to join an existing
>> XAResource's scope with start(XID, TMJOIN) is going to be refused,
>> even if it's attempting to participate in the same XID.
>> The exception thrown is one complaining about interleaving, even
>> though it's the -same- XID, not a sub-set of work in another TX.
>
> Hmm, so the application server is trying to do something like this:
>
> xares.start(1234, 0);
> xares.start(1234, TMJOIN);
>
> We could easily allow that in the driver (ie. do nothing), but that
> doesn't seem like valid sequence of calls to begin with. You must
> disassociate the XAResource from the current transaction with end(),
> before re-associating it with start().

You're correct. After doing a bunch more reading, I agree the code path
above is invalid. What should be valid (and is not considered
interleaving) is:

    Thread A                         Thread B
    --------                         ---------
    xares.start(1234, TMNOFLAGS);
    doStuff();
    xares.end(1234);
                                     xares.start(1234, TMJOIN);
                                     doStuff();
                                     xares.end(1234);
    xares.start(1234, TMJOIN);
    doStuff();
    xares.end(1234);

This holds so long as the TM is serializing execution of A and B and not
allowing branch interleaving. In this case, the XAResource is performing
work on behalf of more than one thread, but in the same XID context.

The problem I think I'm seeing at this point (still trying to coordinate
with the glassfish devs) is that the XAResource isn't fully completing
execution of end() prior to the other thread invoking start(), even
though the method invocations appear to be happening 'in order'. This
would manifest as a classic race condition, and would not constitute
transaction interleaving, since the XID in use is the same TX branch.

I'm working on a test case as part of the XAResource test suite in the
driver codebase. As I'm doing this, I'm trying to nail down how
glassfish is synchronizing access to XAResources, so this is taking me
some time.

What I can tell you is that I'm seeing exceptions in my prod environment
where currentXid.equals(xid) is true, but the state field in
XAConnection hasn't been updated by concurrent calls to start()/end() in
time to pass the interleaving pre-condition checks.
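To make the check-then-update problem concrete, here is a minimal
sketch of the kind of state tracking being discussed. It is not the
driver's code: XaStateSketch, its State enum, the int XIDs and the
runOnThread()/demo() helpers are all made up for illustration, and the
two flag constants just mirror the values in
javax.transaction.xa.XAResource. It assumes the fix would be the
"synchronized on the public methods" approach suggested above.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model for illustration only; this is NOT the code in
// PGXAConnection. The class, enum, and int XIDs are invented for the sketch.
class XaStateSketch {
    static final int TMNOFLAGS = 0;        // same value as XAResource.TMNOFLAGS
    static final int TMJOIN = 0x00200000;  // same value as XAResource.TMJOIN

    enum State { IDLE, ACTIVE, ENDED }

    private State state = State.IDLE;
    private int currentXid = -1;           // -1 means "no XID associated"

    // synchronized makes the pre-condition check and the state update one
    // atomic step, and publishes the new state to the next thread that
    // enters any synchronized method on this object.
    synchronized void start(int xid, int flags) {
        if (state == State.ACTIVE) {
            throw new IllegalStateException("start() while already ACTIVE");
        }
        if (flags == TMJOIN) {
            // Joining is only valid for the XID that was just ended.
            if (state != State.ENDED || xid != currentXid) {
                throw new IllegalStateException("TMJOIN for a different or unknown XID");
            }
        } else if (state != State.IDLE) {
            throw new IllegalStateException("new branch while another is ENDED");
        }
        currentXid = xid;
        state = State.ACTIVE;
    }

    synchronized void end(int xid) {
        if (state != State.ACTIVE || xid != currentXid) {
            throw new IllegalStateException("end() without a matching start()");
        }
        state = State.ENDED;
    }

    synchronized State state() {
        return state;
    }

    // Runs one unit of work on its own thread and waits for it,
    // standing in for a TM that serializes threads A and B.
    static void runOnThread(String name, Runnable work) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread t = new Thread(() -> {
            try { work.run(); } catch (Throwable e) { failure.set(e); }
        }, name);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (failure.get() != null) {
            throw new IllegalStateException(name + " failed", failure.get());
        }
    }

    // The A, B, A sequence, all on XID 1234.
    static State demo() {
        XaStateSketch xares = new XaStateSketch();
        runOnThread("A", () -> { xares.start(1234, TMNOFLAGS); xares.end(1234); });
        runOnThread("B", () -> { xares.start(1234, TMJOIN); xares.end(1234); });
        runOnThread("A", () -> { xares.start(1234, TMJOIN); xares.end(1234); });
        return xares.state();
    }
}
```

In demo() the hand-off between threads goes through Thread.join(), which
already orders the calls, so the sequence would pass there even without
the locking. The synchronized methods matter when the TM's hand-off does
not itself guarantee that one thread's write to the state field is
visible to the next thread's pre-condition check.
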
My current hypothesis is that GF isn't trying to do interleaving, but
the internal state field isn't being updated 'fast enough'
(thread-safely) to avoid race conditions in non-interleaved but
multi-threaded environments.

Regards,
-Bryan