Re: nested transactions - Mailing list pgsql-hackers
From | Manfred Koizar |
---|---|
Subject | Re: nested transactions |
Date | |
Msg-id | nv4cuu8sj987ipagso2faoh1soqarj2qm7@4ax.com |
In response to | Re: nested transactions (Bruce Momjian <pgman@candle.pha.pa.us>) |
Responses |
Re: nested transactions
Re: nested transactions |
List | pgsql-hackers |
On Wed, 27 Nov 2002 22:47:33 -0500 (EST), Bruce Momjian
<pgman@candle.pha.pa.us> wrote:
>The interesting issue is that if we could set the commit/abort bits all
>at the same time, we could have the parent/child dependency local to the
>backend --- other backends don't need to know the parent, only the
>status of the (subtransaction's) xid, and they need to see all those
>xid's committed at the same time.

You mean the commit/abort bit in the tuple headers?  Yes, this would be
interesting, but I see no way this could be done.  If it could, there
would be no need for pg_clog.

Reading your paragraph above one more time, I think you mean the bits
in pg_clog:  Each subtransaction gets its own xid.  On ROLLBACK the
abort bits of the aborted (sub)transaction and all its children are set
in pg_clog immediately.  This operation does not have to be atomic.  On
subtransaction COMMIT nothing happens to pg_clog; the status is only
changed locally, and the subtransaction still looks "in progress" to
other backends.  Only when the main transaction commits do we set the
commit bits of the main transaction and all its non-aborted children in
pg_clog.  This action has to be atomic.  Right?

AFAICS the problem lies in updating several pg_clog bits at once.  How
can this be done without holding a potentially long-lasting lock?

>You could store the backend slot id in pg_clog rather than the parent
>xid and look up the status of the outer xid for that backend slot. That
>would allow you to use 2 bytes, with a max of 16k backends. The problem
>is that on a crash, the pg_clog points to invalid slots --- it would
>probably have to be cleaned up on startup.

Again I would try to keep pg_clog compact and store the backend slots
in another file, thus not slowing down instances where subtransactions
are not used.  Apart from this minor detail, I don't see how this is
supposed to work.  Could you elaborate?

>But still, you have an interesting idea of just setting the bit to be "I
>am a child".

The idea was to set subtransaction bits in the tuple header.  Here is
yet another idea:  Let the currently unused fourth state in pg_clog
indicate a committed subtransaction.  There are two bits per
transaction, commit and abort, with the following meaning:

  a  c
  0  0   transaction in progress, the owning backend knows whether it
         is a main- or a sub-transaction, other backends don't care
  1  0   aborted, nobody cares whether main- or sub-transaction
  0  1   committed main-transaction (*)
  1  1   committed sub-transaction, have to look for parent in
         pg_subtrans

If we allow the 1/1 state to be replaced with 0/1 or 1/0 (on the fly as
a side effect of a visibility check, or by vacuum, or by
COMMIT/ROLLBACK), this could save a lot of parent lookups without
having to touch the xids in the tuple headers.

So (*) should read: committed main-transaction or committed
sub-transaction having a committed parent.

>The trick is allowing backends to figure out who's child
>you are.  We could store this somehow in shared memory, but that is
>finite and there can be lots of xid's for a backend using
>subtransactions.

The subtrans dependencies have to be visible to all backends.  Store
them to disk just like pg_clog.  In older proposals I spoke of a
pg_subtrans "table" containing (parent, child) pairs.  This was only
meant as a concept, not as a real SQL table subject to MVCC.  An
efficient(?) implementation could be an array of parent xids, indexed
by child xid.  Most of it can be stolen from the clog code.

One more argument for pg_subtrans being visible to all backends:  If an
UPDATE is about to change a tuple touched by another active
transaction, it waits for the other transaction to commit or abort.  We
must always wait for the main transaction, not the subtrans.

>I still think there must be a clean way,

I hope so ...

> but I haven't figured it out yet.

Are we getting nearer?

Servus
 Manfred
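
For illustration, the fourth-state idea and the parent array described above could be sketched roughly as follows.  This is not PostgreSQL source code.  All names (Xid, clog_state, subtrans_parent, resolve_status, top_level_xid) are hypothetical, the arrays are small in-memory stand-ins for the on-disk pg_clog and the proposed pg_subtrans, one byte per xid is used instead of two bits, and locking and crash recovery are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; a toy model, not PostgreSQL code. */
typedef uint32_t Xid;

#define MAX_XID     64   /* toy capacity of the in-memory arrays */
#define INVALID_XID 0    /* "no parent", i.e. a main transaction */

/* Two status bits per xid, encoded as (abort << 1) | commit. */
enum {
    XS_IN_PROGRESS   = 0x0,  /* a=0 c=0 */
    XS_COMMITTED     = 0x1,  /* a=0 c=1: main xact, or sub with committed parent */
    XS_ABORTED       = 0x2,  /* a=1 c=0 */
    XS_SUB_COMMITTED = 0x3   /* a=1 c=1: must look up the parent */
};

static uint8_t clog_state[MAX_XID];      /* stand-in for pg_clog */
static Xid     subtrans_parent[MAX_XID]; /* stand-in for pg_subtrans:
                                            parent xid, indexed by child xid */

void set_status(Xid xid, uint8_t status)
{
    assert(xid < MAX_XID);
    clog_state[xid] = status;
}

void set_parent(Xid child, Xid parent)
{
    assert(child < MAX_XID && parent < MAX_XID);
    subtrans_parent[child] = parent;
}

/* Status exactly as stored, without any parent lookup. */
uint8_t raw_status(Xid xid)
{
    assert(xid < MAX_XID);
    return clog_state[xid];
}

/*
 * Status as another backend would see it.  A 1/1 entry takes the status
 * of its parent.  Once the parent is finally committed or aborted, the
 * 1/1 entry is overwritten with 0/1 or 1/0 as a side effect, so later
 * checks need no parent lookup.
 */
uint8_t resolve_status(Xid xid)
{
    assert(xid < MAX_XID);
    uint8_t st = clog_state[xid];
    if (st != XS_SUB_COMMITTED)
        return st;

    uint8_t parent_st = resolve_status(subtrans_parent[xid]);
    if (parent_st == XS_COMMITTED || parent_st == XS_ABORTED)
        clog_state[xid] = parent_st;
    return parent_st;
}

/* The xid a waiting UPDATE should wait for: the main transaction. */
Xid top_level_xid(Xid xid)
{
    assert(xid < MAX_XID);
    while (subtrans_parent[xid] != INVALID_XID)
        xid = subtrans_parent[xid];
    return xid;
}
```

In this sketch a committed subtransaction keeps looking "in progress" to other backends until its main transaction commits, and only the main transaction's entry has to change at commit time; whether that is enough to avoid the multi-bit atomic update asked about earlier is left open here.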