Re: eXtensible Transaction Manager API - Mailing list pgsql-hackers
From | Konstantin Knizhnik |
---|---|
Subject | Re: eXtensible Transaction Manager API |
Date | |
Msg-id | 563E2C8C.5000204@postgrespro.ru |
In response to | Re: eXtensible Transaction Manager API (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: eXtensible Transaction Manager API
Re: eXtensible Transaction Manager API Re: eXtensible Transaction Manager API |
List | pgsql-hackers |
Hi,
Thank you for your feedback.
My comments are inside.
On 11/07/2015 05:11 PM, Amit Kapila wrote:
Today, while studying your proposal and related material, I noticed that in both the approaches DTM and tsDTM, you are talking about committing a transaction and acquiring the snapshot consistently, but not touched upon the how the locks will be managed across nodes and how deadlock detection across nodes will work. This will also be one of the crucial points in selecting one of the approaches.
The lock manager is one of the tasks we are currently working on.
There are still a lot of open questions:
1. Should the distributed lock manager (DLM) do anything besides detecting distributed deadlocks?
2. Should the DLM be part of the XTM API, or should it be a separate API?
3. Should the DLM be implemented as a separate process, or should it be part of the arbiter (dtmd)?
4. How do we globally identify resource owners (transactions) in the global lock graph? In the case of DTM we have global (shared) XIDs,
and in tsDTM we have global transaction IDs assigned by the application (and it is not so clear how to retrieve them).
In other cases we may need a local->global transaction ID mapping, so it looks like the DLM should be part of the DTM...
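To make question 4 more concrete, here is a minimal sketch of how distributed deadlock detection could work once every node reports its wait-for edges in terms of global transaction IDs. It is not part of the XTM API or of dtmd; the names GlobalXID, WaitForGraph, Merge and FindCycle are hypothetical and exist only for illustration. The detector merges the per-node graphs and looks for a cycle with a depth-first search.

```go
package main

// GlobalXID is a hypothetical global transaction identifier (a shared XID
// in DTM, or an application-assigned global transaction ID in tsDTM).
type GlobalXID string

// WaitForGraph maps a waiting transaction to the transactions holding the
// locks it is waiting for.
type WaitForGraph map[GlobalXID][]GlobalXID

// Merge adds the wait-for edges reported by one node to the global graph.
func (g WaitForGraph) Merge(local WaitForGraph) {
	for waiter, holders := range local {
		g[waiter] = append(g[waiter], holders...)
	}
}

// FindCycle returns the transactions forming one wait-for cycle, or nil if
// the graph is acyclic (no deadlock).
func (g WaitForGraph) FindCycle() []GlobalXID {
	const (
		unvisited = iota
		onPath
		done
	)
	state := make(map[GlobalXID]int)
	var path []GlobalXID

	var visit func(x GlobalXID) []GlobalXID
	visit = func(x GlobalXID) []GlobalXID {
		state[x] = onPath
		path = append(path, x)
		for _, next := range g[x] {
			switch state[next] {
			case onPath:
				// Back edge: the cycle is the suffix of the current path
				// that starts at next.
				for i, p := range path {
					if p == next {
						return append([]GlobalXID(nil), path[i:]...)
					}
				}
			case unvisited:
				if c := visit(next); c != nil {
					return c
				}
			}
		}
		state[x] = done
		path = path[:len(path)-1]
		return nil
	}

	for x := range g {
		if state[x] == unvisited {
			if c := visit(x); c != nil {
				return c
			}
		}
	}
	return nil
}
```

For example, if node 1 reports that T1 waits for T2 and node 2 reports that T2 waits for T1, neither node sees a local cycle, but the merged graph contains one. This only works if both nodes use the same identifier for the same transaction, which is why the local->global mapping matters.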
Also I have noticed that discussion about Rollback is not there, example how will Rollback happen with API's provided in your second approach (tsDTM)?
In the tsDTM approach, two-phase commit is performed by the coordinator and currently uses the standard PostgreSQL two-phase commit.
Go code performing the two-phase commit:
exec(conn1, "prepare transaction '" + gtid + "'")
exec(conn2, "prepare transaction '" + gtid + "'")
exec(conn1, "select dtm_begin_prepare($1)", gtid)
exec(conn2, "select dtm_begin_prepare($1)", gtid)
csn = _execQuery(conn1, "select dtm_prepare($1, 0)", gtid)
csn = _execQuery(conn2, "select dtm_prepare($1, $2)", gtid, csn)
exec(conn1, "select dtm_end_prepare($1, $2)", gtid, csn)
exec(conn2, "select dtm_end_prepare($1, $2)", gtid, csn)
exec(conn1, "commit prepared '" + gtid + "'")
exec(conn2, "commit prepared '" + gtid + "'")
If the commit fails at some of the nodes, the coordinator should roll back the prepared transaction at all nodes.
Similarly, having some discussion on parts of recovery that could be affected would be great.
We are currently implementing fault tolerance and recovery for the DTM approach (with a centralized arbiter).
There are several replicas of the arbiter, synchronized using the Raft protocol.
But with the tsDTM approach the recovery model is still unclear...
We are thinking about it.
I think in this patch, it is important to see the completeness of all the API's that needs to be exposed for the implementation of distributed transactions and the same is difficult to visualize without having complete picture of all the components that has some interaction with the distributed transaction system. On the other hand we can do it in incremental fashion as and when more parts of the design are clear.
That is exactly what we are going to do: we are trying to integrate DTM with existing systems (pg_shard, postgres_fdw, BDR) and find out what is missing and should be added. In parallel we are trying to compare the efficiency and scalability of different solutions.
For example, we are still considering scalability problems with the tsDTM approach: to provide acceptable performance, it requires very precise clock synchronization (we have to use PTP instead of NTP). So it may be a waste of time trying to provide fault tolerance for tsDTM if we eventually find out that this approach cannot provide better scalability than the simpler DTM approach.