Re: Global snapshots - Mailing list pgsql-hackers

From:           Andrey Borodin
Subject:        Re: Global snapshots
Msg-id:         19F4097D-0525-4C4E-A40F-2C8C2B0787CF@yandex-team.ru
In response to: Re: Global snapshots (Arseny Sher <a.sher@postgrespro.ru>)
List:           pgsql-hackers
Hi! I want to review this patch set, though I understand that it will probably be quite a long process.

I like the idea that with this patch set all Postgres instances are universally bound into a single distributed DB, even if they have never heard about each other before :) This is just amazing. Or do I get something wrong?

I've got a few questions:

1. If we coordinate HA clusters with replicas, can replicas participate if their part of the transaction is read-only?
2. How does an InDoubt transaction behave when we add or subtract leap seconds?

Also, I could not understand some notes from Arseny:

> On 25 Jul 2018, at 16:35, Arseny Sher <a.sher@postgrespro.ru> wrote:
>
> * One drawback of these patches is that only REPEATABLE READ is
>   supported. For READ COMMITTED, we must export every new snapshot
>   generated on coordinator to all nodes, which is fairly easy to
>   do. SERIALIZABLE will definitely require chattering between nodes,
>   but that's much less demanded isolevel (e.g. we still don't support
>   it on replicas).

If all shards are executing the transaction in SERIALIZABLE, what anomalies does it permit? Say there are transactions 1 and 2 running on servers A and B: transaction A1 is serialized before A2, but B1 is serialized after B2, right? Maybe we can somehow abort 1 or 2?

> * Another somewhat serious issue is that there is a risk of recency
>   guarantee violation. If client starts transaction at node with
>   lagging clocks, its snapshot might not include some recently
>   committed transactions; if client works with different nodes, she
>   might not even see her own changes. CockroachDB describes at [1] how
>   they and Google Spanner overcome this problem. In short, both set
>   hard limit on maximum allowed clock skew. Spanner uses atomic
>   clocks, so this skew is small and they just wait it at the end of
>   each transaction before acknowledging the client.
> In CockroachDB, if tuple is not visible but we are unsure whether it
> is truly invisible or it's just the skew (the difference between
> snapshot and tuple's csn is less than the skew), transaction is
> restarted with advanced snapshot. This process is not infinite because
> the upper border (initial snapshot + max skew) stays the same; this is
> correct as we just want to ensure that our xact sees all the committed
> ones before it started. We can implement the same thing.

I think that this situation is also covered in Clock-SI, since transactions will not exit the InDoubt state before we can see them. But I'm not sure; chances are that I get something wrong, so I'll think more about it. I'd be happy to hear comments from Stas about this.

> * 003_bank_shared.pl test is removed. In current shape (loading one
>   node) it is useless, and if we bombard both nodes, deadlock surely
>   appears. In general, global snapshots are not needed for such
>   multimaster-like setup -- either there are no conflicts and we are
>   fine, or there is a conflict, in which case we get a deadlock.

Can we do something with this deadlock? Will placing an upper limit on the time spent in the InDoubt state fix the issue? I understand that aborting automatically is kind of dangerous...

Also, a hanging 2PC transaction can currently cause a lot of headache for the DBA. Can we have some kind of protection for the case when one node is gone permanently during a transaction?

Thanks!

Best regards, Andrey Borodin.
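[Editorial aside] The SERIALIZABLE question above (A1 serialized before A2 on one shard, B1 after B2 on another) can be made concrete with a tiny simulation. This is an illustration only, with invented names, and is not taken from the patch set: two transactions each read on one shard and write on the other; each shard's local history is serializable, yet the combined outcome matches no single serial order.

```python
# Hypothetical illustration: x lives on shard A, y on shard B.
# T1: read x, then write y = 1.   T2: read y, then write x = 1.
# Shard A orders T1 before T2; shard B orders T2 before T1.

def run_cross_shard():
    x, y = 0, 0
    t1_read_x = x   # T1 reads x on shard A (before T2's write there)
    t2_read_y = y   # T2 reads y on shard B (before T1's write there)
    y = 1           # T1 writes y on shard B
    x = 1           # T2 writes x on shard A
    return t1_read_x, t2_read_y

def serial(order):
    """What each transaction would read under a true serial execution."""
    x, y = 0, 0
    reads = {}
    for t in order:
        if t == "T1":
            reads["T1"] = x
            y = 1
        else:
            reads["T2"] = y
            x = 1
    return reads["T1"], reads["T2"]
```

Here `run_cross_shard()` yields `(0, 0)`, while `serial(["T1", "T2"])` yields `(0, 1)` and `serial(["T2", "T1"])` yields `(1, 0)`: the cross-shard execution matches neither serial order, even though each shard alone saw a serializable history.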
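[Editorial aside] The CockroachDB-style uncertainty restart quoted above can be sketched roughly as follows. All names (`MAX_CLOCK_SKEW`, `UncertainRead`, the CSN comparisons) are assumptions made for illustration; nothing here is taken from the patch set or from the CockroachDB sources.

```python
# Sketch: restart a transaction with an advanced snapshot when a tuple's
# commit CSN falls inside the snapshot's uncertainty window. The upper
# border (initial snapshot + max skew) is fixed, so restarts are finite.

MAX_CLOCK_SKEW = 250  # assumed hard bound on inter-node clock skew (ms)

class UncertainRead(Exception):
    """Tuple committed inside the snapshot's uncertainty window."""
    def __init__(self, tuple_csn):
        self.tuple_csn = tuple_csn

def visible(snapshot_csn, upper_bound, tuple_csn):
    if tuple_csn <= snapshot_csn:
        return True    # committed before our snapshot: visible
    if tuple_csn > upper_bound:
        return False   # committed clearly after we started: invisible
    # Between snapshot and snapshot + max skew: might be clock skew,
    # so we cannot decide -- force a restart with an advanced snapshot.
    raise UncertainRead(tuple_csn)

def run_transaction(body, read_clock):
    snapshot_csn = read_clock()
    upper_bound = snapshot_csn + MAX_CLOCK_SKEW  # fixed for the xact
    while True:
        try:
            return body(lambda csn: visible(snapshot_csn, upper_bound, csn))
        except UncertainRead as e:
            # Advance the snapshot past the offending commit; the upper
            # bound stays the same, so this loop terminates.
            snapshot_csn = e.tuple_csn
```

For example, a transaction starting at clock 100 that reads a tuple with commit CSN 200 restarts once (200 is inside the window 100..350) and then sees the tuple on the retry.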