Re: High Availability, Load Balancing, and Replication Feature Matrix - Mailing list pgsql-docs

From | Markus Schiltknecht
Subject | Re: High Availability, Load Balancing, and Replication Feature Matrix
Msg-id | 4735DE93.8020109@bluegap.ch
In response to | High Availability, Load Balancing, and Replication Feature Matrix (Bruce Momjian <bruce@momjian.us>)
Responses | Re: High Availability, Load Balancing, and Replication Feature Matrix
List | pgsql-docs
Hello Bruce,

Bruce Momjian wrote:
> I have added a High Availability, Load Balancing, and Replication
> Feature Matrix table to the docs:

Nice work. I appreciate your efforts in clearing up the uncertainty that surrounds this topic.

As you might have guessed, I have some complaints about the Feature Matrix. I hope this won't discourage you; I'd rather like to contribute to an improved variant.

First of all, I don't much like the negated formulations. I can see that you want a dot to mark a positive feature, but I find them hard to understand. I'm especially puzzled by "master never locks others". All of the first four, namely "Shared Disk Failover", "File System Replication", "Warm Standby" and "Master Slave Replication", block the others (the slaves) completely, which is about the worst kind of lock.

Comparing "File System Replication" with "Shared Disk Failover", you state that the former has "master server overhead" while the latter doesn't. Seen solely from the single server node, this might be true. But summed over the cluster, you have a network with quite similar load in both cases. I wouldn't say one has less overhead than the other by definition.

Then, you are mixing apples and oranges. Why should a "statement based replication solution" not require conflict resolution? You can build eager as well as lazy statement-based replication solutions; the one property has nothing to do with the other, does it? The same applies to "master slave replication" and "per table granularity", and, in the special case of (async, but eager) Postgres-R, also to "async multi-master replication" and "no conflict resolution necessary" - although I can understand that that's a pretty nifty difference.

Given that the matrix focuses on practically available solutions, I can see some value in it. But from a more theoretical viewpoint, I find it pretty confusing.
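To make the eager/lazy distinction above concrete, here is a minimal, hypothetical sketch (all class and function names are invented for illustration, not any real product's API): an eager scheme coordinates with every replica before the commit returns, while a lazy scheme commits locally first and defers propagation, which is where conflict resolution comes in.

```python
# Hypothetical sketch of eager vs. lazy replication (illustrative only).

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

def eager_commit(replicas, key, value):
    # Eager: coordination happens *before* the commit succeeds -- every
    # replica applies the change, or the transaction would have to abort.
    for r in replicas:
        r.apply(key, value)
    return "committed everywhere"

def lazy_commit(master, replicas, key, value):
    # Lazy: the master commits immediately ...
    master.apply(key, value)
    # ... and propagation, plus any conflict detection, is deferred.
    conflicts = 0
    for r in replicas:
        if r.data.get(key, value) != value:
            conflicts += 1  # a real system would resolve the conflict here
        r.apply(key, value)
    return conflicts

master = Replica("master")
slaves = [Replica("s1"), Replica("s2")]
eager_commit([master] + slaves, "x", 1)
n = lazy_commit(master, slaves, "x", 2)  # both slaves still held x=1
```

Note that the sketch is deliberately agnostic about the replication level (statement, WAL, block device): eager vs. lazy is a property of *when* coordination happens, independent of *what* is shipped.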
Now, if you want a practically usable feature comparison table, I'd strongly vote for clearly mentioning the products you have in mind - otherwise the table pretends to be something it is not. If it should be theoretically correct without mentioning available solutions, I'd rather vote for explaining the terms and concepts.

To clarify my viewpoint, I'll quickly go over the features you mention and associate them with the concepts as I understand them.

- special hardware: always nice, but of little theoretical effect; a network is a network, storage is storage.
- multiple masters: that's what single- vs. multi-master is about: writing transactions. Can be mixed with eager/lazy; every combination makes sense for certain applications.
- overhead: replication by definition generates overhead; the question is how much, and where.
- locking of others: again, a question of how much and how fine-grained the locking is. In a single-master replication solution, the slaves are locked completely. In a lazy solution, the locking is deferred until after the commit, during conflict resolution. In an eager solution, the locking needs to take place before the commit. But all replication systems need some kind of locks!
- data loss on fail: depends solely on eager/lazy (given real replication with a replica, which shared storage does not provide, IMO).
- slaves read only: theoretically possible with every replication system, be it lazy or eager, single- or multi-master. That we are unable to read from slave nodes is an implementation annoyance of Postgres, if you want.
- per table gran.: again, independent of lazy/eager, single-/multi-master. Depends solely on the level at which data is replicated: block device, file system, statement, WAL, or other internal format.
- conflict resol.: in multi-master systems, this depends on the lazy/eager property. Single-master systems obviously never need to resolve conflicts.

IMO, "data partitioning" is entirely perpendicular to replication. The two can be combined in various ways: there's horizontal and vertical partitioning, and eager/lazy and single-/multi-master replication. I guess we could find a use case for most of the combinations thereof. (Kudos for finding a combination which definitely has no use case.)

Well, these are my theories; do with them whatever you like. Comments appreciated.

Kind regards

Markus
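The claim that partitioning is perpendicular to replication can be sketched in a few lines (a toy model under my own assumptions; the hashing scheme and function names are invented): rows are first partitioned horizontally across nodes, and each partition is then replicated independently, showing that the two choices compose freely.

```python
# Toy model: horizontal partitioning composed with replication.

def horizontal_partition(rows, n_nodes):
    # Assign each row to a node by hashing its key (illustrative scheme;
    # vertical partitioning would instead split columns, not rows).
    parts = [[] for _ in range(n_nodes)]
    for row in rows:
        parts[hash(row["id"]) % n_nodes].append(row)
    return parts

def replicate(partition, n_replicas):
    # Each partition gets its own replica set -- whether those replicas
    # are kept eager or lazy, single- or multi-master, is an independent
    # choice made per partition.
    return [list(partition) for _ in range(n_replicas)]

rows = [{"id": i, "val": i * 10} for i in range(6)]
parts = horizontal_partition(rows, 2)
cluster = [replicate(p, 3) for p in parts]
```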