Re: [HACKERS] WIP: Failover Slots - Mailing list pgsql-hackers

From:            Robert Haas
Subject:         Re: [HACKERS] WIP: Failover Slots
Date:
Msg-id:          CA+Tgmob=sT4am9k71Rz465vBj9-hWoMdrFLrP4aE5Nr3pkmYBw@mail.gmail.com
In response to:  Re: [HACKERS] WIP: Failover Slots (Craig Ringer <craig@2ndquadrant.com>)
Responses:       Re: [HACKERS] WIP: Failover Slots
List:            pgsql-hackers

On Tue, Aug 8, 2017 at 4:00 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> - When a standby connects to a master, it can optionally supply a list
>> of slot names that it cares about.
>
> Wouldn't that immediately exclude use for PITR and snapshot recovery? I have
> people right now who want the ability to promote a PITR-recovered snapshot
> into place of a logical replication master and have downstream peers replay
> from it. It's more complex than that, as there's a resync process required
> to recover changes the failed node had sent to other peers but isn't
> available in the WAL archive, but that's the gist.
>
> If you have a 5TB database, do you want to run an extra replica or two
> because PostgreSQL can't preserve slots without a running, live replica?
> Your SAN snapshots + WAL archiving have been fine for everything else so
> far.

OK, so what you're basically saying here is that you want to encode the
failover information in the write-ahead log rather than passing it at the
protocol level, so that if you replay the write-ahead log on a time delay
you get the same final state that you would have gotten if you had replayed
it immediately.  I hadn't thought about that potential advantage, and I can
see that it might be an advantage for some reason, but I don't yet
understand what the reason is.  How would you imagine using any version of
this feature in a PITR scenario?  If you PITR the master back to an earlier
point in time, I don't see how you're going to manage without resyncing the
replicas, at which point you may as well just drop the old slot and create
a new one anyway.  Maybe you're thinking of a scenario where we PITR the
master and also use PITR to rewind the replica to a slightly earlier point?
But I can't quite follow what you're thinking about.  Can you explain
further?

> Requiring live replication connections could also be an issue for service
> interruptions, surely? Unless you persist needed knowledge in the physical
> replication slot used by the standby-to-master connection, so the master
> can tell the difference between "downstream went away for a while but will
> come back" and "downstream is gone forever, toss out its resources."

I don't think the master needs to retain any resources on behalf of the
failover slot.  If the slot has been updated by feedback from the
associated standby, then the master can toss those resources immediately.
When the standby comes back online, it will find out via a protocol message
that it can fast-forward the slot to whatever the new LSN is, and any WAL
files before that point are irrelevant on both the master and the standby.

> Also, what about cascading? Lots of "pull" model designs I've looked at
> tend to fall down in cascaded environments. For that matter so do failover
> slots, but only for the narrower restriction of not being able to actually
> decode from a failover-enabled slot on a standby; they still work fine in
> terms of cascading down to leaf nodes.

I don't see the problem.  The cascaded standby tells the standby "I'm
interested in the slot called 'craig'" and the standby says "sure, I'll
tell you whenever 'craig' gets updated", but it turns out that 'craig' is
actually a failover slot on that standby, so that standby has said to the
master "I'm interested in the slot called 'craig'" and the master is
therefore sending updates to that standby.  Every time the slot is updated,
the master tells the standby and the standby tells the cascaded standby
and, well, that all seems fine.
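
In case it helps to be concrete, here is a toy model of the flow I have in
mind.  This is plain Python, not anything from the PostgreSQL tree; Node,
subscribe(), and advance_slot() are names invented purely for illustration
of the protocol-mediated scheme, not a proposed implementation.

# Toy sketch of protocol-mediated failover slots in a cascade.
# All names here are made up for illustration.

class Node:
    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream   # physical replication upstream, if any
        self.slots = {}            # slot name -> last confirmed LSN here
        self.watchers = {}         # slot name -> downstreams that asked for it

    def create_slot(self, slot, lsn=0):
        # The "real" slot, e.g. a logical slot on the master.
        self.slots[slot] = lsn

    def subscribe(self, slot, downstream):
        # Downstream says "I'm interested in the slot called <slot>".
        self.watchers.setdefault(slot, []).append(downstream)
        if slot not in self.slots and self.upstream is not None:
            # We don't own the slot, so it is a failover slot here: pass
            # the interest one more level up the cascade.
            self.slots[slot] = 0
            self.upstream.subscribe(slot, self)
        # On (re)connect the downstream is immediately fast-forwarded to
        # our current position; no extra WAL is retained on its behalf.
        downstream.slot_updated(slot, self.slots[slot])

    def advance_slot(self, slot, lsn):
        # The slot's owner moved it forward (decoding client confirmed lsn).
        self.slots[slot] = max(self.slots[slot], lsn)
        self._notify(slot)

    def slot_updated(self, slot, lsn):
        # Protocol message from upstream: the slot moved forward.
        self.slots[slot] = max(self.slots.get(slot, 0), lsn)
        self._notify(slot)

    def _notify(self, slot):
        for node in self.watchers.get(slot, []):
            node.slot_updated(slot, self.slots[slot])


master = Node("master")
standby = Node("standby", upstream=master)
cascaded = Node("cascaded", upstream=standby)

master.create_slot("craig", lsn=100)
standby.subscribe("craig", downstream=cascaded)  # cascaded asks the standby,
                                                 # the standby asks the master
master.advance_slot("craig", 250)                # update ripples down the chain

for n in (master, standby, cascaded):
    print(n.name, n.slots["craig"])              # all three print 250
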

Also, as Andres pointed out upthread, if the state is passed through the
protocol, you can have a slot on a standby that cascades to a cascaded
standby; if the state is passed through the WAL, all slots have to cascade
from the master.  Generally, with protocol-mediated failover slots, you can
have a different set of slots on every replica in the cluster and create,
drop, and reconfigure them any time you like.  With WAL-mediated slots, all
failover slots must come from the master and cascade to every standby
you've got, which is less flexible.

I don't want to come on too strong here.  I'm very willing to admit that
you may know a lot more about this than I do, and I am really extremely
happy to benefit from that accumulated knowledge.  If you're saying that
WAL-mediated slots are a lot better than protocol-mediated slots, you may
well be right, but I don't yet understand the reasons, and I want to
understand the reasons.  I think this stuff is too important to have one
person saying "here's a patch that does it this way" and everybody else
just saying "uh, OK".  Once we adopt some proposal here, we're going to
have to continue supporting it forever, so it seems like we'd better do our
best to get it right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company