Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node - Mailing list pgsql-hackers
From | Aidan Van Dyk |
---|---|
Subject | Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node |
Date | |
Msg-id | CAC_2qU_VAfwyaxNLL6qQuC-9upnKYrad39wTRxuSnwP-jFQyPQ@mail.gmail.com |
In response to | Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node (Andres Freund <andres@2ndquadrant.com>) |
Responses | Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node |
List | pgsql-hackers |
On Wed, Jun 20, 2012 at 3:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:

> To recap why we think origin_id is a sensible design choice:
>
> There are many sensible replication topologies where it does make sense that
> you want to receive changes (on node C) from one node (say B) that originated
> from some other node (say A).
> Reasons include:
> * the order of applying changes should be as similar as possible on all nodes.
> That means when applying a change on C that originated on B and if changes
> replicated faster from A->B than from A->C you want to be at least as far with
> the replication from A as B was. Otherwise the conflict ratio will increase.
> If you can recreate the stream from the WAL of every node and still detect
> where an individual change originated, that's easy.

OK, so in this case, I still don't see how the "origin_id" is even
enough. C applies the change originally from A (routed through B,
because it's faster). But when it gets the change directly from A,
how does it know to *not* apply it again?

> * the interconnects between some nodes may be more expensive than from others
> * an interconnect between two nodes may fail but others don't
>
> Because of that we think it's sensible to be able to generate the full LCR stream
> with all changes, local and remote ones, on each individual node. If you then
> can filter on individual origin_id's you can build complex replication
> topologies without much additional complexity.
>
>> I'm not saying that we need to implement all possible conflict
>> resolution algorithms right now - on the contrary I think conflict
>> resolution belongs outside core - but if we're going to change the WAL
>> record format to support such conflict resolution, we better make sure
>> the foundation we provide for it is solid.
> I think this already provides a lot. At some point we probably want to have
> support for looking up on which node a certain local xid originated and when that
> was originally executed. While querying that efficiently requires additional
> support we already have all the information for that.
>
> There are some more complexities with consistently determining conflicts on
> changes that happened in a very small time window on different nodes but that's
> something for another day.
>
>> BTW, one way to work around the lack of origin id in the WAL record
>> header is to just add an origin-id column to the table, indicating the
>> last node that updated the row. That would be a kludge, but I thought
>> I'd mention it..
> Yuck. The aim is to improve on what's done today ;)
>
> --
> Andres Freund                     http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
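For readers following the A/B/C example, here is a rough sketch of the two ideas in play: filtering a node's full change stream by origin_id, and skipping a change that has already arrived by another route. It is not the patch's code. The `Change` record, the per-origin sequence number `seq`, and the `Applier` class are all hypothetical names invented for illustration, and per-origin progress tracking is only one assumed way to answer the "how does it know to *not* apply it again" question. It also assumes each origin's changes arrive in order.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Set


@dataclass(frozen=True)
class Change:
    origin_id: str  # node where the change was first executed
    seq: int        # hypothetical per-origin sequence number (assumption)
    payload: str


def filter_by_origin(stream: Iterable[Change], wanted: Set[str]) -> Iterator[Change]:
    """Yield only the changes whose origin is in `wanted`."""
    return (c for c in stream if c.origin_id in wanted)


@dataclass
class Applier:
    """Hypothetical applier that remembers, per origin, how far it has applied."""
    applied_up_to: Dict[str, int] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def apply(self, change: Change) -> bool:
        # Skip anything at or below the recorded position for this origin.
        if change.seq <= self.applied_up_to.get(change.origin_id, 0):
            return False
        self.applied_up_to[change.origin_id] = change.seq
        self.log.append(change.payload)
        return True


# B's full stream contains local changes plus changes relayed from A and C.
stream_from_b = [
    Change("A", 1, "a1"),
    Change("B", 1, "b1"),
    Change("C", 1, "c1"),  # C's own change coming back; filtered out below
    Change("A", 2, "a2"),
]
# The slower direct link from A delivers the same A changes later, plus one more.
stream_from_a = [Change("A", 1, "a1"), Change("A", 2, "a2"), Change("A", 3, "a3")]

node_c = Applier()
for ch in filter_by_origin(stream_from_b, {"A", "B"}):
    node_c.apply(ch)
for ch in filter_by_origin(stream_from_a, {"A"}):
    node_c.apply(ch)

print(node_c.log)
```

In this sketch the origin filter keeps C from re-applying its own changes when they come back via B, while the per-origin position lets C ignore `a1` and `a2` when they arrive a second time over the direct link from A.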