Humming consensus, an animated example with an asymmetric network partition

Hi, all.  It’s been a happy & productive day working on some new code.  I’ve done it brute-force style, but it’s nice to see results.  Visual images are cool, and animation is very cool.  My animation quality is horrible, so that reduces coolness to something mediocre.  But I’m still excited because it’s a first!  It’s my first animated case study of humming consensus.  It’s also my first description of a case study, please don’t let that stop you from reading more.

See http://www.snookles.com/scotttmp/MyFirstMachiAnimation.mov for a very crude animation of a simulated network partition scenario. The animation is in QuickTime format and is about 3.3 MBytes. It is served by a 5 Mbit/sec network link, please be gentle & patient.

Here’s what the colors mean:

  • Green = UPI/fully-in-sync part of the chain
  • Purple = “Repair” part of the chain, i.e., servers whose data isn’t yet fully in sync with the green part of the chain.
  • Grey = A limited # of older epochs are superimposed to demonstrate disagreement at this instant in time.

In this case study, we have an asymmetric partition: ‘c’ can’t send messages to ‘a’, but all other message passing is OK.

We are seeing a view of “private” epochs.  (This algorithm separates “public” vs. “private” epochs, sorry, it’s out-of-scope to describe the difference here.)  The churn/thrashing that we see is disagreement by different “authors”. Each author is a participating server, a or b or c or d.  The constant disagreement is a sign that humming consensus uses to discover asymmetric messaging patterns.  I wish I understood why exactly this disagreement pattern always signals asymmetric messaging, but as far as I can tell, it discovers asymmetric messaging 100% of the time.

Not only is there disagreement between different views of the cluster; it is a particular kind of disagreement.   Each author server makes the same proposal over & over again.

  • Author a: [b,a,c,d], epochs=524,528,530,532
  • Author b: [b,a,c,d], epochs=522,525,527,529,531
  • Author c: [b,c,d], epochs=521,523,526
  • Author d: [b,a,c,d], epochs=533 … The exact same proposal by d is made in epochs 514, 516, and 518, but those epochs aren’t included in the animation, sorry!

This pattern has been going on for a while.  To shorten the animation, I’ve omitted many of the earlier epoch proposals.

When the network partition disappears at epoch=533, everyone quickly makes the same decision on how to heal the cluster.

Here are some additional notes about how to interpret the diagrams in the animation:

  • The number of edges show the agreement (or lack thereof) of which epoch proposal is “in use” by a participant. The letter labels at the base of each arrow indicate the source/user of that edge.
  • The latter part of the animation appears to go more slowly than the first part.  It’s a bit difficult to tell, but the epoch numbers and author name are indeed changing.  The algorithm prefers proposals by servers with “bigger” names.  The name ‘d’ is bigger than ‘a’, so when ‘a’ makes a proposal, ‘b’ and/or ‘c’ and/or ‘d’ might make the same proposal with a new & bigger epoch, just because they can.  (There’s a useful reason why the re-proposals are useful, but that reason isn’t obvious here, sorry!)
  • No, there is no replication to one’s self, e.g. epoch=536. I’ve put that there to try extra emphasis that it’s a chain of length 1.
  • I’ll have to demonstrate an example of a symmetric partition scenario elsewhere.  The humming consensus algorithm has no problem dealing with symmetric partitions, e.g., node X cannot send or receive messages to all other nodes in the cluster.  There is no churn, no indecision.
  • The code that I’ve been working on to create these animation frames can be found at https://github.com/basho/machi/commit/2a6e9e2e5c3137053329dd581ded23916f4aacfc.  (The current working branch is called slf/projection-browser.) The code there creates GraphViz “dot” files, which I then viewed using the OS X “Preview” app.  While using the “QuickTime Viewer” app to record a portion of my screen, I manually pressed the arrow keys to show different “frames” of the animation.

I’ll have another blog posting that will try to describe several things that I’ve omitted here: what a “public” or “private” projection is, and how the pattern of distributed disagreement is used to detect and compensate for asymmetric partitions.  Oh, yeah, and what an example of symmetric network partition looks like.

For more info, try the following links:

EDIT: 2014-10-17 12pm: Added “More info” section.

EDITED: 2014-10-16 12pm: Clarify that author d makes the same proposal and so is “flapping” in the same way, but those epochs aren’t included in the animation.

This entry was posted in Uncategorized and tagged , . Bookmark the permalink.