Skip to content

Latest commit

 

History

History
258 lines (154 loc) · 16.4 KB

Lecture 15.md

File metadata and controls

258 lines (154 loc) · 16.4 KB

Distributed Systems Lecture 15

Lecture Given by Lindsey Kuper on May 4th, 2020 via YouTube

Previous Next
Lecture 14 Lecture 16

Course Admin...

...snip...

Read Amazon's Dynamo paper and Google's MapReduce paper.

...snip...

Problematic Exam Question: Chandy-Lamport Snapshot Bug

The following diagram shows a buggy implementation of the Chandy-Lamport snapshot algorithm.

Process P2 initiates the snapshot, but then something goes wrong. Where's the bug?

Chandy-Lamport Snapshot Bug

The Chandy-Lamport algorithm assumes FIFO delivery of all messages — irrespective of whether they are application or marker messages; so, if we trace through the steps shown in the diagram, we can discover the bug:

  • P2 initiates the snapshot so it records its own state (the green ellipse around event E), then immediately sends out a marker message to P1
  • P1 receives the marker message and immediately records its own state (the green ellipse around events A, B, C, and D) and then sends out its marker message
  • After P2 sends out its marker message, its snapshot is complete, and it continues processing events in the normal way — resulting in event F sending out an application message to P1.

The bug is created by the fact that this diagram shows a FIFO anomaly created when the application message from P2 event F overtakes the snapshot marker message.

As a result, P1 event D is recorded in P1's snapshot, but the event that caused it (P2 event F) is missing from P2's snapshot. Thus, our snapshot is not a consistent cut.

Remember that for a cut to be consistent, it must contain all events that led up to a certain point in time. So, the inclusion of event D in P1's snapshot is the problem because even D is the result of delivering a message from the future.

This is an example of a situation in which a FIFO anomaly (out of order message delivery) leads to a causal anomaly (an inconsistent cut).

Paxos: The Easy Parts

At the end of the last lecture, our discussion of the Paxos Algorithm got us up to here:

Paxos Consensus Reached

This was a very simple run of Paxos involving:

  • One proposer,
  • Three acceptors, and
  • Two learners

In this example, the proposer P sent out prepare messages to a majority of the acceptors, which in this case, was two out of three; however, it would be been equally valid for P to have sent prepare messages to all the acceptors. In fact, doing so would be quite smart because it mitigates against message loss, because on balance, even if one message is lost, you have still communicated with the majority of acceptors.

The same idea applies when the proposer listens for promise messages coming back from the acceptors. It only needs to hear from a majority of the acceptors before it can be happy. Exactly who those acceptors are is not important, and if it does hear back from all the acceptors then that's great, but it’s not a requirement. It just needs to hear from a majority.

So, when we speak of a majority, we are speaking of at least the minimum majority. For instance, if there are five acceptors, then the minimum majority is three: but if we hear back either from four, or even all five, then this is not a problem. The issue is that we must hear back from at least the minimum number of acceptors required to form a majority.

There are other subtleties involved in this algorithm that we will now go through, including what happens when there is more than one proposer.

Milestones in the Paxos Algorithm

One thing that was mentioned in the previous lecture was that three specific milestones are reached during a run of the Paxos algorithm. These are:

  1. When the proposer receives promise(n) messages from a majority of acceptors.

    Paxos Milestone 1

    A majority of acceptors have all promised to respond to the agreed proposal number n; and by implication, they have also promised to ignore any request with a proposal number lower than n.

  2. When a majority of acceptors all issue accepted(n,val) messages for proposal number n and some value val.

    Paxos Milestone 2

    Now, even though the other processes participating in the Paxos algorithm do not yet realise it, consensus has in fact been reached.

  3. When the proposer(s) and learners receive accepted(n,val) messages from a majority of the acceptors.

    Paxos Milestone 3

    It is only now that the proposer(s) and the learners realise that consensus has already been reached

Paxos: The Full Algorithm (Mostly)

A run of the Paxos algorithm involves the following sequence of message exchanges - primarily between the proposer and acceptors:

  1. The Proposer
    Sends out propose(n) messages to at least the minimum number of acceptors needed to form a majority. The proposal number n must be:

    • Unique
    • Higher than any previous proposal number used by this proposer

    It’s important to understand that the proposal number rules are applied to proposers individually. Consequently, if there are multiple proposers in the system, there does not need to be any agreement between proposers about what the next proposal number should be.

  2. The Acceptor
    When the acceptor receives a prepare(n) message, it asks itself "Have I already agreed to ignore proposals with this proposal number?". If the answer is yes, then the message is simply ignored; but if not, it replies to the proposer with a promise(n) message.

    By returning a promise(n) message, the acceptor has now committed to ignore all messages with a proposal number smaller than n.

  3. The Proposer
    When the proposer has received promise messages from a majority of acceptors for a particular proposal number n, it sends an accept(n,val) message to a majority of acceptors containing both the agreed proposal number n, and the value val that it wishes to propose.

  4. The Acceptor
    When an acceptor receives an accept(n,val) message, it asks the same question as before: "Have I already agreed to ignore messages with this proposal number?". If yes, it ignores the message; but if no, it replies with an accepted(n,val) both back to the proposer and broadcasts this acceptance to all the learners.

Up till now, we have assumed that there is only one proposer — but next, we must examine what happens if there are multiple proposers.

What Happens If There Is More Than One Proposer?

In this scenario, we will make two changes. We will run the Paxos algorithm with two proposers, and for visual clarity, since learners do not actually take part in the steps needed to reach consensus, we will omit them from the diagram.

Let's say we have two proposers P1 and P2 and as before, three acceptors. (We also have two learners, but we'll ignore them for the time being.)

Remember we previously stated that in situations where there are multiple proposers, these proposers must have already agreed on how they will ensure the uniqueness of their own proposal numbers. So, in this case, we will assume that:

  • Proposer P1 uses odd proposal numbers, and
  • Proposer P2 uses even proposal numbers

So, proposer P1 sends out a prepare(5) message to a majority of the acceptors. This is the first proposal number these acceptors have seen during this run of the protocol, so they are all happy to accept it and respond with promise(5) messages.

Proposer P1 is seeking consensus for value 1, so it now sends out accept(5,1) messages and the majority of acceptors respond with accepted(5,1)

Multiple Proposers 1

Ok, that's fine; we seem to have agreed on value 1.

Meanwhile, back in Gotham City, proposer P2 has no idea what's been going on, and decides to send out a prepare(4) message to all the acceptors...

Multiple Proposers 2

The prepare(4) message arrives at acceptors A1 and A2 after they have already agreed on proposal number 5. Since they are now ignoring proposal numbers less than 5, they simply ignore this message.

Acceptor A3 however has not seen proposal number 4 before, so it happily agrees to it and sends back a promise(4) message to proposer P2.

Proposer P2 is now left hanging.

It sent out prepare messages to all the acceptors but has only heard back from a minority of them. The rest have simply not answered, and given the way asynchronous communication works, P2 cannot know why it has not heard back from the other acceptors. They could have crashed, or they might be running slowly, or, as it turns out, the other acceptors have already agreed to P1's proposal and are now having his babies...

So, all P2 can do is wait for its timeout period, and if it doesn't hear back within that time, it concludes that proposal number 4 was a bad idea and tries again. This time, P2 shows up in a faster car (proposal number 6).

Multiple Proposers 3

But wait a minute, consensus (milestone 2) has already been reached, so the acceptors now have a problem because:

  • Acceptors cannot go back on their majority decision
  • Acceptors cannot ignore prepare messages with a higher proposal number

So, here's where we must address one of the subtleties that we previously glossed over.

Previously, we stated only that if an acceptor receives a prepare message with a lower proposal number, it should simply ignore it. Well, OK, that's fine.

But what about the case where we receive a proposal number that is higher than the last one? Here is where we need to further qualify how that prepare message should be handled.

In this case, each acceptor must consider the following situation:

"I've already promised to respond to proposal number n,
but now I'm being asked to promise to respond to proposal number n+1"

How the acceptor reacts now depends on what has happened in between receiving the prepare(n) message and the prepare(n+1) message.

Either way, the acceptor cannot ignore the higher proposal number; so it needs to send out some sort of promise message. However, but this time, the acceptor must consider whether it has already accepted a value based on some earlier, lower proposal number.

  • If no, then we accept the new proposal number with a promise(n+1) message as normal
  • If yes, then we accept the new proposal number with a promise(n+1, ...) message, but in addition, we are obligated to tell the new proposer that we've already agreed to go on a date with a proposer using a lower proposal number.

In the latter case, you can see that the promise message needs to carry some extra information.

In the above example, acceptor A1 has already agreed with proposer P1 that, using proposal number 5, the value should be 1; but now proposer P2 comes along and presents proposal number 6 to all the acceptors.

Multiple Proposers 4

So, in this specific situation, acceptor A3 responds simply with promise(6) because although it previously agreed to proposal number 4, nothing came of that, and it has not previously accepted any earlier value.

Acceptors A1 and A2 however, must respond with the message promise(6,(5,1)).

This extra information in the promise message effectively means: "Ok, I'll move with you to proposal number 6 but understand this: using proposal number 5, I've already accepted value 1".

So, What Should A Proposer Do with Such a Message?

Previously, we said that when a proposer receives sufficient promise(n) messages, it will then send out accept(n,val) messages. But here's where our description of the protocol needs to be refined. What should the proposer do if, instead of receiving a promise(n) message, it receives a promise(n,(nold,valold)) message?

In our example, proposer P2 has received three promise messages:

  • A straight-forward promise(6) from A3, and
  • Two promise(6,(5,1)) messages from A1 and A2

Proposer P2 must now take into account that using proposal number 5, consensus has already been reached on value 1.

In this case, both promise messages contain the value 1 that was agreed upon using proposal number 5; however, it is perfectly possible that P2 could receive multiple promise messages containing values agreed on by proposal numbers older than 5.

So, the rule is this: proposer P2 must look at all the older, already agreed upon values, and chose the value corresponding to the most recent, old proposal number.

This is pretty ironic (and amusing) really because proposer P2 now has no choice over what value to propose. It is constrained to propose the one value upon which consensus has most recently been reached! So, the fact that it wants to send out its own proposal is somewhat redundant, because the only value it can propose is one upon which consensus has already been agreed...

So, now we must revise rule 3 given above. Previously we stated:

When the proposer has received promise messages from a majority of acceptors for a particular proposal number n, it sends an accept(n,val) message to a majority of acceptors containing both the agreed proposal number n, and the value val that it wishes to propose.

But now we understand that the proposer does not have complete liberty to send out the value it wishes to propose; instead, it must first consider:

  • If I have received any promise messages containing old agreed values, then I am obligated to propose the most recently agreed value
  • If I have received only simple promise(n) messages, then I am free to propose any value I like

So now, P2 can only send out the message accept(6,1).

Multiple Proposers 5

Notice that P2 has not had to use the earlier proposal number 5, but it was constrained to propose the value 1, because this value has already been agreed upon.

So, what do the acceptors do now? They simply invoke rule 4 above and respond with accepted(6,1).

Multiple Proposers 6

Let's isolate the messages that were exchanged between proposer P2 and acceptor A3.

Multiple Proposers 7

A3 only sees the following exchange of messages.

  • P2 first tried proposal number 4, but nothing came of that
  • P2 tried again with proposal number 6
  • A3 went with the highest proposal number (6) and subsequently agreed to accept value 1

As far as A3 is concerned, it thinks that value 1 was P2's idea. It has no clue that P2 was proposing a value already agreed upon by others.


Previous Next
Lecture 14 Lecture 16