Lecture Given by Lindsey Kuper on April 13th, 2020 via YouTube
Previous | Next |
---|---|
Lecture 6 | Lecture 8 |
Here's the classic causal anomaly again.
The problem here is that the message Carol
receives first is also delivered first.
The anomaly occurs because of out of order delivery, not out of order receipt.
The use of vector clocks helps eliminate the problem of causal anomalies because by comparing the vector clock value on the incoming message with its own vector clock, the receiver can decide whether or not the message should be delivered.
In this case, when Bob
receives Alice
's message, its fine for him to deliver it because Bob
's vector clock of [0,0,0]
differs only by one from the vector clock of the incoming message, and this difference is in Alice
's position.
Therefore, since it came from Alice
, it is correct to conclude that this message is the next event in both Alice
and Bob
's causal history.
Bob
then sends out his rude broadcast message in reply to Alice
, but Carol
receives this reply before the original message from Alice
.
By comparing the vector clock value on Bob
's message with her own vector clock, Carol
can determine that this message should be queued, and not delivered yet.
Carol
does this by noticing that Bob
's vector clock position is one greater than her vector clock position for him.
Since this message came from Bob
, this difference is expected.
Ok, so far so good.
However, the message vector clock also has a 1
in Alice
's position — and this is not expected.
Remember that in this scenario, our vector clocks only count message-send events; so when Carol
sees a 1
in Alice
's position, this means that some message send event has taken place in Alice
that Carol
does not yet know about (probably because the messages have been received out of order).
Hence, Carol
can correctly infer that this is a message from the future and should therefore be queued.
When Alice
's delayed message finally arrives at Carol
, this message carries a vector clock value of [1,0,0]
which is what Carol
expects, so this message can be correctly delivered.
Now Carol
examines her message queue and discovers that the message with vector clock [1,1,0]
can now be delivered, because all the preceding messages in her causal history have been delivered in the correct order.
-
Before sending a message to process
P2
, processP1
records this send event by incrementing its own position in its local vector clock. The updated vector clock is then sent with the message as metadata. -
At such time as
P2
deliversP1
's message,P2
increments theP1
clock value in its own local vector clock (I.E. it records the delivery ofP1
's message) -
Process
P2
should only deliverP1
's message if the clock value received with that message conforms to the following two rules:-
The clock value for process
P1
in the message must be exactly one bigger than the clock value for processP1
in the receiving processP2
. Or written more algebraically:VCmsg[P1] = VCP2[P1] + 1
and
-
The message clock values for all other positions must be less than or equal to the corresponding positions in
P2
's vector clock. Or, for all positionsk
in the vector clock where(k ≠ P1)
:VCmsg[Pk] ≤ VCP2[Pk]
-
There are two important things to remember here:
- We are only incrementing vector clock values on message-send events, not message-receive events.
The value in each vector clock position is simply a count of the number of messages that process has sent so far. - These rules only apply to broadcast messages, not point-to-point messages!
This means that every process in the system will (eventually) receive every message sent by every other process.
Rule 1: VCmsg[P1] = VCP2[P1] + 1
Knowing this, we can understand the above rule to mean that in order to avoid creating a causal anomaly (I.E. by delivering a message out of order), the receiver's local clock value for the sender must be exactly one smaller than the sender's clock value received in the message.
Rule 2: VCmsg[Pk] ≤ VCP2[Pk], (k ≠ P1)
The second rule means that the number of message-sends performed by all the other processes in the system (I.E. the vector clock values) must be no bigger than the values recorded in the receiver's local clock. In other words, the receiver keeps a complete record of all broadcast messages sent in the system; no message-send events are missing.
This is the rule that would be violated if Carol
tried to deliver Bob
's rude "Up yours!" message from the future at the time it was received.
The vector clock on the message sent from Bob
to Carol
is [1,1,0]
, but Carol
's vector clock is [0,0,0]
.
The 1
in Bob
's position is correct because he is sending the message, and it is one greater than Carol
's local value, but the 1
in Alice
's position is a problem.
According to this, Alice
has sent out a broadcast message, but Carol
has no record of it — yet.
Since 1
is not ≤ 0
, Carol
correctly concludes that this newly arrived message has been received out of order and therefore must be queued until such time as we receive the delayed message from Alice
.
Here. Alice
sends a broadcast message to Bob
and Carol
saying "I lost my wallet :-("
In the absence of any other messages, its fine for both Bob
and Carol
to deliver this message.
A little later, Alice
finds her wallet and tells Bob
and Carol
about this happy event.
However, Alice
's message to Carol
gets delayed.
It’s fine for Bob
to deliver this message because it came from Alice
and is the next expected message from her.
Bob
is very happy that Alice
has found her wallet and sends out a broadcast message to both Alice
and Carol
saying how pleased he is.
Unfortunately, Alice
's original message doesn't reach Carol
until after Bob
's response arrives.
Fortunately, Carol
recognises that Bob
's response is a message from the future, and places it in her message queue.
Otherwise, she would think that Bob
is a complete jerk because as far as Carol
is concerned, Alice
has just lost her wallet and Bob
seems to be really happy about this!
Only after Carol
delivers the preceding message, does she then deliver the message that arrived too early.
Remember what a total order anomaly looks like:
If we assign vector clocks to these send-events and then apply the rules of causal broadcast, we will discover that causal delivery cannot rule out total-order anomalies.
This is because, if we look back at the hierarchy of delivery guarantees, we can see that Total Order exists on its own branch from Casual and FIFO.
So how can we rule out causal anomalies and, at the same time, ensure total order?
Well, if we want to maintain both causal order and total order, then we will need something more than vector clocks and the causal broadcast algorithm we have just described.
In general, it’s pretty annoying to have to enforce total order, so it’s much easier not to enforce it unless you really have to.
The happens before relation is an example of potential causality.
We have already spoken of two of them:
- Ordering of events can be determined after the fact by assigning vector clocks (useful for debugging)
- Causal ordering of events as they happen (E.G. Using vector clocks to achieve causal broadcast)
These techniques provide ordering guarantees that prevent causal anomalies.
Another thing we have not mentioned yet is something called Consistent Global Snapshot. This is related to the first point above and is a way to obtain a picture of the global state of a distributed system. However, this is far from trivial to implement because not only does every process in a system have its own state, every process also has its own idea of the state of every other process.
One thing we can say is that:
If
A->B
andB
is in the snapshot, then we should also expect to findA
in the snapshot
So, the Consistent Global Snapshot is another important use of the happens before relation (or potential causality)
As has already been pointed out, in a distributed system, there really is no such things as an "observable" global state. Each process has its own state that represents the sum total of activity in that process up until some particular point. This includes the state of the process' internal memory.
So, we could lasso all the events and internal variables of a process and call that the state...
But what about the state of all the other processes in the system?
One approach might be to use a global clock and inform every process that at a certain time of day (say 09:20
), they must all take a snapshot of themselves.
However, as we have already seen in lecture 2, this approach won't work reliably — not because a process can't take a selfie (so to speak), but because synchronising clocks between computers is a notoriously difficult task.
Uh oh! Now we have a problem. Remember what we said above:
If
A->B
andB
is in the snapshot, then we should also expect to findA
in the snapshot
From the diagram, we can see that the message-send event E
in P1
causes event W
in P2
.
Therefore E(P1) -> W(P2)
But since P2
's clock is running slightly slower than P1
's, the {p1:snapshot}
message arrives at P2
(causing event W
) just before P2
's clock ticks over to 09:20
; therefore, as far as P2
is concerned, event W
should be included in the snapshot.
Now when we compare the data in P1
's snapshot with the data in P2
's snapshot, we are unable to explain the cause of event W
— it just pops up out of nowhere because there's a gap in its causal history.
And all of this happens in spite of both processes thinking they took their snapshots at 09:20
.
As we can see, using a machine's time-of-day clock is an error-prone approach to taking consistent snapshots.
So, we need an algorithm that allows us to snapshot an entire system consistently.
We now introduce some new terminology: Channels.
A channel is simply a unidirectional communication path between two processes.
As an aside, the success of the Chandy-Lamport algorithm relies entirely on the truth of the following assumptions:
- A message is guaranteed to arrive at the recipient — eventually
- All channels act as FIFO queues, thus making it impossible for two messages to arrive out of order (I.E. we can exclude the possibility of FIFO anomalies)
- Processes don't crash! (The topic of process failure is dealt with later in lecture 10)
Between any two processes P1
and P2
, two channels exist.
Communication from P1
to P2
passes along channel C12
, and communication in the other direction passes along channel C21
P1
sends a message to P2
at event A
over channel C12
. This is received by P2
at event B
.
P2
then sends a message back to P1
at event C
over channel C21
, but since this message has not yet arrived at P1
, it is said to be in the channel.
In the diagram above, channel C12
is said to be empty because there are no messages currently in flight, and channel C21
contains one message.
In this case, process P1
sends the first message, and therefore we will call it the initiator process.
In the above diagram
- Process
P1
sends a message out at eventA
on its only channelC12
P1
then takes a snapshot of itself (S1
) and immediately sends out a special message called a marker on all its channels. Sending a marker message is actually part of the snapshot algorithm itself and must be the first thing done after a process records its own state. The marker messages themselves do not form part of the snapshot.- Process
P1
then starts to record all the messages it receives on all of its incoming channels
So, what happens when a process receives a marker message?
There are two cases:
-
If this is the first marker this process has seen:
- It takes a snapshot of itself
- It marks the channel on which it received the marker message as empty
- It sends a marker message out on all of its outgoing channels
-
If this process has already seen a marker message before:
- It stops recording all incoming messages on that channel
- It sets that channel's state to be the sum of the events that arrived whilst recording was active
So when P2
receives the marker message from P1
, it immediately records its state (S2
), marks channel C12
as empty, and sends a marker out on all its channels (in this case, P2
only has one channel)
What does P1
do when it sees P2
's marker message?
Well, has P1
seen a marker message before?
Yes, it has. By sending out the first marker message, P1
is said to have seen a marker message.
So now P1
stops recording on the incoming channel on which the marker message arrived.
Did P1
record anything on this channel?
Let's say that it did.
Let's say that the message sent from P2
at event C
actually arrived at P1
as event D
.
So, event D
is now a recorded event and the snapshot is complete:
- We have recorded the state of
P1
inS1
- We have recorded the state of
P2
inS2
- We have recorded the incoming message event
D
inP1
So, the above snapshot forms a complete history of the events that occurred in these two processes; but what about the following two snapshots - do they make sense?
But what about this? Is this a legal snapshot?
Yes, this is perfectly valid - it was just taken prior to event C
in P2
happening.
A snapshot of the entire system is derived by a separate process that goes around collecting all the individual process snapshots and stitching them together to form of overall global snapshot.
The Chandy-Lamport algorithm actually predates the invention of vector clocks. It has been designed to ensure that the entire event history of each process in the system is recorded without any violation of the happens before relation.
Previous | Next |
---|---|
Lecture 6 | Lecture 8 |