There is a design error in the Mencius message-processing logic:
```go
select {
case propose := <-r.ProposeChan:
	//got a Propose from a client
	dlog.Printf("Proposal with id %d\n", propose.CommandId)
	r.handlePropose(propose)
	break
case skipS := <-r.skipChan:
	skip := skipS.(*menciusproto.Skip)
	//got a Skip from another replica
	dlog.Printf("Skip for instances %d-%d\n", skip.StartInstance, skip.EndInstance)
	r.handleSkip(skip)
case prepareS := <-r.prepareChan:
	prepare := prepareS.(*menciusproto.Prepare)
	//got a Prepare message
	dlog.Printf("Received Prepare from replica %d, for instance %d\n", prepare.LeaderId, prepare.Instance)
	r.handlePrepare(prepare)
	break
case acceptS := <-r.acceptChan:
	accept := acceptS.(*menciusproto.Accept)
	//got an Accept message
	dlog.Printf("Received Accept from replica %d, for instance %d\n", accept.LeaderId, accept.Instance)
	r.handleAccept(accept)
	break
case commitS := <-r.commitChan:
	commit := commitS.(*menciusproto.Commit)
	//got a Commit message
	dlog.Printf("Received Commit from replica %d, for instance %d\n", commit.LeaderId, commit.Instance)
	r.handleCommit(commit)
	break
case prepareReplyS := <-r.prepareReplyChan:
	prepareReply := prepareReplyS.(*menciusproto.PrepareReply)
	//got a Prepare reply
	dlog.Printf("Received PrepareReply for instance %d\n", prepareReply.Instance)
	r.handlePrepareReply(prepareReply)
	break
case acceptReplyS := <-r.acceptReplyChan:
	acceptReply := acceptReplyS.(*menciusproto.AcceptReply)
	//got an Accept reply
	dlog.Printf("Received AcceptReply for instance %d\n", acceptReply.Instance)
	r.handleAcceptReply(acceptReply)
	break
```
In Mencius, the channels between nodes must be FIFO, and this implementation provides that correctly. However, when a message arrives from a node, it is pushed onto a channel specific to its message type, and the receiver then processes messages from those channels in non-FIFO order. The following example shows how this design breaks safety.
Assume there are three nodes: A, B, and C. Node A sends an Accept message and later a Propose message. B receives both messages in the order A sent them, but it then pushes them onto two separate channels. Another goroutine reads from all of these channels using the select statement shown above.
The protocol is violated if B processes the Propose message first, which this design allows. This is a problem in Mencius because each node derives piggybacked messages from the messages it receives, so messages must be processed in exactly the order in which the sender sent them.
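The reordering can be reproduced in isolation. The sketch below is not taken from the repository: the channel names and string payloads are hypothetical stand-ins for the per-type channels. It relies only on the documented behaviour of Go's `select`, which picks pseudo-randomly among the cases that are ready.

```go
package main

import "fmt"

// processedOrder places an "accept" and then a "propose" message into two
// separate buffered channels (hypothetical stand-ins for per-type channels),
// drains them with select, and returns the order in which they were handled.
func processedOrder() []string {
	acceptChan := make(chan string, 1)
	proposeChan := make(chan string, 1)

	acceptChan <- "accept"   // sent first
	proposeChan <- "propose" // sent second

	order := make([]string, 0, 2)
	for len(order) < 2 {
		// Both cases are ready on the first iteration, so select
		// chooses one of them pseudo-randomly.
		select {
		case m := <-acceptChan:
			order = append(order, m)
		case m := <-proposeChan:
			order = append(order, m)
		}
	}
	return order
}

// countReordered runs the experiment n times and counts how often the
// message that was sent second was handled first.
func countReordered(n int) int {
	reordered := 0
	for i := 0; i < n; i++ {
		if processedOrder()[0] == "propose" {
			reordered++
		}
	}
	return reordered
}

func main() {
	fmt.Printf("reordered in %d of 1000 runs\n", countReordered(1000))
}
```

Over many runs, the later message should be handled first in roughly half of them, even though it was always enqueued second.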
A fix would be to use a single channel for all types of replica messages, so that the receiver processes them in the order in which they arrived.
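A minimal sketch of funneling every replica message through one channel and dispatching on its concrete type is shown here. The `Accept` and `Propose` structs and the `drainInOrder` function are hypothetical stand-ins, not the repository's types or handlers. The sketch also assumes that each sender's connection is read by a single goroutine, so per-sender order is already preserved when messages enter the channel.

```go
package main

import "fmt"

// Hypothetical stand-ins for the real message types.
type Accept struct{ Instance int32 }
type Propose struct{ CommandId int32 }

// drainInOrder reads every message from one shared channel and dispatches
// on its concrete type, so handling order equals arrival order.
func drainInOrder(msgChan <-chan interface{}) []string {
	var handled []string
	for m := range msgChan {
		switch msg := m.(type) {
		case *Accept:
			handled = append(handled, fmt.Sprintf("accept %d", msg.Instance))
		case *Propose:
			handled = append(handled, fmt.Sprintf("propose %d", msg.CommandId))
		}
	}
	return handled
}

func main() {
	msgChan := make(chan interface{}, 2)
	msgChan <- &Accept{Instance: 3}   // sent first
	msgChan <- &Propose{CommandId: 7} // sent second
	close(msgChan)
	fmt.Println(drainInOrder(msgChan))
}
```

Because a Go channel is itself FIFO, the type switch sees the Accept before the Propose on every run, whichever message types are involved.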
Thanks