Fault Tolerance #1

I know the code doesn't mention fault tolerance and isn't really designed for it. Still, I think there's a problem if it's going to be used on a cluster, where fault tolerance might be a requirement.

Suppose the calls happen in this order:
node 1 (count 1) --> check for count --> not equal --> exit
     node 2 (count 2) --> check for count --> not equal --> exit
         node 3 (count 3) --> (crash)

Note: time is shown by horizontal position; further to the right means later in time.

In this case node 3 is the only one that could have reached the expected count, and it crashes before it does the check. No node ever calls the callback, which leaves the process stuck.
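To make the scenario concrete, here is a minimal single-process sketch of it. It is not the project's actual code: `CountBarrier`, `arrive`, and `run_scenario` are hypothetical names, the expected count of 3 is assumed from the diagram, and a crash is simulated by raising an exception right after the increment.

```python
from typing import Callable, List


class CountBarrier:
    """Hypothetical stand-in for the shared counter (not the real API)."""

    def __init__(self, expected: int, callback: Callable[[], None]) -> None:
        self.expected = expected
        self.callback = callback
        self.count = 0

    def arrive(self, crash: bool = False) -> bool:
        """Increment the count, then fire the callback if it is the last arrival."""
        self.count += 1
        if crash:
            # Simulated crash: the count is already incremented,
            # but the equality check below never runs.
            raise RuntimeError("node crashed before checking the count")
        if self.count == self.expected:
            self.callback()
            return True
        return False  # "not equal --> exit"


def run_scenario(crash_last: bool) -> bool:
    """Run three nodes in order; return True if the callback fired."""
    fired: List[bool] = []
    barrier = CountBarrier(expected=3, callback=lambda: fired.append(True))
    for node in (1, 2, 3):
        try:
            barrier.arrive(crash=(crash_last and node == 3))
        except RuntimeError:
            pass  # the crashed node simply disappears
    return bool(fired)


if __name__ == "__main__":
    print("callback fired without crash:", run_scenario(crash_last=False))
    print("callback fired with crash:   ", run_scenario(crash_last=True))
```

In the crash case the counter does reach 3, but nothing is left running that would notice, so anything waiting on the callback waits forever.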

I'm still thinking about a solution for this problem. Do you have any ideas on how to make it more fault tolerant?
