Fault Tolerance #1

Open
@fauh45

Description

I know the code doesn't mention fault tolerance and isn't really designed for it. Still, I think there's a problem if it's going to be used on a cluster, where fault tolerance might be a requirement.

Consider a sequence of calls like this:
node 1 (count 1) --> check for count --> not equal --> exit
node 2 (count 2) --> check for count --> not equal --> exit
node 3 (count 3) --> (crash)

Note: time is represented by horizontal position; further to the right means later in time.

In this case no node ever calls the callback, which leaves the process stuck.
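
To make the scenario concrete, here is a minimal single-process sketch in Python. It assumes the logic is roughly "increment a shared counter, then run the callback only if the count equals the expected total", which is my reading of the flow above and not a copy of the actual code. `SharedCounter`, `node_step`, `run_scenario`, and `EXPECTED` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

EXPECTED = 3  # hypothetical: number of nodes that must report in


@dataclass
class SharedCounter:
    """Stand-in for the shared store (hypothetical, single process only)."""
    value: int = 0

    def increment(self) -> int:
        self.value += 1
        return self.value


def node_step(counter, callback, crash_before_check=False):
    """Increment, then compare. Returns True if this node ran the callback."""
    count = counter.increment()
    if crash_before_check:
        # Simulated crash: the increment is recorded, the check never runs.
        return False
    if count == EXPECTED:
        callback()
        return True
    return False


def run_scenario(crash_last):
    """Run nodes 1..3 in order; optionally crash node 3 after its increment."""
    counter = SharedCounter()
    fired = []
    for node_id in (1, 2, 3):
        node_step(
            counter,
            lambda: fired.append("done"),
            crash_before_check=(crash_last and node_id == 3),
        )
    return fired, counter.value
```

With `run_scenario(False)` the third node sees the expected count and the callback runs once. With `run_scenario(True)` the counter still reaches 3, but nothing is left to notice it, so the callback never runs.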

I'm still thinking about a solution for this. Do you have any ideas on how to make it more fault tolerant?
