
Commit 1e5aee2

Improve: Ensure uniform membership config during config changes
This commit improves the handling of membership configuration changes to ensure that the system reliably transitions to a uniform configuration in the second step of a joint membership change.

### Problem:

During a membership config change, there are two steps:

1. Transition to a joint configuration containing both `Cold` (current config) and `Cnew` (new config).
2. Transition to a uniform configuration containing only `Cnew`.

Previously, the second step attempted to apply the same change on the current membership state. If multiple membership change requests were processed in parallel, this approach could result in the system being left in a joint configuration.

For example:

- Initial config: `{a, b, c}`.
- Task 1: `AddVoterIds(x)`. After the first step: `[{a, b, c}, {a, b, c, x}]`.
- Task 2: `RemoveVoters(x)`. After the first step: `[{a, b, c, x}, {a, b, c}]` (applied on the last config `{a, b, c, x}`).
- Task 1 proceeds to the second step, re-applies `AddVoterIds(x)`, and the result is `[{a, b, c}, {a, b, c, x}]`.
- The system remains in a joint configuration, which is unintuitive and contradicts standard Raft expectations.

### Solution:

The second step now applies an **empty change request** (`AddVoterIds({})`) to the last configuration in the current joint config. This ensures that the system always transitions to a uniform configuration in the second step.

### Impact:

- No behavior changes occur if only one membership change request is in progress.
- If multiple requests are processed concurrently, the application must still verify the result, and the new behavior ensures the system transitions to a uniform state.

This fix prevents the system from being left in an unintended joint configuration, improving consistency and adherence to Raft principles.

Thanks to @tvsfx for providing feedback on this issue and offering a detailed explanation of the solution.
1 parent 30f1ae6 commit 1e5aee2


4 files changed: +64 -1 lines changed

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# Joint Consensus in OpenRaft

## Introduction

Joint consensus is a mechanism in the Raft protocol that allows for safe changes to cluster membership. When membership changes occur (such as adding or removing nodes), the cluster transitions through a joint configuration phase where both old and new configurations are considered valid. This ensures that the cluster maintains availability and safety during transitions.

## Membership Change Process

OpenRaft implements membership changes using a two-phase joint consensus approach:

1. **First Phase:** Apply a membership change request (e.g., `AddVoters({x})`) to the current configuration, forming a joint configuration that contains both the old and new configurations. This joint configuration is then committed.

2. **Second Phase:** Apply an **empty change request** (`AddVoterIds({})`) to the last configuration in the current joint config. This transitions the cluster from joint configuration to a uniform configuration and is then committed.
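
The two phases can be traced with a small, self-contained Rust model. This is only a sketch under simplifying assumptions: `Config`, `Change`, `apply_to`, `change_membership`, and `set` are hypothetical names used for illustration, not OpenRaft's types or API. The rule that a proposal equal to the last config yields a uniform membership is also an assumption, chosen because it reproduces the examples in this document.

```rust
use std::collections::BTreeSet;

/// A single voter configuration, e.g. `{a, b, c}` (hypothetical toy type).
type Config = BTreeSet<&'static str>;

/// Toy change requests, named after the ones mentioned in this document.
#[derive(Clone, Debug)]
enum Change {
    AddVoterIds(Config),
    RemoveVoters(Config),
}

/// Convenience constructor for a config.
fn set(ids: &[&'static str]) -> Config {
    ids.iter().copied().collect()
}

/// Apply a change request to one configuration.
fn apply_to(change: &Change, base: &Config) -> Config {
    match change {
        Change::AddVoterIds(ids) => base.union(ids).cloned().collect(),
        Change::RemoveVoters(ids) => base.difference(ids).cloned().collect(),
    }
}

/// Assumed transition rule: the change is applied to the LAST config of the
/// current membership. If nothing changes, the result is uniform; otherwise
/// it is the joint membership `[last, goal]`.
fn change_membership(current: &[Config], change: &Change) -> Vec<Config> {
    let last = current.last().expect("membership has at least one config");
    let goal = apply_to(change, last);
    if &goal == last {
        vec![goal]
    } else {
        vec![last.clone(), goal]
    }
}
```

Under this model, starting from `[{a, b, c}]`, the first phase with `AddVoterIds({x})` yields the joint membership `[{a, b, c}, {a, b, c, x}]`, and the second phase with an empty `AddVoterIds({})` yields the uniform membership `[{a, b, c, x}]`.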

## Problem in Previous Versions of OpenRaft (prior to 2025-04-03)

In earlier implementations, the second phase of a membership change would reapply the original change operation to the current membership state. When multiple membership change requests were processed concurrently, this approach could leave the system in a joint configuration state rather than transitioning to a uniform configuration. Consider this example:

- Initial configuration: `{a, b, c}`
- Task 1: `AddVoterIds(x)`
  - After first phase: `[{a, b, c}, {a, b, c, x}]`
- Task 2: `RemoveVoters(x)` (runs concurrently)
  - After first phase: `[{a, b, c, x}, {a, b, c}]` (applied to the last configuration `{a, b, c, x}`)
- Task 1 proceeds to second phase, reapplies `AddVoterIds(x)` to the current state
  - Result: `[{a, b, c}, {a, b, c, x}]` (still a joint configuration)
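
The interleaving above can be replayed in a few lines of Rust. This is a sketch only: `propose`, `replay`, `cfg`, `add_x`, `remove_x`, and `noop` are hypothetical helpers, and the transition rule inside `propose` is an assumption chosen to match the configurations listed in the example, not OpenRaft's internal algorithm.

```rust
use std::collections::BTreeSet;

/// One voter configuration; node ids are single characters in this toy model.
type Config = BTreeSet<char>;

/// Assumed transition rule: apply `change` to the last config. If the goal
/// equals the last config the result is uniform, otherwise it is `[last, goal]`.
fn propose(current: &[Config], change: impl Fn(&Config) -> Config) -> Vec<Config> {
    let last = current.last().expect("membership has at least one config");
    let goal = change(last);
    if goal == *last {
        vec![goal]
    } else {
        vec![last.clone(), goal]
    }
}

/// Build a config from a string of node ids, e.g. `cfg("abc")` is `{a, b, c}`.
fn cfg(ids: &str) -> Config {
    ids.chars().collect()
}

/// Task 1's request: add voter `x`.
fn add_x(c: &Config) -> Config {
    let mut c = c.clone();
    c.insert('x');
    c
}

/// Task 2's request: remove voter `x`.
fn remove_x(c: &Config) -> Config {
    let mut c = c.clone();
    c.remove(&'x');
    c
}

/// The empty change request: leaves the config as it is.
fn noop(c: &Config) -> Config {
    c.clone()
}

/// Replay the interleaving; `second_step` is what Task 1 proposes in its second phase.
fn replay(second_step: impl Fn(&Config) -> Config) -> Vec<Config> {
    let initial = vec![cfg("abc")];
    let after_task1_first = propose(&initial, add_x);
    let after_task2_first = propose(&after_task1_first, remove_x);
    propose(&after_task2_first, second_step)
}
```

In this model, `replay(add_x)` (reapplying the original change) ends in the joint membership `[{a, b, c}, {a, b, c, x}]`, while `replay(noop)` (an empty change) ends in the uniform membership `[{a, b, c}]`. Note that in the second case Task 1's addition of `x` is not part of the final configuration, which is why the caller still has to check the outcome.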

This behavior was problematic because:
1. The system remained in a joint configuration state indefinitely
2. This contradicted the standard Raft expectation that membership changes should eventually result in a uniform configuration
3. It created confusion for users who expected membership changes to be fully applied

## Solution

The second step now applies an **empty change request** (`AddVoterIds({})`) to the last configuration in the current joint config. This ensures that the system always transitions to a uniform configuration in the second step, regardless of concurrent membership operations.

## Impact

- **Single Change Request:** No behavior changes occur if only one membership change request is in progress.
- **Concurrent Requests:** If multiple requests are processed concurrently, the application must still verify the result, but the new behavior ensures the system always transitions to a uniform state.
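
What "verify the result" means depends on the application. As one possible illustration, the sketch below checks that the membership is uniform and equals the voter set the request was aiming for. `verify_membership` is a hypothetical helper, and representing the membership as a slice of voter-id sets (one entry when uniform, two when joint) is an assumption of this example rather than a statement about OpenRaft's API.

```rust
use std::collections::BTreeSet;

/// Hypothetical check an application might run after its change request returns.
/// `configs` is the observed membership; `expected` is the intended voter set.
fn verify_membership(
    configs: &[BTreeSet<u64>],
    expected: &BTreeSet<u64>,
) -> Result<(), String> {
    match configs {
        // Uniform and equal to the intended voter set: the change took effect.
        [only] if only == expected => Ok(()),
        // Uniform but different: a concurrent request changed the outcome.
        [only] => Err(format!(
            "uniform config {:?} differs from expected {:?}",
            only, expected
        )),
        // More than one config (joint) or none at all.
        other => Err(format!("membership is not uniform: {:?}", other)),
    }
}
```

An application could retry or report an error when this check fails; which reaction is appropriate is left to the caller.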

## Acknowledgments

Thanks to @tvsfx for providing feedback on this issue and offering a detailed explanation of the solution.

openraft/src/docs/cluster_control/mod.rs

Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,10 @@ pub mod dynamic_membership {
     #![doc = include_str!("dynamic-membership.md")]
 }

+pub mod joint_consensus {
+    #![doc = include_str!("joint-consensus.md")]
+}
+
 pub mod node_lifecycle {
     #![doc = include_str!("node-lifecycle.md")]
 }

openraft/src/docs/docs.md

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ To maintain an Openraft cluster, e.g., add or remove nodes, refer to
 - [`cluster_control`](crate::docs::cluster_control) :
   - [`cluster_formation`](`crate::docs::cluster_control::cluster_formation`) describes how to form a cluster;
   - [`dynamic membership`](`crate::docs::cluster_control::dynamic_membership`) describes how to add or remove nodes without downtime;
+  - [`joint_consensus`](`crate::docs::cluster_control::joint_consensus`) describes detail of joint consensus implementation;
   - [`node lifecycle`](`crate::docs::cluster_control::node_lifecycle`) describes the transition of a node's state;

 When upgrading an Openraft application, consult:

openraft/src/raft/impl_raft_blocking_write.rs

Lines changed: 16 additions & 1 deletion
@@ -31,6 +31,9 @@ where C: RaftTypeConfig<Responder = OneshotResponder<C>>
     /// - It proposes a **joint** config.
     /// - When the **joint** config is committed, it proposes a uniform config.
     ///
+    /// Read more about the behavior of [joint
+    /// consensus](crate::docs::cluster_control::joint_consensus).
+    ///
     /// If `retain` is `true`, then all the members which not exists in the new membership,
     /// will be turned into learners, otherwise will be removed.
     /// If `retain` is `false`, the removed voter will be removed from the cluster.
@@ -96,7 +99,19 @@ where C: RaftTypeConfig<Responder = OneshotResponder<C>>

         let (tx, rx) = oneshot_channel::<C>();

-        let res = self.inner.call_core(RaftMsg::ChangeMembership { changes, retain, tx }, rx).await;
+        // The second step, send a NOOP change to flatten the joint config.
+
+        let res = self
+            .inner
+            .call_core(
+                RaftMsg::ChangeMembership {
+                    changes: ChangeMembers::AddVoterIds(Default::default()),
+                    retain,
+                    tx,
+                },
+                rx,
+            )
+            .await;

         if let Err(e) = &res {
             tracing::error!("the second step error: {}", e);
