Commit 81d5080

Merge pull request onflow#5905 from onflow/alex/consensus-speedup-parameters
New consensus parameters for Crescendo
2 parents ae7024d + e11cd1a commit 81d5080

File tree

13 files changed

+231
-96
lines changed

cmd/consensus/main.go

Lines changed: 1 addition & 1 deletion
@@ -145,7 +145,7 @@ func main() {
145145
flags.DurationVar(&maxInterval, "max-interval", 90*time.Second, "the maximum amount of time between two blocks")
146146
flags.UintVar(&maxSealPerBlock, "max-seal-per-block", 100, "the maximum number of seals to be included in a block")
147147
flags.UintVar(&maxGuaranteePerBlock, "max-guarantee-per-block", 100, "the maximum number of collection guarantees to be included in a block")
148-
flags.DurationVar(&hotstuffMinTimeout, "hotstuff-min-timeout", 2500*time.Millisecond, "the lower timeout bound for the hotstuff pacemaker, this is also used as initial timeout")
148+
flags.DurationVar(&hotstuffMinTimeout, "hotstuff-min-timeout", 1045*time.Millisecond, "the lower timeout bound for the hotstuff pacemaker, this is also used as initial timeout")
149149
flags.Float64Var(&hotstuffTimeoutAdjustmentFactor, "hotstuff-timeout-adjustment-factor", timeout.DefaultConfig.TimeoutAdjustmentFactor, "adjustment of timeout duration in case of time out event")
150150
flags.Uint64Var(&hotstuffHappyPathMaxRoundFailures, "hotstuff-happy-path-max-round-failures", timeout.DefaultConfig.HappyPathMaxRoundFailures, "number of failed rounds before first timeout increase")
151151
flags.DurationVar(&cruiseCtlFallbackProposalDurationFlag, "cruise-ctl-fallback-proposal-duration", cruiseCtlConfig.FallbackProposalDelay.Load(), "the proposal duration value to use when the controller is disabled, or in epoch fallback mode. In those modes, this value has the same as the old `--block-rate-delay`")

consensus/hotstuff/cruisectl/README.md

Lines changed: 54 additions & 37 deletions
@@ -37,7 +37,6 @@ The process variable is the variable which:
3737
---
3838
👉 The `BlockTimeController` controls the progression through views, such that the epoch switchover happens at the intended point in time. We define:
3939

40-
- $\gamma = k\cdot \tau_0$ is the remaining epoch duration of a hypothetical ideal system, where *all* remaining $k$ views of the epoch progress with the ideal view time $\tau_0$.
4140
- $\gamma = k\cdot \tau_0$ is the remaining epoch duration of a hypothetical ideal system, where *all* remaining $k$ views of the epoch progress with the ideal view time $\tau_0$.
4241
- The parameter $\tau_0$ is computed solely based on the Epoch configuration as
4342
$\tau_0 := \frac{<{\rm total\ epoch\ time}>}{<{\rm total\ views\ in\ epoch}>}$ (for mainnet 22, Epoch 75, we have $\tau_0 \simeq$ 1250ms).
@@ -62,30 +61,32 @@ The desired idealized system behaviour would a constant view duration $\tau_0$ t
6261

6362
However, in the real-world system we have disturbances (varying message relay times, slow or offline nodes, etc) and measurement uncertainty (node can only observe its local view times, but not the committee’s collective swarm behaviour).
6463

65-
![](/docs/CruiseControl_BlockTimeController/PID_controller_for_block-rate-delay.png)
64+
<img src='https://github.com/onflow/flow-go/blob/master/docs/CruiseControl_BlockTimeController/PID_controller_for_block-rate-delay.png' width='600'>
65+
6666

6767
After a disturbance, we want the controller to drive the system back to a state, where it can closely follow the ideal behaviour from there on.
6868

6969
- Simulations have shown that this approach produces a *very* stable controller with the intended behaviour.
70-
70+
7171
**Controller driving $e := \gamma - \Gamma \rightarrow 0$**
7272
- setting the differential term $K_d=0$, the controller responds as expected with damped oscillatory behaviour
7373
to a singular strong disturbance. Setting $K_d=3$ suppresses oscillations and the controller's performance improves as it responds more effectively.
7474

75-
![](/docs/CruiseControl_BlockTimeController/EpochSimulation_029.png)
76-
![](/docs/CruiseControl_BlockTimeController/EpochSimulation_030.png)
77-
75+
<img src='https://github.com/onflow/flow-go/blob/master/docs/CruiseControl_BlockTimeController/EpochSimulation_029.png' width='900'>
76+
77+
<img src='https://github.com/onflow/flow-go/blob/master/docs/CruiseControl_BlockTimeController/EpochSimulation_030.png' width='900'>
78+
7879
- controller very quickly compensates for moderate disturbances and observational noise in a well-behaved system:
7980

80-
![](/docs/CruiseControl_BlockTimeController/EpochSimulation_028.png)
81-
81+
<img src='https://github.com/onflow/flow-go/blob/master/docs/CruiseControl_BlockTimeController/EpochSimulation_028.png' width='900'>
82+
8283
- controller compensates massive anomaly (100s network partition) effectively:
8384

84-
![](/docs/CruiseControl_BlockTimeController/EpochSimulation_000.png)
85-
85+
<img src='https://github.com/onflow/flow-go/blob/master/docs/CruiseControl_BlockTimeController/EpochSimulation_000.png' width='900'>
86+
8687
- controller effectively stabilizes system with continued larger disturbances (20% of offline consensus participants) and notable observational noise:
8788

88-
![](/docs/CruiseControl_BlockTimeController/EpochSimulation_005-0.png)
89+
<img src='https://github.com/onflow/flow-go/blob/master/docs/CruiseControl_BlockTimeController/EpochSimulation_005-0.png' width='900'>
8990
9091
**References:**
9192

@@ -111,15 +112,15 @@ $\tau_0 := \frac{<{\rm total\ epoch\ time}>}{<{\rm total\ views\ in\ epoch}>}$
111112

112113
- remaining views of the epoch $k[v] := F[v] +1 - v$
113114
- time remaining until the desired epoch switchover $\Gamma[v] := T[v]-t[v]$
114-
- error $e[v] := \underbrace{k\cdot\tau_0}_{\gamma[v]} - \Gamma[v] = t[v] + k\cdot\tau_0 - T[v]$
115+
- error $e[v] := \underbrace{k\cdot\tau_0}_{\gamma[v]} - \Gamma[v] = t[v] + k[v] \cdot\tau_0 - T[v]$
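The definitions above can be sketched in Go as follows. This is a minimal illustration and not the flow-go implementation: the function names `remainingViews` and `timingError` are hypothetical, and both $t[v]$ and $T[v]$ are assumed to be given as offsets from a common reference point.

```go
package main

import (
	"fmt"
	"time"
)

// remainingViews returns k[v] := F[v] + 1 - v, where finalView is F[v].
// Hypothetical helper for illustration only.
func remainingViews(finalView, view uint64) uint64 {
	return finalView + 1 - view
}

// timingError returns e[v] = t[v] + k[v]*tau0 - T[v].
// now (t[v]) and targetEnd (T[v]) are offsets from a common reference.
// A positive result means the idealized projection ends after the
// desired epoch switchover.
func timingError(now, targetEnd time.Duration, k uint64, tau0 time.Duration) time.Duration {
	projectedEnd := now + time.Duration(k)*tau0 // t[v] + gamma[v]
	return projectedEnd - targetEnd
}

func main() {
	tau0 := 1250 * time.Millisecond // example value for the ideal view time
	k := remainingViews(1000, 901)  // 100 views remain
	e := timingError(0, 120*time.Second, k, tau0)
	fmt.Println("remaining views:", k, "error:", e)
}
```

In this example, 100 remaining views at 1250ms each project an epoch end 125s from now, whereas the target is 120s away, so the error is 5s.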
115116

116117
### Precise convention of View Timing
117118

118119
Upon observing block `B` with view $v$, the controller updates its internal state.
119120

120121
Note the '+1' term in the computation of the remaining views $k[v] := F[v] +1 - v$ . This is related to our convention that the epoch begins (happy path) when observing the first block of the epoch. Only by observing this block, the nodes transition to the first view of the epoch. Up to that point, the consensus replicas remain in the last view of the previous epoch, in the state of `having processed the last block of the old epoch and voted for it` (happy path). Replicas remain in this state until they see a confirmation of the view (either QC or TC for the last view of the previous epoch).
121122

122-
![](/docs/CruiseControl_BlockTimeController/ViewDurationConvention.png)
123+
<img src='https://github.com/onflow/flow-go/blob/master/docs/CruiseControl_BlockTimeController/ViewDurationConvention.png' width='600'>
123124

124125
In accordance with this convention, observing the proposal for the last view of an epoch, marks the start of the last view. By observing the proposal, nodes enter the last view, verify the block, vote for it, the primary aggregates the votes, constructs the child (for first view of new epoch). The last view of the epoch ends, when the child proposal is published.
125126

@@ -180,11 +181,11 @@ In particular systematic observation bias are a problem, as it leads to a diverg
180181
```math
181182
\eqalign{
182183
\textnormal{initialization: }\quad \bar{\mathcal{I}} :&= 0 \\
183-
\textnormal{update with instantaneous error\ } e[v]:\quad \bar{\mathcal{I}}[v] &= e[v] + (1-\beta)\cdot\bar{\mathcal{I}}[v-1]
184+
\textnormal{update with instantaneous error\ } e[v]:\quad \bar{\mathcal{I}}[v] &= e[v] + (1-\lambda)\cdot\bar{\mathcal{I}}[v-1]
184185
}
185186
```
186187

187-
Intuitively, the loss factor $\beta$ relates to the time window of the integrator. A factor of 0 means an infinite time horizon, while $\beta =1$ makes the integrator only memorize the last input. Let $\beta \equiv \frac{1}{N_\textnormal{itg}}$ and consider a constant input value $x$. Then $N_\textnormal{itg}$ relates to the number of past samples that the integrator remembers:
188+
Intuitively, the loss factor $\lambda$ relates to the time window of the integrator. A factor of 0 means an infinite time horizon, while $\lambda =1$ makes the integrator only memorize the last input. Let $\lambda \equiv \frac{1}{N_\textnormal{itg}}$ and consider a constant input value $x$. Then $N_\textnormal{itg}$ relates to the number of past samples that the integrator remembers:
188189

189190
- the integrator's output will saturate at $x\cdot N_\textnormal{itg}$
190191
- an integrator initialized with 0, reaches 2/3 of the saturation value $x\cdot N_\textnormal{itg}$ after consuming $N_\textnormal{itg}$ inputs
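The saturation behaviour can be checked numerically with a small Go sketch of the update rule $\bar{\mathcal{I}}[v] = e[v] + (1-\lambda)\cdot\bar{\mathcal{I}}[v-1]$. The type and method names here are hypothetical and chosen for illustration; they are not necessarily those used in the `cruisectl` package.

```go
package main

import "fmt"

// LeakyIntegrator is an illustrative leaky integrator with loss factor lambda.
type LeakyIntegrator struct {
	lambda float64 // loss factor, lambda = 1/N_itg
	value  float64 // current integrator state, initialized with 0
}

// NewLeakyIntegrator creates an integrator remembering roughly nItg samples.
func NewLeakyIntegrator(nItg float64) *LeakyIntegrator {
	return &LeakyIntegrator{lambda: 1 / nItg}
}

// Add feeds the instantaneous error e and returns the updated state.
func (li *LeakyIntegrator) Add(e float64) float64 {
	li.value = e + (1-li.lambda)*li.value
	return li.value
}

func main() {
	const nItg = 50
	const x = 1.0 // constant input value
	li := NewLeakyIntegrator(nItg)
	var afterN, last float64
	for i := 1; i <= 5000; i++ {
		last = li.Add(x)
		if i == nItg {
			afterN = last
		}
	}
	fmt.Printf("fraction of saturation after %d inputs: %.3f\n", nItg, afterN/(x*nItg))
	fmt.Printf("long-run value: %.3f, saturation x*N_itg: %.1f\n", last, x*nItg)
}
```

For a constant input, the state after $n$ inputs is $x\cdot N_\textnormal{itg}\cdot\big(1-(1-\lambda)^n\big)$. With $N_\textnormal{itg}=50$ this gives about 64% of the saturation value after 50 inputs, which the text rounds to 2/3.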
@@ -221,63 +222,77 @@ with parameters:
221222
- $K_i = 0.6$
222223
- $K_d = 3.0$
223224
- $N_\textnormal{ewma} = 5$, i.e. $\alpha = \frac{1}{N_\textnormal{ewma}} = 0.2$
224-
- $N_\textnormal{itg} = 50$, i.e. $\beta = \frac{1}{N_\textnormal{itg}} = 0.02$
225+
- $N_\textnormal{itg} = 50$, i.e. $\lambda = \frac{1}{N_\textnormal{itg}} = 0.02$
225226

226227
The controller output $u[v]$ represents the amount of time by which the controller wishes to deviate from the ideal view duration $\tau_0$. In other words, the duration of view $v$ that the controller wants to set is
227228
```math
228229
\widehat{\tau}[v] = \tau_0 - u[v]
229230
```
230231
---
231232

233+
### Limits of authority
232234

233-
For further details about
234-
235-
- the statistical model of the view duration, see [ID controller for ``block-rate-delay``](https://www.notion.so/ID-controller-for-block-rate-delay-cc9c2d9785ac4708a37bb952557b5ef4?pvs=21)
236-
- the simulation and controller tuning, see [flow-internal/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller](https://github.com/dapperlabs/flow-internal/tree/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller)[controller_tuning_v01.py](https://github.com/dapperlabs/flow-internal/blob/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller/controller_tuning_v01.py)
237-
238-
### Limits of authority
235+
[Latest update: Crescendo Upgrade, June 2024]
239236

240237
In general, there is no bound on the controller output $u$. However, it is important to limit the controller’s influence to keep $u$ within a sensible range.
241238

242239
- upper bound on view duration $\widehat{\tau}[v]$ that we allow the controller to set:
243240

244-
The current timeout threshold is set to 2.5s. Therefore, the largest view duration we want to allow the controller to set is 1.6s.
245-
Thereby, approx. 900ms remain for message propagation, voting and constructing the child block, which will prevent the controller to drive the node into timeout with high probability.
246-
241+
The current timeout threshold is set to 1045ms and the largest view duration we want to allow the controller to set is $\tau_\textrm{max}$ = 910ms.
242+
Thereby, we have a buffer $\beta$ = 135ms remaining for message propagation and the replicas validating the proposal for view $v$.
243+
244+
Note the subtle but important aspect: Primary for view $v$ controls duration of view $v-1$. This is because its proposal for view $v$
245+
contains the proof (Quorum Certificate [QC]) that view $v-1$ concluded on the happy path. By observing the QC for view $v-1$, nodes enter the
246+
subsequent view $v$.
247+
248+
247249
- lower bound on the view duration:
248250

249-
Let $t_\textnormal{p}[v]$ denote the time when the primary for view $v$ has constructed its block proposal.
250-
The time difference $t_\textnormal{p}[v] - t[v]$ between the primary entering the view and having its proposal
251-
ready is the minimally required time to execute the protocol. The controller can only *delay* broadcasting the block,
251+
Let $t_\textnormal{p}[v]$ denote the time when the primary for view $v$ has constructed its block proposal.
252+
On the happy path, a replica concludes view $v-1$ and transitions to view $v$, when it observes the proposal for view $v$.
253+
The duration $t_\textnormal{p}[v] - t[v-1]$ is the time between the primary observing the parent block (view $v-1$), collecting votes,
254+
constructing a QC for view $v-1$, and subsequently its own proposal for view $v$. This duration is the minimally required time to execute the protocol.
255+
The controller can only *delay* broadcasting the block,
252256
but it cannot release the block before $t_\textnormal{p}[v]$ simply because the proposal isn’t ready any earlier.
253257

254258

255259

256260
👉 Let $\hat{t}[v]$ denote the time when the primary for view $v$ *broadcasts* its proposal. We assign:
257261

258262
```math
259-
\hat{t}[v] := \max\big(t[v] +\min(\widehat{\tau}[v],\ 2\textnormal{s}),\ t_\textnormal{p}[v]\big)
263+
\hat{t}[v] := \max\Big(t[v-1] +\min(\widehat{\tau}[v-1],\ \tau_\textrm{max}),\ t_\textnormal{p}[v]\Big)
260264
```
265+
This equation guarantees that the controller does not drive consensus into a timeout, as long as broadcasting the block and its validation
266+
together require less than time $\beta$. Currently, we have $\tau_\textrm{max}$ = 910ms as the upper bound for view durations that the controller can set.
267+
In comparison, for HotStuff's timeout threshold we set $\texttt{hotstuff-min-timeout} = \tau_\textrm{max} + \beta$, with $\beta$ = 135ms.
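As a rough illustration of the broadcast-time rule and of the relation $\texttt{hotstuff-min-timeout} = \tau_\textrm{max} + \beta$, consider the following Go sketch. The constant and function names are hypothetical, all times are assumed to be offsets from a common reference, and the code is not taken from the flow-go implementation.

```go
package main

import (
	"fmt"
	"time"
)

const (
	tauMax             = 910 * time.Millisecond // upper bound on controller-set view duration
	beta               = 135 * time.Millisecond // buffer for propagation and validation
	hotstuffMinTimeout = tauMax + beta          // 1045ms, matches the new flag default
)

// broadcastTime returns the time at which the primary for view v broadcasts:
//
//	max(t[v-1] + min(tauHat[v-1], tauMax), t_p[v])
//
// enteredPrevView is t[v-1], desiredDuration is tauHat[v-1], and
// proposalReady is t_p[v].
func broadcastTime(enteredPrevView, desiredDuration, proposalReady time.Duration) time.Duration {
	capped := desiredDuration
	if capped > tauMax {
		capped = tauMax // the controller may not stretch a view beyond tauMax
	}
	target := enteredPrevView + capped
	if proposalReady > target {
		return proposalReady // the proposal cannot be released before it exists
	}
	return target
}

func main() {
	fmt.Println("hotstuff-min-timeout:", hotstuffMinTimeout)
	// Controller asks for 1.5s, proposal ready after 300ms: capped at tauMax.
	fmt.Println(broadcastTime(0, 1500*time.Millisecond, 300*time.Millisecond))
	// Controller asks for 200ms, proposal ready after 300ms: limited by readiness.
	fmt.Println(broadcastTime(0, 200*time.Millisecond, 300*time.Millisecond))
}
```

In the first call the requested duration is capped at 910ms, which leaves the 135ms buffer before the 1045ms timeout. In the second call the broadcast happens at 300ms, because the controller can only delay a proposal and cannot release it before it is constructed.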
261268

262269

263270

264-
## Edge Cases
271+
### Further reading
265272

266-
### A node is catching up
273+
- the statistical model of the view duration, see [PID controller for ``block-rate-delay``](https://www.notion.so/ID-controller-for-block-rate-delay-cc9c2d9785ac4708a37bb952557b5ef4?pvs=21)
274+
- the simulation and controller tuning, see [flow-internal/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller](https://github.com/dapperlabs/flow-internal/tree/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller)[controller_tuning_v01.py](https://github.com/dapperlabs/flow-internal/blob/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller/controller_tuning_v01.py)
275+
- The most recent parameter setting was derived here:
276+
- [Cruise-Control headroom for speedups](https://www.notion.so/flowfoundation/Cruise-Control-headroom-for-speedups-46dc17e07ae14462b03341e4432a907d?pvs=4) contains the formal analysis and discusses the numerical results in detail
277+
- Python code for figures and calculating the final parameter settings: [flow-internal/analyses/pacemaker_timing/2024-03_Block-timing-update](https://github.com/dapperlabs/flow-internal/tree/master/analyses/pacemaker_timing/2024-03_Block-timing-update)[timeout-attacks.py](https://github.com/dapperlabs/flow-internal/blob/master/analyses/pacemaker_timing/2024-03_Block-timing-update/timeout-attacks.py)
267278

268-
When a node is catching up, it processes blocks more quickly than when it is up-to-date, and therefore observes a faster view rate. This would cause the node’s `BlockRateManager` to compensate by increasing the block rate delay.
269279

270-
As long as delay function is responsive, it doesn’t have a practical impact, because nodes catching up don’t propose anyway.
280+
## Edge Cases
281+
282+
### A node is catching up
271283

272-
To the extent the delay function is not responsive, this would cause the block rate to slow down slightly, when the node is caught up.
284+
When a node is catching up, it observes the blocks significantly later than they were published. In other words, from the perspective
285+
of the node catching up, the blocks are too late. However, as it reaches the most recent blocks, the observed timing error also approaches zero
286+
(assuming approximately correct block publication by the honest supermajority). Nevertheless, due to its biased error observations, the node
287+
catching up could still try to compensate for the network being behind, and publish its proposal as early as possible.
273288

274-
**Assumption:** as we assume that only a smaller fraction of nodes go offline, the effect is expected to be small and easily compensated for by the supermajority of online nodes.
289+
**Assumption:** With only a smaller fraction of nodes being offline or catching up, the effect is expected to be small and easily compensated for by the supermajority of online nodes.
275290

276291
### A node has a misconfigured clock
277292

278293
Cap the maximum deviation from the default delay (limits the general impact of error introduced by the `BlockTimeController`). The node with misconfigured clock will contribute to the error in a limited way, but as long as the majority of nodes have an accurate clock, they will offset this error.
279294

280-
**Assumption:** few enough nodes will have a misconfigured clock, that the effect will be small enough to be easily compensated for by the supermajority of correct nodes.
295+
**Assumption:** With only a smaller fraction of nodes having misconfigured clocks, the effect will be small enough to be easily compensated for by the supermajority of correct nodes.
281296

282297
### Near epoch boundaries
283298

@@ -287,7 +302,9 @@ We might incorrectly compute high error in the target view rate, if local curren
287302

288303
### EFM
289304

290-
We need to detect EFM and revert to a default block-rate-delay (stop adjusting).
305+
When the network is in EFM, epoch timing is disrupted anyway. The main thing we want to avoid is that the controller drives consensus into a timeout.
306+
This is largely guaranteed, due to the limits of authority. Beyond that, pretty much any block timing on the happy path is acceptable.
307+
Though, the optimal solution would be a consistent view time throughout normal Epochs as well as EFM.
291308

292309
## Testing
293310
