Commit 6fc302b

Add MCO-1002 enhancement
1 parent 995b620 commit 6fc302b

1 file changed

+189
-0
lines changed

@@ -0,0 +1,189 @@
1+
---
2+
title: machine-config-non-reconcilable-changes
3+
authors:
4+
- "@pablintino"
5+
reviewers:
6+
- "@yuqi-zhang"
7+
approvers:
8+
- "@yuqi-zhang"
9+
api-approvers:
10+
- "@JoelSpeed"
11+
creation-date: 2025-04-23
12+
tracking-link:
13+
- https://issues.redhat.com/browse/MCO-1002
14+
see-also:
15+
replaces:
16+
superseded-by:
17+
---
18+
19+
# MachineConfig Non-Reconcilable Changes
20+
21+
## Summary
22+
23+
This enhancement describes the context around the MCO validation of MC CRs and
why it needs to be skipped under certain specific circumstances. There
are known customer use cases that require making non-reconcilable MC changes.
26+
27+
## Motivation
28+
29+
The MCO performs an internal validation of the rendered MC applied to each
30+
pool before applying it to the nodes. The MCO validation process can be split
31+
into three phases:
32+
33+
1. Parse the Ignition raw configuration.
34+
2. Ensure the Ignition configuration is valid.
35+
3. Ensure there are no changes to non-reconcilable fields.
36+
37+
The first two steps are self-explanatory and are not covered by this
38+
enhancement, as the MCO will always perform them. The third one, the
39+
validation of non-reconcilable fields, is the main target of this enhancement.
40+
41+
The rendered MCs are fetched by new nodes when they join, and at that time
Ignition itself runs following the instructions in the rendered MC's Ignition
configuration. After the first boot, the MCO is in charge of applying supported
changes to nodes. Even if a configuration adheres to the Ignition schema, not
all of its fields are supported after the first boot. The non-reconcilable
validation mechanism alerts the user with a detailed message about the changes
in MCs that are not supported.

As stated, the main limitation behind the configuration validation is that the
MCO does not support the complete Ignition schema. Thus, the only way for a
user to make changes that the MCO does not support is to recreate the cluster
from scratch.

To avoid tearing down the cluster and spinning it up from scratch, this
enhancement proposes a flag in the MCO MachineConfiguration CR that tells the
MCO to skip the validation and let the MCS serve the Ignition configuration,
which can then be used by new nodes. Unsupported fields are harmless for
already existing nodes, which, given the new configuration, will only apply
changes for the fields the MCD supports.
58+
59+
### User Stories
60+
61+
* As a cluster admin, I am adding new nodes to a long-standing cluster and I
62+
would like to change the partitions for the new nodes.
63+
* As a cluster admin, I am adding new nodes to a long-standing cluster and the
64+
new hardware requires a different set of kernel arguments that I would like
65+
to introduce.
66+
67+
### Goals
68+
69+
* Add a knob in MCO's MachineConfiguration CR to skip non-reconcilable fields
70+
validation if necessary.
71+
72+
### Non-Goals
73+
74+
* Allow invalid Ignition/MachineConfig fields to be applied.
75+
* Disable non-reconcilable MCs validation by default.
76+
77+
## Proposal
78+
79+
Update the MachineConfiguration CR by adding a new field to the spec that
allows users to bypass validation for non-reconcilable MachineConfig changes.
The field will default to the current behaviour, which is to
validate all rendered MCs.
83+
84+
The MachineConfig Controller and the MachineConfig Daemons will
read the new field at runtime and, if the value explicitly states that the
validation should be skipped, they will let the MC pass and get applied to
the nodes.
88+
89+
MachineConfig Daemons will continue to perform the already supported updates
to nodes, whether or not the non-reconcilable validation is skipped.
Existing nodes that receive only unsupported changes will skip the
update and will be considered updated.
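
The "skip the update" rule for existing nodes can be pictured with a small Go sketch. It is illustrative only: `supportedPrefixes`, `splitChanges` and `needsUpdate` are hypothetical names, and the list of supported field paths is a placeholder, not the set the MCD actually reconciles.

```go
package main

import "strings"

// supportedPrefixes is an illustrative placeholder, NOT the real list of
// fields the MCD can reconcile after first boot.
var supportedPrefixes = []string{"storage.files", "systemd.units"}

// isSupported reports whether a changed field path falls under one of the
// supported prefixes. Matching is done on whole path segments, so
// "storage.filesystems" does not match "storage.files".
func isSupported(path string) bool {
	for _, p := range supportedPrefixes {
		if path == p || strings.HasPrefix(path, p+".") {
			return true
		}
	}
	return false
}

// splitChanges partitions the changed field paths into those a daemon would
// apply and those it would ignore on an already provisioned node.
func splitChanges(changed []string) (supported, unsupported []string) {
	for _, path := range changed {
		if isSupported(path) {
			supported = append(supported, path)
		} else {
			unsupported = append(unsupported, path)
		}
	}
	return supported, unsupported
}

// needsUpdate returns false when every change is unsupported, which is the
// case where the node skips the update and is considered updated.
func needsUpdate(changed []string) bool {
	supported, _ := splitChanges(changed)
	return len(supported) > 0
}
```

With these assumptions, `needsUpdate([]string{"storage.disks"})` is false, so the node would be treated as up to date, while a change set that also touches `storage.files` would still trigger the usual update.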
93+
94+
### Workflow Description
95+
96+
When an MCO user modifies the cluster-wide MachineConfigs, a new rendered
97+
MachineConfig CR is created for each pool that has an association with the
98+
created, modified or deleted MachineConfig. The rendered MachineConfig, before
99+
being applied, is validated against the Ignition Schema.
100+
101+
After the Ignition validation is done, the non-reconcilable fields validation
102+
is performed or skipped based on the proposed
103+
`machineConfigurationValidationPolicy` field in the MachineConfiguration CR.
104+
If the field is set to `Relaxed`, the non-reconcilable fields validation is
skipped; otherwise, it is performed.
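
The ordering described above (parse, validate against Ignition, then conditionally check non-reconcilable fields) can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not the MCO code: `ignitionConfig` is a heavily reduced stand-in for the Ignition schema, the validity check only looks at `ignition.version`, and `storage.disks` is used as the single example of a non-reconcilable field. All type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ValidationPolicy mirrors the two proposed enum values; the Go names are
// hypothetical.
type ValidationPolicy string

const (
	Strict  ValidationPolicy = "Strict"
	Relaxed ValidationPolicy = "Relaxed"
)

type disk struct {
	Device string `json:"device"`
}

// ignitionConfig is a reduced stand-in for the real Ignition schema.
type ignitionConfig struct {
	Ignition struct {
		Version string `json:"version"`
	} `json:"ignition"`
	Storage struct {
		Disks []disk `json:"disks"`
	} `json:"storage"`
}

// parseAndValidate covers phases 1 and 2, which always run.
func parseAndValidate(raw []byte) (*ignitionConfig, error) {
	var cfg ignitionConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parsing Ignition config: %w", err)
	}
	if cfg.Ignition.Version == "" {
		return nil, errors.New("invalid Ignition config: missing ignition.version")
	}
	return &cfg, nil
}

func disksEqual(a, b []disk) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// validateRendered runs phase 3 only when the policy is not Relaxed.
func validateRendered(oldRaw, newRaw []byte, policy ValidationPolicy) error {
	oldCfg, err := parseAndValidate(oldRaw)
	if err != nil {
		return err
	}
	newCfg, err := parseAndValidate(newRaw)
	if err != nil {
		return err
	}
	if policy == Relaxed {
		return nil // non-reconcilable check skipped
	}
	if !disksEqual(oldCfg.Storage.Disks, newCfg.Storage.Disks) {
		return errors.New("non-reconcilable change: storage.disks differs")
	}
	return nil
}
```

Note that even under `Relaxed` a malformed or schema-invalid configuration is still rejected in this sketch, which matches the non-goal of never letting invalid Ignition through.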
106+
107+
The non-reconcilable MC validation itself is unchanged by this enhancement; its
[implementation](https://github.com/openshift/machine-config-operator/blob/e44d380686aee42f784a277236dbac49b083441e/pkg/controller/common/reconcile.go#L69)
is not modified.
110+
111+
### API Extensions
112+
113+
- Update the MachineConfiguration CRD to add an enumeration field called
  `machineConfigurationValidationPolicy` that is used as the validation
  skipping toggle. The field does not set a default value, which lets the MCO
  pick what to do in the default case. The enumeration has only two values:
  - `Strict`: Validation is always performed. This is the value the MCO will
    use as default.
  - `Relaxed`: The validation of non-reconcilable fields is skipped and only
    the Ignition syntactic validation is done.
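
As a rough sketch of how such a field could look in Go, assuming it lives directly in the MachineConfiguration spec: the type, constant and helper names below are hypothetical, only the JSON field name and the two enum values come from this proposal. The helper shows one way the operator could resolve an unset value to `Strict`; treating any value other than `Relaxed` as `Strict` is an assumption of this sketch.

```go
package main

// MachineConfigurationValidationPolicy is the proposed enumeration type.
// The Go identifier is hypothetical.
type MachineConfigurationValidationPolicy string

const (
	// ValidationPolicyStrict always performs the non-reconcilable validation.
	ValidationPolicyStrict MachineConfigurationValidationPolicy = "Strict"
	// ValidationPolicyRelaxed skips the non-reconcilable validation.
	ValidationPolicyRelaxed MachineConfigurationValidationPolicy = "Relaxed"
)

// MachineConfigurationSpec shows only the new field; every other field of
// the real spec is omitted from this sketch.
type MachineConfigurationSpec struct {
	// No API-level default is set, so an empty value reaches the operator.
	MachineConfigurationValidationPolicy MachineConfigurationValidationPolicy `json:"machineConfigurationValidationPolicy,omitempty"`
}

// effectivePolicy resolves the value the operator acts on. Anything other
// than an explicit Relaxed falls back to Strict (assumption of this sketch).
func effectivePolicy(spec MachineConfigurationSpec) MachineConfigurationValidationPolicy {
	if spec.MachineConfigurationValidationPolicy == ValidationPolicyRelaxed {
		return ValidationPolicyRelaxed
	}
	return ValidationPolicyStrict
}
```

Leaving the default out of the API and resolving it in the operator keeps the stored object unchanged for clusters that never set the field.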
121+
122+
### Risks and Mitigations
123+
124+
By setting `machineConfigurationValidationPolicy` to `Relaxed`, the customer
acknowledges that providing MCs that make use of Ignition features outside the
scope of the MCO will lead to a cluster whose nodes use different Ignition
configurations.
128+
129+
### Drawbacks
130+
131+
None.
132+
133+
## Design Details
134+
135+
### Open Questions [optional]
136+
137+
None.
138+
139+
### Test Plan
140+
141+
MCO e2e tests and unit tests will cover this functionality.
142+
143+
### Graduation Criteria
144+
145+
This feature is behind the tech-preview FeatureGate in 4.20.
146+
Once it is tested by QE and users it can be GA'd since it should not impact
147+
daily usage of a cluster.
148+
149+
## Dev Preview -> Tech Preview
150+
151+
Not applicable. Feature introduced in Tech Preview.
152+
153+
## Tech Preview -> GA
154+
155+
Bugs found by e2e tests and QE are fixed.
156+
157+
#### Removing a deprecated feature
158+
159+
### Upgrade / Downgrade Strategy
160+
161+
Upgrades or downgrades are not impacted by the presence or absence of this feature.
162+
163+
### Version Skew Strategy
164+
165+
Not applicable.
166+
167+
### Operational Aspects of API Extensions
168+
169+
#### Failure Modes
170+
171+
If the non-reconcilable configuration validation is performed and fails,
the MCO continues to report the failure on the MCP as it already does, by
setting the MCP's `RenderDegraded` condition to true.
174+
175+
If the configuration reaches the MCD and the non-reconcilable validation
176+
fails, the MCN `UpdatePrepared` condition is updated with the details of the
177+
validation failure.
178+
179+
#### Support Procedures
180+
181+
None.
182+
183+
## Implementation History
184+
185+
Not applicable.
186+
187+
## Alternatives (Not Implemented)
188+
189+
Not applicable.
