MCO-1002: Add a flag to allow irreconcilable configs #2244
@pablintino: This pull request references MCO-1002 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.19.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Hello @pablintino! Some important instructions when contributing to openshift/api: |
// ignoreIrreconcilableConfig tells the operator to ignore irreconciliable configuration changes
// in already existing nodes. New nodes joining the cluster will see the newest configuration.
// +optional
IgnoreIrreconciliableConfig bool `json:"ignoreIrreconcilableConfig"`
Do not use a boolean here. Things generally start as booleans but can end up progressing to need more than a true/false option. Make this an enum instead so that you can add additional options in the future if needed. Using a boolean makes it so you cannot change this to support other options in the future.
For more information on why we encourage enums over booleans see: https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#do-not-use-boolean-fields
Additionally, this is being added to a v1 API - should this be feature-gated?
@everettraven I totally agree that what you propose is nicer and more maintainable. I've updated the PR to use a proper enum with a default value that means "behave as you do today", that is, validate the configs.
About the feature gate: @yuqi-zhang thought, and I agree with him, that this shouldn't require one, since it's a knob that lets some customers (informed beforehand about the implications of using the non-default value) skip the validation of the MCO config for new nodes. I'll let him reply when he is back from PTO in 3 weeks, as there's no rush with this PR.
6fd81f1 to aaf341e
// Valid values are Strict and Relaxed:
// Strict: Rejects changes to MachineConfigs if fields that doesn't support to be updated are changed.
// Relaxed: Changes to protected fields are allowed and will be applied in new nodes joining the cluster.
// +kubebuilder:validation:Enum=Strict;Relaxed
Create constants for the valid types.
Also, because this is an optional field, "" is a valid value.
Done, thanks!
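For readers following the thread, a minimal sketch of what "constants for the valid types" with "" as an accepted value could look like. The type name `ValidationPolicy`, the constant names, and the `IsValid` helper are hypothetical and are not taken from the patch; only the values Strict, Relaxed, and the empty string come from the discussion.

```go
package main

import "fmt"

// ValidationPolicy is a hypothetical string-backed enum for this sketch.
type ValidationPolicy string

const (
	// PolicyUnset is the empty string, valid because the field is optional.
	PolicyUnset ValidationPolicy = ""
	// PolicyStrict and PolicyRelaxed mirror the two values named in the review.
	PolicyStrict  ValidationPolicy = "Strict"
	PolicyRelaxed ValidationPolicy = "Relaxed"
)

// IsValid reports whether p is one of the allowed values, including unset.
func (p ValidationPolicy) IsValid() bool {
	switch p {
	case PolicyUnset, PolicyStrict, PolicyRelaxed:
		return true
	}
	return false
}

func main() {
	for _, p := range []ValidationPolicy{"", "Strict", "Relaxed", "Loose"} {
		fmt.Printf("%q valid: %v\n", string(p), p.IsValid())
	}
}
```

Named constants keep callers from scattering string literals, and a new value added later only needs one more constant and one more `case`.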
// +kubebuilder:validation:Default=Strict
// +default="Strict"
Do you really want to set an explicit default here? Will you always and forever default to Strict?
A common pattern when it comes to defaulting for enums in configuration type APIs is to have the system choose the default. This allows us to change it as we see fit, where setting an explicit default does not.
I agree with what you propose. I've added the empty string as an option and I'll code the MCO to consider the empty string as a valid input.
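A rough sketch of the "system chooses the default" pattern discussed above, assuming the operator resolves an empty value at read time instead of the API server writing a default on admission. `EffectivePolicy` and `systemDefault` are hypothetical names, and Strict as the current default is an assumption based on the explicit default in the earlier revision.

```go
package main

import "fmt"

// ValidationPolicy is a hypothetical string-backed enum for this sketch.
type ValidationPolicy string

const (
	PolicyUnset   ValidationPolicy = ""
	PolicyStrict  ValidationPolicy = "Strict"
	PolicyRelaxed ValidationPolicy = "Relaxed"
)

// systemDefault lives in operator code, not in the stored object, so it can
// change in a later release without rewriting existing resources.
// Assumption: Strict is the current default.
const systemDefault = PolicyStrict

// EffectivePolicy maps the stored value to the behaviour the operator uses.
// An unset field means "no opinion", so the system default applies.
func EffectivePolicy(stored ValidationPolicy) ValidationPolicy {
	if stored == PolicyUnset {
		return systemDefault
	}
	return stored
}

func main() {
	fmt.Println(EffectivePolicy(PolicyUnset))
	fmt.Println(EffectivePolicy(PolicyRelaxed))
}
```

With an explicit admission-time default, every object would persist "Strict" and a later change of default could not distinguish users who chose it from users who never set it.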
@@ -56,6 +58,17 @@ type MachineConfigurationSpec struct {
// +openshift:enable:FeatureGate=NodeDisruptionPolicy
// +optional
NodeDisruptionPolicy NodeDisruptionPolicyConfig `json:"nodeDisruptionPolicy"`

// configurationValidationPolicy tells the operator how new machine configurations should be validated.
- // configurationValidationPolicy tells the operator how new machine configurations should be validated.
+ // configurationValidationPolicy is an optional field that allows configuring the level of validation performed on new machine configurations.
Is this only done on new machine configurations or does this also apply to changes to existing machine configurations?
It applies only to new changes. The MCO gathers all the machine configurations for each pool of nodes, merges them into a single one, called the rendered config, and performs the validations on it.
This toggle lets the user skip the check for changes between the "current" and the "new" rendered MCs. It sounds odd, but there is an explanation, and there are use cases that justify the need. The main one is a customer who deployed a cluster with filesystem configuration X four years ago. Now they want to add new nodes, but that FS configuration is no longer valid because the hardware is different. With our current approach we validate the new rendered MC and reject it, since changes in the FS sections are not allowed. With this option, the customer acknowledges that we are not performing validations and that their configs may be problematic, but they will be able to handle a scenario like the one I described. New nodes will take the latest MC and old nodes will just ignore the changes.
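To make the behaviour described above concrete, here is a heavily simplified sketch. It is not the MCO's code: `RenderedConfig`, its two fields, `ValidateTransition`, and `ErrIrreconcilable` are all hypothetical, with a single `Filesystem` string standing in for the sections that cannot be changed on existing nodes.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidationPolicy is a hypothetical string-backed enum for this sketch.
type ValidationPolicy string

const (
	PolicyStrict  ValidationPolicy = "Strict"
	PolicyRelaxed ValidationPolicy = "Relaxed"
)

// RenderedConfig is a hypothetical stand-in for a rendered MachineConfig.
type RenderedConfig struct {
	Filesystem string // stands in for an irreconcilable section
	Files      string // stands in for a section that can be updated in place
}

// ErrIrreconcilable is returned when a protected section differs.
var ErrIrreconcilable = errors.New("irreconcilable change detected")

// ValidateTransition compares the current and the new rendered config.
// Under Relaxed the comparison is skipped; any other value, including the
// empty string, is treated as Strict (an assumption of this sketch).
func ValidateTransition(policy ValidationPolicy, current, next RenderedConfig) error {
	if policy == PolicyRelaxed {
		return nil
	}
	if current.Filesystem != next.Filesystem {
		return ErrIrreconcilable
	}
	return nil
}

func main() {
	current := RenderedConfig{Filesystem: "layout-a", Files: "v1"}
	next := RenderedConfig{Filesystem: "layout-b", Files: "v1"}
	fmt.Println("Strict: ", ValidateTransition(PolicyStrict, current, next))
	fmt.Println("Relaxed:", ValidateTransition(PolicyRelaxed, current, next))
}
```

The sketch covers only the accept/reject decision; how existing nodes ignore the change while new nodes pick up the latest config is outside its scope.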
// +kubebuilder:validation:Default=Strict
// +default="Strict"
// +optional
ConfigurationValidationPolicy MachineConfigurationValidationPolicy `json:"configurationValidationPolicy,omitempty"`
Is MachineConfigurationValidationPolicy more descriptive for a user as to what this applies to?
Changed!
// configurationValidationPolicy tells the operator how new machine configurations should be validated.
// Valid values are Strict and Relaxed:
We generally follow this format for stating allowed enum values to try and keep consistent across OCP APIs:
- // Valid values are Strict and Relaxed:
+ // Allowed values are Strict, Relaxed, and omitted.
Also, add a description for what happens when leaving this omitted, like:
- If you are using a system chosen default (i.e the operator chooses):
When omitted, this means no-opinion and the system is left to choose a default. Currently the default is {default}.
- If you are using an explicit default (i.e defaulted on admission):
When omitted, defaults to {default}.
Thanks for both comments. I think the best option is what you proposed in another comment: let the system decide which option is the default.
// configurationValidationPolicy tells the operator how new machine configurations should be validated.
// Valid values are Strict and Relaxed:
// Strict: Rejects changes to MachineConfigs if fields that doesn't support to be updated are changed.
We generally follow this format on OpenShift APIs when talking about what happens when setting a specific allowed value:
- // Strict: Rejects changes to MachineConfigs if fields that doesn't support to be updated are changed.
+ // When set to Strict, changes to MachineConfigs fields that doesn't support to be updated are rejected.
What does "doesn't support to be updated" mean? Are these the "protected" fields you mention in the description of the Relaxed setting?
I've added a bit more information in the new patch, including a link to our docs.
I think the new comment in the new patch may express what I mean a bit better.
// configurationValidationPolicy tells the operator how new machine configurations should be validated.
// Valid values are Strict and Relaxed:
// Strict: Rejects changes to MachineConfigs if fields that doesn't support to be updated are changed.
// Relaxed: Changes to protected fields are allowed and will be applied in new nodes joining the cluster.
- // Relaxed: Changes to protected fields are allowed and will be applied in new nodes joining the cluster.
+ // When set to Relaxed, changes to protected fields are allowed and will be applied in new nodes joining the cluster.
aaf341e to 5655594
@everettraven Thanks for your input, Bryce. I've updated the patch to match it.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: pablintino The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Is there an appropriate enhancement that describes how this is going to work? And in particular I'm concerned about how we will support this? Is this field supportable? This will have to go behind a feature gate at first as well |
3b81d2a to ef71013
Hi @JoelSpeed. I've updated the patch, adding a feature gate and a reference to the enhancement I created, which is still a draft.
ef71013 to faf46db
The flag tells the operator to skip the validation of irreconcilable fields and lets the user update or patch conflicting MCs in the cluster at their own risk. This feature is specifically intended to let users add new nodes with a newer configuration to a long-standing cluster.
faf46db to 5fb3a30
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@pablintino: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |