-
Notifications
You must be signed in to change notification settings - Fork 426
MCO-1580: MCO-1581: Achieving parity with MCO node disruption frequency #4996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
b3d97fc
to
a69117f
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dkhater-redhat The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
a69117f
to
435bbef
Compare
@dkhater-redhat: This pull request references MCO-1580 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
8452b7a
to
32555ad
Compare
ef4dc03
to
621dc6c
Compare
633afa0
to
db06f06
Compare
5281786
to
3347699
Compare
…g and reusing MOSB's with updated rendered spec
3347699
to
0c73a25
Compare
@dkhater-redhat: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
return err | ||
} | ||
|
||
if (oldRendered != newRendered && needsImageRebuild) || firstOptIn == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be a case where the rendered MC hasn't changed but a rebuild is still needed - for example when the rebuild
annotation is applied. I think this should be an OR check instead of AND.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah there is a flaw with this. If I do not specify that check currently, then we will "reuse an MOSB" the first time we opt into OCL. This needs to be worked out. I forget if that firstOptIn signal actually works lol
// but populates its status from oldMosb so that no build actually runs. | ||
func (b *buildReconciler) reuseImageForNewMOSB(ctx context.Context, mosc *mcfgv1.MachineOSConfig, oldMosb *mcfgv1.MachineOSBuild, | ||
) error { | ||
// Look up the MCP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might need to add a check here to verify that the image still exists in the registry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should still exist? is there a scenario that the MOSB would cite an image but it doesn't exist in the registry? Becuase we are reusing the image from the existing build onto the new MOSB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if the user deletes the image from the registry, the MOSB has no idea that it is gone. There was a bug about this, so we added a skopeo inspect
check to ensure that the image still exists before continuing. If it doesn't, we will rebuild the MOSB again and this is the other scenario in which a rebuild with the same name happens. These are the PRs that fixed it #4975 and #4807
} | ||
|
||
klog.Info("getting nodes for pool") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add the pool name that we are getting the nodes for.
Same for the other info logs below, but maybe those were for your own debugs and have been left behind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
going to clean those logs up at the end. this was for debugging purposes
@@ -676,9 +676,9 @@ func (dn *Daemon) initializeNode() error { | |||
//nolint:gocyclo | |||
func (dn *Daemon) syncNode(key string) error { | |||
startTime := time.Now() | |||
klog.V(4).Infof("Started syncing node %q (%v)", key, startTime) | |||
klog.Infof("Started syncing node %q (%v)", key, startTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to change the level of this log? Same for the one right below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will change back when PR looks good, this was for debugging purposes.
if err != nil { | ||
return err | ||
} | ||
newMosb.SetOwnerReferences([]metav1.OwnerReference{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought (non-blocking): We may want to push the SetOwnerReferences()
part into the NewMachineOSBuild()
constructor since we already have the MOSC there and it feels like something the MOSB constructor should be setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this got a little hacky because I was trying to reuse the preexisting functions to reuse the MOSB but it would feed into creating a new job regardless. ill look into this.
- What I did
- How to verify it
- Description for the changelog