You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
OCPBUGS-23514: Better a message upon long CO updating
When it takes too long (90m+ for machine-config and 30m+ for
others) to upgrade a cluster operator, clusterversion shows
a message with the indication that the upgrade might hit
some issue.
This will cover the case in the related OCPBUGS-23538: for some
reason, the pod under the deployment that manages the CO hit
CrashLoopBackOff. Deployment controller does not give useful
conditions in this situation [1]. Otherwise, checkDeploymentHealth [2]
would detect it.
Instead of CVO's figuring out the underlying pod's
CrashLoopBackOff which might be better to be implemented by
deployment controller, it is expected that our cluster admin
starts to dig into the cluster when such a message pops up.
For now, we just modify the condition's message. We could propagate
Fail=True in case such requirements are collected from customers.
[1]. kubernetes/kubernetes#106054
[2]. https://github.com/openshift/cluster-version-operator/blob/08c0459df5096e9f16fad3af2831b62d06d415ee/lib/resourcebuilder/apps.go#L79-L136
0 commit comments