Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OS upgrade implementation for single clusters #20

Merged
merged 16 commits into from
Jul 26, 2024

Conversation

ipetrov117
Copy link
Contributor

@ipetrov117 ipetrov117 commented Jul 25, 2024

Introduces OS upgrades for clusters consisting of one node. Multi-node cluster OS upgrade will be handled in a different PR.

Tested for:

  • 5.5 OS package upgrade + K8s upgrade
  • 5.5 -> 6.0 OS migration + K8s upgrade

Migration screenshots:
Before:
Screenshot 2024-07-25 at 15 44 53

After:
Screenshot 2024-07-25 at 16 00 55

os-upgrade.sh.tpl has been refactored and now supports the following functionalities:

  • Validation for the exit code of the systemd upgrade service. Currently with transactional-update run the exit codes are not propagated well, but when we move to transactional-update migration this functionality will be useful
  • In light of the fact that we use SUC and when an OS reboots the SUC job will restart the pod, a validation was added to the script that ensures that transactional-update will not be called again once the OS reboots
  • To ensure a more smooth systemd.service execution the reboot logic was moved away from the systemd.service. It was causing the service to be abruptly canceled, which might lead to all sorts of problems.
  • The systemd.service cleanup logic has been put behind a trap in order to ensure that even if the script fails unexpectedly, the service will be cleaned up.

Copy link
Contributor

@atanasdinov atanasdinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I got some questions but I'm happy to approve once we discuss them.

func (r *UpgradePlanReconciler) recordCreatedPlan(upgradePlan *lifecyclev1alpha1.UpgradePlan, name, namespace string) {
r.Recorder.Eventf(upgradePlan, corev1.EventTypeNormal, "PlanCreated", "Upgrade plan created: %s/%s", namespace, name)
func (r *UpgradePlanReconciler) createSecret(ctx context.Context, upgradePlan *lifecyclev1alpha1.UpgradePlan, secret *corev1.Secret) error {
if err := r.createObject(ctx, upgradePlan, secret); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is fine and will work but I might look into simplifying it when I have some time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I did it like this mainly because it was the least invasive method of introducing this in the existing code.

internal/upgrade/templates/os-upgrade.sh.tpl Outdated Show resolved Hide resolved
internal/upgrade/templates/os-upgrade.sh.tpl Outdated Show resolved Hide resolved
internal/upgrade/templates/os-upgrade.sh.tpl Show resolved Hide resolved
internal/upgrade/templates/os-upgrade.sh.tpl Outdated Show resolved Hide resolved
mountPath := filepath.Join("/host", secretPathRelativeToHost)
controlPlanePlan.Spec.Secrets = []upgradecattlev1.SecretSpec{
{
Name: secretName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it is tempting to just use the constants here but wouldn't it be better if the secret name and script are passed to the function from the outside? Both can be extracted from the secret that we query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the secretName out of the global constants and added it as a parameter to the plan creation function. I cannot do the same for the scriptName mainly because it is a key of the StringData map of the secret and I do not have a good way of extracting the key name from the secret.StringData map. So I decided to leave the scriptName as a global constant.

@ipetrov117 ipetrov117 merged commit 88fbcd2 into suse-edge:main Jul 26, 2024
2 checks passed
@ipetrov117 ipetrov117 deleted the os-upgrade-single branch August 9, 2024 07:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants