feat: implement clusterStagedUpdateRun execution #1000

Open · wants to merge 2 commits into main

Conversation

jwtty (Contributor) commented Dec 23, 2024

Description of your changes

Implement clusterStagedUpdateRun execution. This also includes some refactoring.

Fixes #

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

continue
}
// The cluster status is not deleting yet
if err := r.Client.Delete(ctx, binding); err != nil {
Contributor Author

Compared with the original implementation, I did not check whether the binding is up-to-date before deleting it. I don't think that's necessary. Do we want that check?

Contributor

We do support multiple update runs in parallel; I am not sure if there is any race condition where we may delete some other run's binding.

Contributor Author

In case of both resource changes and scheduling policy changes, the binding to be deleted may not have up-to-date resources. If there are concurrent update runs, the same binding could be deleted by either updateRun. If a new scheduling policy change marks this to-be-deleted binding as scheduled again, this updateRun can fail during validation. And if there's a chance that such a change happens right after validation passes and before the bindings are deleted, I can add a check to make sure the binding's state is still unScheduled, and report an unexpected error if it is not.
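For illustration, a minimal sketch of such a check (the helper name deleteUnscheduledBindings is hypothetical and the import paths are assumed; errStagedUpdatedAborted and controller.NewUnexpectedBehaviorError already exist in this package, and Spec.State / BindingStateUnscheduled are assumed from the current fleet placement v1beta1 API):

package updaterun

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/klog/v2"
	"sigs.k8s.io/controller-runtime/pkg/client"

	placementv1beta1 "go.goms.io/fleet/apis/placement/v1beta1"
	"go.goms.io/fleet/pkg/utils/controller"
)

// deleteUnscheduledBindings deletes the to-be-deleted bindings, refusing to touch any
// binding that a concurrent scheduling change has re-claimed after validation passed.
func deleteUnscheduledBindings(ctx context.Context, c client.Client, bindings []*placementv1beta1.ClusterResourceBinding) error {
	for _, binding := range bindings {
		if binding.Spec.State != placementv1beta1.BindingStateUnscheduled {
			// The binding is no longer unscheduled; deleting it now could race with another update run.
			unexpectedErr := controller.NewUnexpectedBehaviorError(
				fmt.Errorf("binding %s is no longer unscheduled, refusing to delete it", binding.Name))
			klog.ErrorS(unexpectedErr, "Aborting the delete stage", "binding", klog.KObj(binding))
			return fmt.Errorf("%w: %s", errStagedUpdatedAborted, unexpectedErr.Error())
		}
		if err := c.Delete(ctx, binding); err != nil && !apierrors.IsNotFound(err) {
			return err
		}
	}
	return nil
}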

Contributor Author

I do have a concern: in the rollout controller, we make sure to complete the deletion of the bindings on a cluster before scheduling a newly created binding on the same cluster: https://github.com/Azure/fleet/blob/main/pkg/controllers/rollout/controller.go#L226. I wonder if we should have similar logic in the updateRun controller too.
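To illustrate the kind of safeguard being discussed (a rough sketch only; the helper name and the use of Spec.TargetCluster / DeletionTimestamp are assumptions, not code from this PR), the updateRun controller could collect the clusters that still have a binding mid-deletion and requeue instead of touching a newer binding on those clusters:

// clustersWithPendingDeletion returns the clusters that still have a binding in the
// middle of deletion; the caller would requeue for those clusters rather than update
// a newer binding on them, mirroring the rollout controller's wait-for-cleanup logic.
func clustersWithPendingDeletion(deletingBindings []*placementv1beta1.ClusterResourceBinding) map[string]bool {
	pending := make(map[string]bool, len(deletingBindings))
	for _, binding := range deletingBindings {
		if !binding.DeletionTimestamp.IsZero() {
			// The binding still has finalizers to clear; its target cluster is not safe to reuse yet.
			pending[binding.Spec.TargetCluster] = true
		}
	}
	return pending
}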

klog.V(2).InfoS("Executing the clusterStagedUpdateRun", "clusterStagedUpdateRun", runObjRef, "updatingStageIndex", updatingStageIndex,
"toBeUpdatedBindings count", len(toBeUpdatedBindings), "toBeDeletedBindings count", len(toBeDeletedBindings))
return runtime.Result{RequeueAfter: stageUpdatingWaitTime}, nil
// The previous run is completed but the update to the status failed.
Contributor

There was a check on the staged run status that got removed. I don't think this comment is accurate without that check.

Contributor Author

The check I removed covers the case when all the updating stages have finished and the deletion stage is about to start; in that case, updatingStageIndex == len(updatingStages). When updatingStageIndex == -1, all the stages, including the deletion stage, have finished but the recordUpdateRunSucceeded call on line 143 somehow failed.
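For readers following the thread, a simplified illustration of that index convention (the function and its string results are made up for the example; only recordUpdateRunSucceeded is a real helper in the controller):

// nextExecutionStep maps the special updatingStageIndex values onto the action the
// controller takes in the next reconcile.
func nextExecutionStep(updatingStageIndex, numUpdatingStages int) string {
	switch {
	case updatingStageIndex == -1:
		// Every stage, including the delete stage, already finished; only the earlier
		// recordUpdateRunSucceeded status update failed, so it just needs to be retried.
		return "retry marking the update run as succeeded"
	case updatingStageIndex == numUpdatingStages:
		// All updating stages are done; the delete stage runs next.
		return "execute the delete stage"
	default:
		// Still working through the updating stages.
		return "execute the current updating stage"
	}
}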

pkg/controllers/updaterun/execution.go (resolved comment, collapsed)
}
klog.V(2).InfoS("Updated the status of a binding to bound", "binding", klog.KObj(binding), "cluster", clusterStatus.ClusterName, "stage", updatingStageStatus.StageName, "clusterStagedUpdateRun", updateRunRef)
} else {
if _, updateErr := checkClusterUpdateResult(binding, clusterStatus, updatingStageStatus, updateRun); updateErr != nil {
Contributor

are those errors always retriable?

Contributor Author

For now, yes, because we currently do not identify any terminal errors for cluster updates (I have a TODO in this function). But even in the future, when we can identify some, I'll return an errExecutionAborted inside the checkClusterUpdateResult function and can still return the updateErr directly.
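A sketch of that future direction (hypothetical; the isTerminal classifier is passed in because nothing like it exists yet, and the snippet assumes the package's fmt import and the existing errExecutionAborted sentinel): wrap the terminal failure with errExecutionAborted where it is detected, so callers keep returning updateErr unchanged and the run still aborts.

// wrapIfTerminal shows how a terminal cluster-update failure could be wrapped with
// errExecutionAborted inside checkClusterUpdateResult, leaving the caller's error
// handling exactly as it is today.
func wrapIfTerminal(err error, isTerminal func(error) bool) error {
	if err != nil && isTerminal(err) {
		// Terminal failures stop the whole update run instead of being retried.
		return fmt.Errorf("%w: %s", errExecutionAborted, err.Error())
	}
	return err
}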

Comment on lines -308 to -317
// The delete stage is still updating.
if condition.IsConditionStatusTrue(deleteStageProgressingCond, updateRun.Generation) {
klog.InfoS("The delete stage is updating", "clusterStagedUpdateRun", updateRunRef)
return totalStages, nil
}
// All stages have finished, but the delete stage is not active or finished.
unexpectedErr := controller.NewUnexpectedBehaviorError(fmt.Errorf("the delete stage is not active, but all stages finished"))
klog.ErrorS(unexpectedErr, "There is no stage active", "clusterStagedUpdateRun", updateRunRef)
return -1, fmt.Errorf("%w: %s", errStagedUpdatedAborted, unexpectedErr.Error())
}
Contributor

why delete this check?

Contributor Author

@jwtty jwtty Dec 26, 2024

After the last updating stage completes, execution does not mark the deletion stage as started. Instead, it returns a 0 wait-time and moves on to the delete stage in the next reconcile loop, just like finishing a regular updating stage and moving on to the next updating stage. With this check, the validation in the next reconcile would fail because the deletion stage is not marked as started yet, so I removed this part.
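Roughly, the flow being described (a sketch, not the PR's literal code; waitBeforeNextReconcile and its boolean parameter are invented for illustration, a time import is assumed, and stageUpdatingWaitTime is the existing package constant):

// waitBeforeNextReconcile picks the requeue delay after finishing a stage's work in this pass.
func waitBeforeNextReconcile(finishedAllUpdatingStages bool) time.Duration {
	if finishedAllUpdatingStages {
		// Requeue immediately; the delete stage is picked up on the next reconcile.
		return 0
	}
	// Otherwise keep polling the in-progress updating stage.
	return stageUpdatingWaitTime
}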

pkg/controllers/updaterun/execution_integration_test.go (outdated comment, resolved)