Fix Race conditions in Targeted Deletion of machines by CA #341
Conversation
klog.Infof("machine deployment %s is under rolling update, skipping", machineDeployment.Name)
return nil
}
markedMachines := sets.New(strings.Split(mcd.Annotations[machinesMarkedByCAForDeletion], ",")...) |
Please move this to a small function mcm.getMachinesMarkedByCAForDeletion(mcd) (machineNames sets.Set[string]) which is unit testable and can be consumed by both mcm_cloud_provider.go and mcm_manager.go.
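A minimal sketch of such a helper, assuming the existing machinesMarkedByCAForDeletion annotation key constant and the generic sets package from k8s.io/apimachinery (the import paths shown are assumptions):

```go
import (
	"strings"

	"github.com/gardener/machine-controller-manager/pkg/apis/machine/v1alpha1"
	"k8s.io/apimachinery/pkg/util/sets"
)

// getMachinesMarkedByCAForDeletion returns the machine names stored in the
// machinesMarkedByCAForDeletion annotation of the given MachineDeployment.
func getMachinesMarkedByCAForDeletion(mcd *v1alpha1.MachineDeployment) sets.Set[string] {
	return sets.New(strings.Split(mcd.Annotations[machinesMarkedByCAForDeletion], ",")...)
}
```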
klog.Errorf("[Refresh] failed to get machines for machine deployment %s, hence skipping it. Err: %v", machineDeployment.Name, err.Error())
return err
}
var incorrectlyMarkedMachines []*Ref |
I believe we can omit the Ref struct and just use types.NamespacedName, which is already being used in this file.
@@ -539,12 +567,17 @@ func buildMachineDeploymentFromSpec(value string, mcmManager *McmManager) (*Mach
return machinedeployment, nil
}

func getMachinesMarkedByCAForDeletion(mcd *v1alpha1.MachineDeployment) sets.Set[string] { |
Perhaps using the "comma, ok" idiom here for mcd.Annotations[machinesMarkedByCAForDeletion] would be good for clarity, to avoid an unnecessary split of the empty string.
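A sketch of the same helper using the "comma, ok" idiom, so a missing or empty annotation yields an empty set rather than a set containing one empty string (same assumed imports as above):

```go
func getMachinesMarkedByCAForDeletion(mcd *v1alpha1.MachineDeployment) sets.Set[string] {
	annotValue, ok := mcd.Annotations[machinesMarkedByCAForDeletion]
	if !ok || annotValue == "" {
		return sets.New[string]()
	}
	return sets.New(strings.Split(annotValue, ",")...)
}
```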
Isn't it allowed to read from a nil map? Only writes will cause panic, right?
getMachinesMarkedByCAForDeletion -> getMachineNamesMarkedByCAForDeletion
// ignore the machine deployment if it is in rolling update
if !isRollingUpdateFinished(mcd) {
	klog.Infof("machine deployment %s is under rolling update, skipping", machineDeployment.Name)
return nil |
Shouldn't we return an error here? We are doing it for the other cases of !isRollingUpdateFinished.
I think we should remove this check. Even if a machineDeployment is under rolling update, we should allow the annotation update if needed. wdyt?
// addNodeGroup adds node group defined in string spec. Format:
// minNodes:maxNodes:namespace.machineDeploymentName
func (m *McmManager) addNodeGroup(spec string) error {
machineDeployment, err := buildMachineDeploymentFromSpec(spec, m) |
buildMachineDeploymentFromSpec should also be moved to mcm_manager.go.
func (machineDeployment *MachineDeployment) Refresh() error {
	machineDeployment.scalingMutex.Lock()
	defer machineDeployment.scalingMutex.Unlock()
mcd, err := machineDeployment.mcmManager.machineDeploymentLister.MachineDeployments(machineDeployment.Namespace).Get(machineDeployment.Name) |
This is repeated nearly a dozen times everywhere, including the common error handling: machineDeployment.mcmManager.machineDeploymentLister.MachineDeployments(machineDeployment.Namespace).Get(machineDeployment.Name). Move it to a method on mcmManager called GetMachineDeploymentResource which returns a formatted error that can simply be returned, so that the error message is fully consistent across all uses. We already have methods like mcmManager.getMachinesForMachineDeployment, so this matches the existing convention.
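A rough sketch of the suggested accessor; only the method name comes from the comment above, the parameters and error wording are illustrative and assume the standard fmt import:

```go
// GetMachineDeploymentResource fetches the MachineDeployment with the given namespace/name
// from the lister and wraps a lookup failure in a consistently formatted error.
func (m *McmManager) GetMachineDeploymentResource(namespace, name string) (*v1alpha1.MachineDeployment, error) {
	mcd, err := m.machineDeploymentLister.MachineDeployments(namespace).Get(name)
	if err != nil {
		return nil, fmt.Errorf("unable to fetch MachineDeployment %s/%s: %w", namespace, name, err)
	}
	return mcd, nil
}
```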
if machine.Status.CurrentStatus.Phase == v1alpha1.MachineTerminating || machine.Status.CurrentStatus.Phase == v1alpha1.MachineFailed {
	continue
}
if annotValue, ok := machine.Annotations[machinePriorityAnnotation]; ok && annotValue == priorityValueForCandidateMachines && !markedMachines.Has(machine.Name) { |
Out of curiosity, why is this called priorityValueForCandidateMachines? Shouldn't it be defined as const PriorityDeletionValueForCandidateMachine = 1?
I think we can rename it to PriorityValueForDeletionCandidateMachine. wdyt?
We should also add a comment here to explain what we are doing, so the next person making a patch doesn't have to scratch their head.
	}
}
clone := mcd.DeepCopy()
clone.Annotations[machinesMarkedByCAForDeletion] = strings.Join(updatedMarkedMachines, ",") |
This strings.Join to construct the annotation value is repeated elsewhere. Make a utility method called getMarkedForDeletionAnnotationValue(machineNames []string) string.
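A sketch of the suggested utility, the counterpart of the split helper discussed earlier (assumes the standard strings import):

```go
// getMarkedForDeletionAnnotationValue builds the comma-separated annotation value
// from the given machine names.
func getMarkedForDeletionAnnotationValue(machineNames []string) string {
	return strings.Join(machineNames, ",")
}
```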
klog.Errorf("[Refresh] failed to get machines for machine deployment %s, hence skipping it. Err: %v", machineDeployment.Name, err.Error())
return err
}
var incorrectlyMarkedMachines []*types.NamespacedName |
minor (need not correct): types.NamespacedName is a simple struct - a value object. Modern programming practice discourages using pointers to value objects, so that the data stays inline for processor cache performance. Pointers should be used only for structs holding an active resource (like locks/files/etc) or for class/service objects. Using a simple []types.NamespacedName, with types.NamespacedName instead of *types.NamespacedName, should be followed for newer code.
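For illustration, a value-based variant of the collection above; the helper name and the Machine slice parameter are assumptions, types.NamespacedName comes from k8s.io/apimachinery/pkg/types:

```go
// namespacedNamesOf returns NamespacedNames for the given machines as plain values,
// keeping the slice entries inline rather than behind pointers.
func namespacedNamesOf(machines []*v1alpha1.Machine) []types.NamespacedName {
	names := make([]types.NamespacedName, 0, len(machines))
	for _, machine := range machines {
		names = append(names, types.NamespacedName{Namespace: machine.Namespace, Name: machine.Name})
	}
	return names
}
```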
@@ -539,12 +567,17 @@ func buildMachineDeploymentFromSpec(value string, mcmManager *McmManager) (*Mach
return machinedeployment, nil
}

func getMachinesMarkedByCAForDeletion(mcd *v1alpha1.MachineDeployment) sets.Set[string] { |
Rename the function to getMachineNamesMarkedByCAForDeletion.
@@ -508,27 +474,32 @@ func (m *McmManager) DeleteMachines(targetMachineRefs []*Ref) error {
	if !isRollingUpdateFinished(md) {
		return fmt.Errorf("MachineDeployment %s is under rolling update , cannot reduce replica count", commonMachineDeployment.Name)
	}
markedMachines := getMachinesMarkedByCAForDeletion(md) |
markedMachines -> machineNamesMarkedByCA
nodeNames := getNodeNames(nodes)
klog.V(0).Infof("Received request to delete nodes:- %v", nodeNames)
size, err := machinedeployment.mcmManager.GetMachineDeploymentSize(machinedeployment)
func (ngImpl *nodeGroup) DeleteNodes(nodes []*apiv1.Node) error { |
Consider multiple calls to DeleteNodes for the same node object N1. I am not able to find the place where we are making sure that this does not cause multiple scale-downs. Can you please show me where we have handled this case?
@rishabh-11 I have now separated the computation logic into computeScaledownData inside mcm_manager.go and added unit tests for duplicate and overlapping scale-downs.
@rishabh-11 unfortunately the PR is failing due to …
@@ -1075,3 +1075,35 @@ func filterExtendedResources(allResources v1.ResourceList) (extendedResources v1
	})
	return
}

// computeScaledownData computes fresh scaleDownData for the given MachineDeployment given the machineNamesForDeletion
func computeScaledownData(md *v1alpha1.MachineDeployment, machineNamesForDeletion []string) (data scaleDownData) { |
Can you rename the method to computeScaleDownData?
corrected.
alreadyMarkedSet := sets.New(getMachineNamesTriggeredForDeletion(md)...)

uniqueForDeletionSet := forDeletionSet.Difference(alreadyMarkedSet)
toBeMarkedSet := alreadyMarkedSet.Union(forDeletionSet) |
if you are using sets of strings then why do you need to get a difference and then take a union later on?
You can just take the existing and insert all new ones. Sets will anyways ensure that there are no duplicates.
I don't understand - it makes no difference, since sets.Insert(slice...) will return a new set anyways. We are using Union since the parameter is a set and not a slice.
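For context, a hedged sketch of the set bookkeeping being debated here, extracted from the snippets above; getMachineNamesTriggeredForDeletion and the parameter names follow the diff, the wrapper function itself is illustrative:

```go
// scaleDownSets shows why both Difference and Union are taken: the difference yields
// the names that newly reduce the replica count, while the union yields the complete
// set of names that must end up in the trigger-deletion annotation.
func scaleDownSets(md *v1alpha1.MachineDeployment, machineNamesForDeletion []string) (unique, toBeMarked sets.Set[string]) {
	alreadyMarkedSet := sets.New(getMachineNamesTriggeredForDeletion(md)...)
	forDeletionSet := sets.New(machineNamesForDeletion...)
	// names not already marked are the only ones that should further reduce replicas
	unique = forDeletionSet.Difference(alreadyMarkedSet)
	// the annotation keeps previously marked names and adds the new ones
	toBeMarked = alreadyMarkedSet.Union(forDeletionSet)
	return unique, toBeMarked
}
```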
toBeMarkedSet := alreadyMarkedSet.Union(forDeletionSet)

data.RevisedToBeDeletedNames = uniqueForDeletionSet
data.RevisedScaledownAmount = uniqueForDeletionSet.Len() |
What is the purpose of RevisedScaledownAmount if it's always going to be the length of RevisedToBeDeletedNames?
It was just for documentation and legibility, so callers can simply use scaleDownData.RevisedScaledownAmount. It's just a single int in the output type, added for clarity in consumption and logging.
} else if expectedReplicas < 0 {
	klog.Errorf("Cannot delete machines in MachineDeployment %q, expected decrease in replicas: %d is more than current replicas: %d", md.Name, data.RevisedScaledownAmount, md.Spec.Replicas)
} else {
mdCopy := md.DeepCopy() |
Is there a need to create a MachineDeployment copy and keep it in the scaleDownData struct? This could have been easily done by the caller, and given that there is a single caller of this method, it was not really necessary.
Yes, I believe this is needed for computational symmetry and coalescing of responsibility. This function takes an input MachineDeployment and an input parameter, and returns an output parameter scaleDownData that contains a revised MachineDeployment. The caller's job is only to update the store, not to do any construction/modification of the MachineDeployment. If there are further changes to the revised MachineDeployment, they will be fully encapsulated within computeScaleDownData.
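For reference, a hedged sketch of what such an output struct might look like; RevisedToBeDeletedNames and RevisedScaledownAmount come from the snippets above, the field name for the revised MachineDeployment copy is illustrative:

```go
// scaleDownData is the result of computeScaleDownData: the names newly marked for
// deletion, the corresponding reduction in replicas, and the already revised
// MachineDeployment copy that the caller only needs to write back to the store.
type scaleDownData struct {
	RevisedToBeDeletedNames  sets.Set[string]
	RevisedScaledownAmount   int
	RevisedMachineDeployment *v1alpha1.MachineDeployment
}
```

With this shape, the single caller would only persist RevisedMachineDeployment (when non-nil) and log RevisedScaledownAmount, keeping all construction of the revised object inside computeScaleDownData.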
CA Integration Test Run: output of specs for standard scale-up, scale-down, and scale-down disabled
CA Integration Test Run: output for NodeGroup min-max, limits, taint-causing-scaling, etc.
edd7d34 into gardener:machine-controller-manager-provider
* [WIP] fix CA marking machines for deletion
* [WIP] add mutex for machine deployment
* initialise machinedeployment map in mcmManager
* add Refresh method in nodegrp implementation
* address review comments
* address review comments - part 2
* update unit tests, misc code changes
* review comments addressed
* fixed broken test after refactor
* fixed broken test
* alternate solution using single annotation for deletion by CA
* fixed use of Go 1.23 functions
* fixed test
* added godoc for AcquireScalingMutex
* correct godoc for machinesMarkedByCAForDeletionAnnotation
* changed to machineutils.TriggerDeletionByMCM
* removed var shadowing, added TODO to godoc for util fns
* corr var names, clear mutex acquire, releasing logs
* ensured IncreaseSize/DecreaseTargetSize logged delta in mutex acquire/release log
* review comments addressed: fixed unit test, adjusted log, removed redundant fn
* all code review comments addressed
* review feedback: unexport belongs, enforce interface
* addreseed review comment, refactored added test for computeScaleDownData
* review comment: preset capacity for slices
* review feedback: ->computeScaleDownData and revised godoc
* cordon nodes fix for when node is disabled for scaledown
* fixed unit test string quoting issue

---------

Co-authored-by: Rishabh Patel <[email protected]>
Co-authored-by: Rishabh Patel <[email protected]>
Co-authored-by: elankath <[email protected]>
What this PR does / why we need it:
This PR fixes the issues noticed in live issues 6120 and 6101. We introduce a mutex in the NodeGroupImpl struct (the node group implementation) which must be acquired before performing a scale-down or scale-up operation. We also introduce an annotation on the MachineDeployment, TriggerDeletionByMCM = "node.machine.sapcloud.io/trigger-deletion-by-mcm", whose value denotes the machines CA wants to remove; the replica count and the annotation are then updated atomically as one single step. This helps recognize machines for which CA has already reduced the replicas of the MachineDeployment and prevents duplication - race conditions are avoided when CA scale-downs occur in parallel. The MCM is also updated to check this TriggerDeletionByMCM annotation on the MachineDeployment. The CA no longer directly sets the MachinePriority=1 annotation on the Machine objects; this is now done by the MCM controller when reconciling the updated MachineDeployment.
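A simplified sketch of that atomic step, assuming a typed MachineDeployment client on the manager (m.machineClient) and inlining the annotation key; names and fields are illustrative, not the exact code in this PR:

```go
// markMachinesForDeletion records the machines CA wants removed in the
// TriggerDeletionByMCM annotation and lowers the replica count in the same
// Update call, so annotation and replicas cannot diverge even when
// scale-downs run in parallel.
func (m *McmManager) markMachinesForDeletion(ctx context.Context, md *v1alpha1.MachineDeployment, toDelete []string) error {
	clone := md.DeepCopy()
	if clone.Annotations == nil {
		clone.Annotations = map[string]string{}
	}
	clone.Annotations["node.machine.sapcloud.io/trigger-deletion-by-mcm"] = strings.Join(toDelete, ",")
	clone.Spec.Replicas -= int32(len(toDelete))
	_, err := m.machineClient.MachineDeployments(clone.Namespace).Update(ctx, clone, metav1.UpdateOptions{})
	return err
}
```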
Which issue(s) this PR fixes:
Fixes #342
Special notes for your reviewer:
Release note: