add: taints for machinePools #187
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Thanks for this PR, please make sure we patch taints and change tolerations in the reuse scenario.
Thanks @asm582
Hey, I've spent some time looking into it. Great idea, I'll open a new issue and get reuse added for each machine type :) |
Great work, left two comments/requested changes.
machinePoolTaint := cmv1.NewTaint().
	Key(aw.Name).
	Value("value1").
	Effect("PreferNoSchedule")
Do we wish to use PreferNoSchedule for the taint effect, as opposed to NoSchedule? The first will still allow other pods without the toleration to be assigned to that node. I think NoSchedule would best suit our use case.
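For context, the difference between the two effects is visible in the node spec. A minimal sketch of the resulting taint, assuming an illustrative AppWrapper name (`my-appwrapper`):

```yaml
# Node taint as created by this PR. With PreferNoSchedule the scheduler
# only *tries* to avoid placing non-tolerating pods here; with NoSchedule
# non-tolerating pods are hard-blocked from the node.
spec:
  taints:
  - key: my-appwrapper        # aw.Name
    value: value1
    effect: PreferNoSchedule  # change to NoSchedule for a hard guarantee
```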
@@ -74,8 +74,13 @@ func (r *AppWrapperReconciler) scaleMachinePool(ctx context.Context, aw *arbv1.A
	m[aw.Name] = aw.Name
	klog.Infof("The instanceRequired array: %v", userRequestedInstanceType)

	machinePoolTaint := cmv1.NewTaint().
		Key(aw.Name).
In the unlikely but possible scenario where the key (aw.Name) in one namespace is the same as a name in another namespace, what would occur? Should we have the key include the namespace too?
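One way to avoid that collision is to qualify the taint key with the AppWrapper's namespace. A minimal sketch, assuming a hypothetical helper `taintKey` that is not part of this PR:

```go
package main

import "fmt"

// taintKey is a hypothetical helper (not in the PR) that qualifies the
// taint key with the AppWrapper's namespace, so two AppWrappers with the
// same name in different namespaces produce distinct taints.
func taintKey(namespace, name string) string {
	return fmt.Sprintf("%s.%s", namespace, name)
}

func main() {
	// Same AppWrapper name, different namespaces -> distinct taint keys.
	fmt.Println(taintKey("team-a", "my-aw"))
	fmt.Println(taintKey("team-b", "my-aw"))
}
```

The resulting key would then be passed to `Key(...)` instead of `aw.Name` alone.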
Issue link
What changes have been made
Taints are being implemented to manage the dispatch of appwrappers. When an appwrapper is deployed, it's crucial that its pods are directed to the appropriate scaled-up instances. This mechanism is particularly vital in team-based clusters that scale up multiple instances for various appwrappers. The current workflow potentially allows appwrappers to distribute their workload across scaled-up instances not assigned to them, leading to resource mix-ups.
By applying taints, we ensure that pods are assigned to the correct nodes, preventing such issues.
The same will be done for nodepools separately.
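For the appwrapper's own pods to land on the tainted nodes, they would need a matching toleration. A sketch, with the key and value mirroring the taint applied by this PR (the key shown is an illustrative AppWrapper name):

```yaml
tolerations:
- key: my-appwrapper       # must match the taint key (aw.Name)
  operator: Equal
  value: value1
  effect: PreferNoSchedule
```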
Verification steps
1. Deploy an appwrapper with InstaScale enabled and specified instances.
2. View your scaled-up machine pool in the dashboard, and check that a taint has been applied with <aw-name> = value1:PreferNoSchedule.
3. Once the nodes are scaled, you can again check within "compute" and "nodes". Feel free to filter with the label aw-name=aw-name. When viewing each node's "Details" you will see the same taint.
4. Ensure the appwrapper's pods are deploying successfully to these nodes.
5. To ensure the opposite, feel free to scale up resources with InstaScale, deploy a separate appwrapper, and ensure this appwrapper's pods are not deployed to the scaled-up resources.
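The node-side checks above can also be done from the CLI, assuming cluster access via oc; the label and taint key follow the convention described in this PR, and <aw-name>/<namespace> are placeholders:

```shell
# List scaled-up nodes for this appwrapper and show their taints
oc get nodes -l aw-name=<aw-name> \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'

# Confirm on which nodes the appwrapper's pods landed
oc get pods -n <namespace> -o wide
```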
Checks