Ability to set node affinity and/or tolerations on RKE-deployed addon manifests #1529
Comments
Big +1 to get this added ASAP. Thanks! |
+1 for us as well. Our master nodes act as etcd + controlplane nodes, but we also run critical workloads like CoreDNS and ingress on those 2 nodes, which makes load balancing and troubleshooting easier. Given this limitation, we spin up the cluster using RKE, then modify the DaemonSet config to add tolerations for etcd + controlplane; with the node selector set, the pods land on the master nodes. The downside: we never touch RKE after the deployment, so as not to overwrite the DaemonSet configs. Hope this makes it in soon. |
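A minimal sketch of the kind of manual edit described above, as a pod template fragment of the add-on DaemonSet. The taint keys and the node label are assumptions about what RKE applies to controlplane and etcd nodes; check them against your own nodes (for example with `kubectl describe node`) before relying on this.

```yaml
# Fragment of spec.template.spec in the add-on DaemonSet (edited by hand).
# Assumption: controlplane nodes carry the taint
#   node-role.kubernetes.io/controlplane=true:NoSchedule
# and etcd nodes carry
#   node-role.kubernetes.io/etcd=true:NoExecute
tolerations:
- key: node-role.kubernetes.io/controlplane
  operator: Exists
  effect: NoSchedule
- key: node-role.kubernetes.io/etcd
  operator: Exists
  effect: NoExecute
# Assumption: controlplane nodes carry this label with the value "true".
nodeSelector:
  node-role.kubernetes.io/controlplane: "true"
```

As the comment notes, a later RKE run may overwrite hand edits like this, which is why the setting needs to live in RKE itself.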
With the ability to add taints to nodes added in RKE v0.3.0 (#157), all system add-ons will now be set with a wildcard toleration for the worker nodes. The taints will be used to request that these add-ons are not scheduled onto these nodes. By adding node selectors (rancher/rancher#22447) for the remaining RKE add-ons, you can use node selectors to determine which nodes these system add-ons should be scheduled to. With the combination of taints and node tolerations, you will be able to plan out which nodes your system add-ons should be deployed to. I am closing this issue in favor of the other two issues listed above. |
@deniseschannon what issue did you keep open for this? |
I don't believe that rancher/rancher#22447 or #157 solve this issue, so I am reopening it. |
Updating to latest discovery, e.g. CoreDNS Deployment:

    tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
    - effect: NoExecute
      operator: Exists
    - effect: NoSchedule
      operator: Exists
    ...
    nodeSelector:
      beta.kubernetes.io/os: linux
      dns_nodes: coredns

The problem is in the affinity declaration:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/worker
                operator: Exists

Setting this to |
I wonder why these tolerations are there:

    tolerations:
    - effect: NoExecute
      operator: Exists
    - effect: NoSchedule
      operator: Exists

If I understand it correctly, it means that CoreDNS may be scheduled on cordoned nodes, which may become unavailable, e.g. due to an OS upgrade or other system maintenance. Or am I missing something? |
This can be configured as of v1.2.4, see https://rancher.com/docs/rke/latest/en/config-options/add-ons/#tolerations and the add-on specific pages in the docs, like https://rancher.com/docs/rke/latest/en/config-options/add-ons/dns/#coredns-tolerations |
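A minimal sketch of what such a `cluster.yml` fragment might look like, assuming RKE v1.2.4 or later and the `dns` add-on section described in the linked docs. The `dedicated=dns` taint is hypothetical and only illustrates the shape of a toleration entry; the `dns_nodes: coredns` label is borrowed from the node selector shown earlier in this thread.

```yaml
# cluster.yml fragment (sketch, not a complete cluster configuration)
dns:
  provider: coredns
  # Hypothetical taint: assumes the target nodes were tainted with
  # dedicated=dns:NoSchedule by the cluster operator.
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "dns"
    effect: "NoSchedule"
  # Assumes the target nodes carry the label dns_nodes=coredns.
  node_selector:
    dns_nodes: coredns
```

Tolerations and node selectors set this way do not change the node affinity on the generated manifests.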
That's actually great, thank you for pointing me in the right direction. |
With Terraform provider rancher/rancher2 1.17 it is now possible to set tolerations and node selector. |
Any updates on this topic please? |
I would really like to be able to schedule some add-ons, like Nginx-Ingress, on controlplane nodes. The nodeAffinity, which only allows the worker role, is blocking this. And as stated before, the nodeSelector feature only works on workers and is therefore no solution. So why not stop wasting resources and put this topic on the roadmap ASAP? |
The ability to override the affinity would be really useful for us: we scale down worker nodes at night, and allowing DNS to run on a control plane node speeds up the cluster startup in the morning, as CoreDNS is critical to startup. It seems silly to deploy a cluster and then edit the CoreDNS Deployment manually afterwards to remove this affinity. Of course, then make it available in the Terraform provider as well. |
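For illustration, a sketch of the kind of relaxed affinity being asked for, written as a hand edit to the CoreDNS Deployment rather than as an RKE option (no such option is confirmed in this thread). Entries under `nodeSelectorTerms` are ORed by Kubernetes, so the pod may land on either role. The controlplane label key is an assumption about what RKE applies to those nodes.

```yaml
# Fragment of spec.template.spec in the CoreDNS Deployment (edited by hand).
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # Term 1: the worker-only rule from the generated manifest.
      - matchExpressions:
        - key: node-role.kubernetes.io/worker
          operator: Exists
      # Term 2 (assumed label key): also accept controlplane nodes.
      - matchExpressions:
        - key: node-role.kubernetes.io/controlplane
          operator: Exists
```

The pod still needs a toleration for whatever taint the controlplane nodes carry, and a later RKE run may revert the edit.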
Any updates on this please? |
RKE does not currently allow you to set tolerations or affinity rules on the deployment/daemonset manifests that it deploys, for components like CoreDNS, kube-dns, nginx-ingress, etc.
This is a critical feature required in order to allow for architecting robust highly available clusters when planning for minimizing noisy neighbor problems.
Related issues to this are:
#1066
#1365