Discover Instance Type Capacity Memory Overhead Instead of vmMemoryOverheadPercent
#716
Comments
Linking one of the analyses that were performed: aws/karpenter-provider-aws#3568 (comment)

@jonathan-innis do you have a bit more information on why EC2 reports one value and the kubelet reports another? Would I be right in assuming, based on the current description of this, that this is a Karpenter-specific issue and not directly related to vmMemoryOverheadPercent?
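For context, a rough sketch of how the percentage-based estimate works: before a node exists, Karpenter only knows the memory EC2 advertises for the instance type and shaves off an assumed overhead fraction. This is an illustration in Python rather than Karpenter's actual implementation; estimated_capacity_mib is a hypothetical helper name, the instance figure is made up, and 7.5% is used only because it is the default Karpenter documents for vmMemoryOverheadPercent.

# Rough sketch (not Karpenter's actual code): estimate the capacity the kubelet will
# eventually report by subtracting an assumed overhead fraction from the memory EC2
# advertises for the instance type.
def estimated_capacity_mib(ec2_advertised_mib: int, vm_memory_overhead_percent: float = 0.075) -> int:
    # 0.075 mirrors Karpenter's documented default; the real overhead varies per instance type.
    return int(ec2_advertised_mib * (1.0 - vm_memory_overhead_percent))

print(estimated_capacity_mib(8 * 1024))  # 7577 -- estimated node capacity for an 8 GiB instance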
The following kubelet configuration works well for us. We are shoving most of the reserved memory into kubeReserved.

def calc_mem_reservation_mib(total_memory_mib: int) -> int:
    # Calculation used by GKE as defined in
    # https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-architecture#memory_cpu
    # Per-tier contributions for a 512G instance:
    #   4G     8G    16G    128G    512G     512G total
    #   1024 + 819 + 819  + 6881  + 7864   = 17407
    # This seems safer than the AWS rule of 255MiB + 11MiB * MAX_PODS_PER_INSTANCE,
    # which empirically doesn't reserve enough for some instance types.
    if total_memory_mib <= 4096:
        return int(total_memory_mib * 0.25)
    elif total_memory_mib <= 8192:
        return calc_mem_reservation_mib(4096) + int((total_memory_mib - 4096) * 0.2)
    elif total_memory_mib <= 16384:
        return calc_mem_reservation_mib(8192) + int((total_memory_mib - 8192) * 0.1)
    elif total_memory_mib <= 131072:
        return calc_mem_reservation_mib(16384) + int((total_memory_mib - 16384) * 0.06)
    else:
        return calc_mem_reservation_mib(131072) + int(
            (total_memory_mib - 131072) * 0.02
        )

An example kubelet configuration based on the above calculation, which we specify in our provisioner for 512Gi instances:

kubeletConfiguration:
  clusterDNS:
    - 10.200.32.10
  containerRuntime: containerd
  evictionHard:
    memory.available: 100Mi
    nodefs.available: 10%
    nodefs.inodesFree: 10%
  evictionSoft:
    memory.available: 200Mi
  evictionSoftGracePeriod:
    memory.available: 1m0s
  kubeReserved:
    cpu: 230m
    memory: 17407Mi
  maxPods: 685
  systemReserved:
    memory: 100Mi

Some example capacities and allocatable fractions:
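As a quick illustrative check of the calculation (the instance sizes below are arbitrary examples, and the figures are computed from the function above rather than taken from real nodes), the 512 GiB case lines up with the kubeReserved value in the config:

# Illustrative check of calc_mem_reservation_mib (assumes the function above is in scope).
# Instance sizes are arbitrary examples, not measurements from real nodes.
for gib in (4, 8, 16, 64, 128, 256, 512):
    total_mib = gib * 1024
    reserved = calc_mem_reservation_mib(total_mib)
    print(f"{gib:>3} GiB total -> reserve {reserved:>5} MiB ({100 * reserved / total_mib:.1f}%)")
# The 512 GiB case yields 17407 MiB, matching kubeReserved.memory in the config above.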
We've been running with a
This seems problematic, since it would require launching a node before knowing if it would fit the desired pod. This new node might not fit the desired pod, but may fit some other pod that might otherwise have gone somewhere else. The default K8S scheduler configuration spreads pods onto the least full nodes, which would likely hit this behavior.
This would be great! Alternatively, at least making the value a part of the
Yes, that's correct. This has to do with the
This one was erroneously transferred. Because this repo now exists in a new organization, I'm going to re-create and link this original issue over in aws/karpenter-provider-aws.
Closing this one in favor of aws/karpenter-provider-aws#5161. We can move the majority of the conversation over there if there's anything to add.
Tell us about your request
We could consider a few options to discover the expected capacity overhead for a given instance type:
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Calculating the difference between the EC2-reported memory capacity and the actual capacity of the instance as reported by kubelet.
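A hedged sketch of that measurement, assuming boto3 and the Kubernetes Python client are installed and that AWS credentials and a kubeconfig are available; the instance type and node name are placeholder examples:

# Compare EC2's advertised memory for an instance type with the capacity the kubelet
# actually reports on a running node of that type.
import boto3
from kubernetes import client, config

def ec2_reported_mib(instance_type: str) -> int:
    ec2 = boto3.client("ec2")
    resp = ec2.describe_instance_types(InstanceTypes=[instance_type])
    return resp["InstanceTypes"][0]["MemoryInfo"]["SizeInMiB"]

def kubelet_reported_mib(node_name: str) -> float:
    config.load_kube_config()
    node = client.CoreV1Api().read_node(node_name)
    capacity = node.status.capacity["memory"]  # e.g. "32386544Ki"
    return int(capacity.rstrip("Ki")) / 1024   # assumes the kubelet reports the value in Ki

if __name__ == "__main__":
    advertised = ec2_reported_mib("m5.2xlarge")                 # example instance type
    actual = kubelet_reported_mib("ip-10-0-1-23.ec2.internal")  # example node name
    overhead = advertised - actual
    print(f"advertised={advertised}Mi actual={actual:.0f}Mi "
          f"overhead={overhead:.0f}Mi ({100 * overhead / advertised:.2f}%)")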
Are you currently working around this issue?
Using a heuristic vmMemoryOverheadPercent value right now that is tunable by users and passed through karpenter-global-settings.
Additional Context
No response
Attachments
No response