OCPBUGS-7747: Do not set cpu system reserve below the default value #5046
Signed-off-by: Harshal Patil <[email protected]>
@harche: This pull request references Jira Issue OCPBUGS-7747, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/hold for testing.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: harche The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/jira refresh
@harche: This pull request references Jira Issue OCPBUGS-7747, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. In response to this:
I created a standalone script for testing:
$ cat test.sh
#!/bin/bash
set -e
VERSION_1=1
VERSION_2=2
function dynamic_cpu_sizing {
    total_cpu=$2 # passed-in for testing
    if [ -z "$total_cpu" ]; then
        total_cpu=$(getconf _NPROCESSORS_ONLN)
    fi
    if [ "$1" -eq "$VERSION_1" ]; then
        recommended_systemreserved_cpu=0
        if (($total_cpu <= 1)); then
            recommended_systemreserved_cpu=$(echo "$total_cpu * 0.06" | bc -l)
            total_cpu=0
        else
            recommended_systemreserved_cpu=0.06
            total_cpu=$((total_cpu - 1))
        fi
        if (($total_cpu <= 1)); then
            recommended_systemreserved_cpu=$(echo "$recommended_systemreserved_cpu + ($total_cpu * 0.01)" | bc -l)
            total_cpu=0
        else
            recommended_systemreserved_cpu=$(echo "$recommended_systemreserved_cpu + 0.01" | bc -l)
            total_cpu=$((total_cpu - 1))
        fi
        if (($total_cpu <= 2)); then
            recommended_systemreserved_cpu=$(echo "$recommended_systemreserved_cpu + ($total_cpu * 0.005)" | bc -l)
            total_cpu=0
        else
            recommended_systemreserved_cpu=$(echo "$recommended_systemreserved_cpu + 0.01" | bc -l)
            total_cpu=$((total_cpu - 2))
        fi
        if (($total_cpu >= 0)); then
            recommended_systemreserved_cpu=$(echo "$recommended_systemreserved_cpu + ($total_cpu * 0.0025)" | bc -l)
        fi
    else
        base_allocation_fraction=0.06
        increment_per_cpu_fraction=0.012
        if ((total_cpu > 1)); then
            recommended_systemreserved_cpu=$(awk -v base="$base_allocation_fraction" -v increment="$increment_per_cpu_fraction" -v cpus="$total_cpu" 'BEGIN {printf "%.3f\n", base + increment * (cpus - 1)}')
        else
            recommended_systemreserved_cpu=$base_allocation_fraction
        fi
    fi
    # Enforce minimum threshold of 0.5 CPU
    recommended_systemreserved_cpu=$(awk -v val="$recommended_systemreserved_cpu" 'BEGIN {if (val < 0.5) print 0.5; else printf "%.3f\n", val}')
    echo "SYSTEM_RESERVED_CPU=${recommended_systemreserved_cpu}"
}
function run_cpu_tests {
    for version in $VERSION_1 $VERSION_2; do
        echo "=== CPU Sizing Tests (Version $version) ==="
        for cpu in 1 2 4 8 16 32 64 128 256 512; do
            echo -n "CPU Count $cpu, "
            dynamic_cpu_sizing $version $cpu
        done
        echo
    done
}
if [[ "$1" == "true" ]]; then
    run_cpu_tests
else
    echo "Usage: $0 true"
    exit 1
fi
Looking at the output, the changes seem to be working as expected:
$ ./test.sh true
=== CPU Sizing Tests (Version 1) ===
CPU Count 1, SYSTEM_RESERVED_CPU=0.5
CPU Count 2, SYSTEM_RESERVED_CPU=0.5
CPU Count 4, SYSTEM_RESERVED_CPU=0.5
CPU Count 8, SYSTEM_RESERVED_CPU=0.5
CPU Count 16, SYSTEM_RESERVED_CPU=0.5
CPU Count 32, SYSTEM_RESERVED_CPU=0.5
CPU Count 64, SYSTEM_RESERVED_CPU=0.5
CPU Count 128, SYSTEM_RESERVED_CPU=0.5
CPU Count 256, SYSTEM_RESERVED_CPU=0.710
CPU Count 512, SYSTEM_RESERVED_CPU=1.350
=== CPU Sizing Tests (Version 2) ===
CPU Count 1, SYSTEM_RESERVED_CPU=0.5
CPU Count 2, SYSTEM_RESERVED_CPU=0.5
CPU Count 4, SYSTEM_RESERVED_CPU=0.5
CPU Count 8, SYSTEM_RESERVED_CPU=0.5
CPU Count 16, SYSTEM_RESERVED_CPU=0.5
CPU Count 32, SYSTEM_RESERVED_CPU=0.5
CPU Count 64, SYSTEM_RESERVED_CPU=0.816
CPU Count 128, SYSTEM_RESERVED_CPU=1.584
CPU Count 256, SYSTEM_RESERVED_CPU=3.120
CPU Count 512, SYSTEM_RESERVED_CPU=6.192
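To make the transition points in the output above easier to see, here is a rough sketch that finds the smallest CPU count at which each formula from test.sh reaches the 0.5 floor without clamping. It is not part of the PR. The helper names (v1_reserved, v2_reserved, first_cpu_at_floor) are hypothetical, values are held in units of 1/10000 CPU so that bash integer arithmetic stays exact, and the version 1 expression assumes at least 4 CPUs, where the tiered branches collapse to 0.08 + 0.0025 * (cpus - 4).

```shell
#!/bin/bash
# Sketch only: locate the CPU count where the unclamped recommendation
# first reaches the 0.5 CPU floor. Units are 1/10000 CPU (0.5 CPU = 5000).
FLOOR=5000

# Version 1 formula from test.sh, valid for cpus >= 4 (assumption stated above):
# 0.06 + 0.01 + 0.01 + 0.0025 * (cpus - 4)
function v1_reserved {
    echo $((800 + 25 * ($1 - 4)))
}

# Version 2 formula from test.sh: 0.06 + 0.012 * (cpus - 1)
function v2_reserved {
    echo $((600 + 120 * ($1 - 1)))
}

# Hypothetical helper: walk upwards from 4 CPUs until the floor is reached.
function first_cpu_at_floor {
    local fn=$1 cpus=4
    while (( $($fn "$cpus") < FLOOR )); do
        cpus=$((cpus + 1))
    done
    echo "$cpus"
}

echo "version 1 reaches 0.5 at $(first_cpu_at_floor v1_reserved) CPUs"
echo "version 2 reaches 0.5 at $(first_cpu_at_floor v2_reserved) CPUs"
```

Under these assumptions the floor applies to nodes with fewer than 172 CPUs for version 1 and fewer than 38 CPUs for version 2, which is consistent with the test output: version 1 first rises above 0.5 at 256 CPUs and version 2 at 64 CPUs among the sampled counts.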
@harche: This pull request references Jira Issue OCPBUGS-7747, which is valid. 3 validation(s) were run on this bug
/test unit
@harche: The following tests failed, say
- What I did
If the number of CPUs is not large enough, the system reserved CPU value recommended by auto node sizing can be lower than the default value of 500m. This PR discards the value calculated by the auto node sizing script if it is lower than 0.5 and uses 0.5 instead.
- How to verify it
Enable auto node sizing on worker nodes that do not have a large number of CPUs. The value set for system reserved CPU should be at least 0.5.
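The check described above could be scripted roughly as follows. This is a sketch, not something the PR ships: check_reserved_cpu is a hypothetical helper, and since the PR does not say where the computed value ends up on a node, the function takes the path of any file containing a SYSTEM_RESERVED_CPU=... line in the format printed by the sizing script.

```shell
#!/bin/bash
# Hypothetical helper, for illustration only.
# Succeeds when the file contains SYSTEM_RESERVED_CPU with a value >= 0.5.
function check_reserved_cpu {
    local file=$1 value
    # Take the last SYSTEM_RESERVED_CPU assignment found in the file.
    value=$(sed -n 's/^SYSTEM_RESERVED_CPU=//p' "$file" | tail -n 1)
    if [ -z "$value" ]; then
        echo "SYSTEM_RESERVED_CPU not found in $file" >&2
        return 1
    fi
    # awk exits 0 when the value meets the 0.5 floor, 1 otherwise.
    awk -v val="$value" 'BEGIN { exit !(val + 0 >= 0.5) }'
}

# Example run against a temporary file standing in for the node's output.
tmp=$(mktemp)
echo "SYSTEM_RESERVED_CPU=0.816" > "$tmp"
if check_reserved_cpu "$tmp"; then
    echo "ok: at least 0.5"
else
    echo "below the 0.5 default"
fi
rm -f "$tmp"
```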
- Description for the changelog
System reserved CPU value defaults to 500m if the number of CPUs is not large enough.