sched: controller: set scheduler priority #979

shajmakh · 2024-08-12T14:38:58Z

So far the scheduler priority has been left at the default, which is 0. This is risky, especially when pod preemption is needed to fit more important pods.

The NRS is important enough to deserve the most critical priority class, system-node-critical, which is the same priority class that kube-scheduler uses.

addresses #974

shajmakh · 2024-08-12T14:41:29Z

/hold

ffromani · 2024-08-12T15:02:52Z

controllers/numaresourcesscheduler_controller.go

@@ -55,7 +55,8 @@ import (
 )

 const (
-	leaderElectionResourceName = "numa-scheduler-leader"
+	leaderElectionResourceName      = "numa-scheduler-leader"
+	SystemNodeCriticalPriorityClass = "system-node-critical"


Please make it private and rename it to convey the intent, not the value. Maybe schedulerPriorityClassName.

In addition, please record in the commit message why system-node-critical and not system-cluster-critical (I'm fine with the decision, but let's record the reason).

updated, thanks for raising
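For readers following the review thread, here is a minimal sketch of what the suggested rename could look like. It is not the code from this PR: `podSpec` and `setSchedulerPriority` are hypothetical stand-ins, and the only thing assumed about the real API is that `corev1.PodSpec` carries a `PriorityClassName` string field.

```go
package main

import "fmt"

// schedulerPriorityClassName names the intent (the class the scheduler pods
// run with) rather than the value. It is unexported, as requested in review.
const schedulerPriorityClassName = "system-node-critical"

// podSpec is a hypothetical stand-in for corev1.PodSpec, reduced to the one
// field this sketch needs.
type podSpec struct {
	PriorityClassName string
}

// setSchedulerPriority is a hypothetical helper: it stamps the priority class
// on the pod spec unconditionally, regardless of the replica count.
func setSchedulerPriority(spec *podSpec) {
	spec.PriorityClassName = schedulerPriorityClassName
}

func main() {
	spec := podSpec{}
	setSchedulerPriority(&spec)
	fmt.Println(spec.PriorityClassName)
}
```

Keeping the constant unexported means the value can change later without touching any package outside the controller.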

So far the scheduler priority has been left at the default, which is 0. This is risky, especially when pod preemption is needed to fit more important pods. The NRS is important enough to deserve the most critical priority class, system-node-critical, which is the same priority class that kube-scheduler uses. We need this priority to be set always, regardless of how many replicas are configured for the scheduler, and especially if we look to optimize the HA of the scheduler.

We choose system-node-critical over system-cluster-critical because we don't want to allow SS preemption by higher-priority pods. If it were set to system-cluster-critical and an event requiring pod eviction were triggered in order to schedule system-node-critical workloads, the SS would be at risk of being evicted. Although this would be very rare and the evicted pod would be rescheduled, there is no convincing reason not to make it node-critical.

addresses openshift-kni#974

Signed-off-by: Shereen Haj <[email protected]>
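The reasoning in the commit message can be illustrated with a small sketch. The two numeric values are the ones Kubernetes assigns to its built-in priority classes. `canPreempt` is a hypothetical, deliberately simplified model that compares priorities only and ignores everything else the real scheduler considers.

```go
package main

import "fmt"

// Values of the two built-in Kubernetes priority classes.
const (
	systemClusterCritical int32 = 2000000000
	systemNodeCritical    int32 = 2000001000
)

// canPreempt is a simplified, hypothetical model of preemption: a pending pod
// may only evict a running pod whose priority is strictly lower than its own.
func canPreempt(pending, running int32) bool {
	return pending > running
}

func main() {
	// A scheduler pod running as system-cluster-critical could be evicted to
	// make room for a system-node-critical workload.
	fmt.Println(canPreempt(systemNodeCritical, systemClusterCritical))

	// Running as system-node-critical, it is no longer a candidate victim
	// for pods of the same class.
	fmt.Println(canPreempt(systemNodeCritical, systemNodeCritical))
}
```

Under this model the first comparison holds and the second does not, which is the rare eviction case the commit message describes.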

shajmakh · 2024-08-12T15:22:39Z

/unhold

ffromani

/approve
/lgtm

thanks!

openshift-ci · 2024-08-12T16:03:03Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, shajmakh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ffromani,shajmakh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shajmakh · 2024-08-13T05:58:01Z

/cherry-pick release-4.16

openshift-cherrypick-robot · 2024-08-13T05:58:44Z

@shajmakh: #979 failed to apply on top of branch "release-4.16":

Applying: sched: controller: set scheduler priority
Using index info to reconstruct a base tree...
M	controllers/numaresourcesscheduler_controller.go
Falling back to patching base and 3-way merge...
Auto-merging controllers/numaresourcesscheduler_controller.go
CONFLICT (content): Merge conflict in controllers/numaresourcesscheduler_controller.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 sched: controller: set scheduler priority
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.16

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci bot requested review from ffromani and swatisehgal August 12, 2024 14:39

openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Aug 12, 2024

openshift-ci bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Aug 12, 2024

ffromani reviewed Aug 12, 2024

shajmakh force-pushed the set-sched-prio branch from 506b82b to 7844680 August 12, 2024 15:19

openshift-ci bot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Aug 12, 2024

ffromani reviewed Aug 12, 2024

openshift-ci bot assigned ffromani Aug 12, 2024

openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Aug 12, 2024

openshift-merge-bot bot merged commit 08a0929 into openshift-kni:main Aug 12, 2024
13 checks passed

shajmakh added a commit to shajmakh/numaresources-operator that referenced this pull request Aug 13, 2024

sched: controller: set scheduler priority

b2d6ac9

Manual backport of openshift-kni#979

Signed-off-by: Shereen Haj <[email protected]>

shajmakh mentioned this pull request Aug 13, 2024

sched: controller: set scheduler priority #980

Merged

ffromani added the cherry-pick-candidate (Possible cherry-pick in the future) and cherry-pick-approved (Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.) labels Aug 13, 2024

shajmakh mentioned this pull request Aug 13, 2024

scheduler: set priorityClass #974

Open
