Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slo-controller: fix mid resource calculate formula #2291

Conversation

lijunxin559
Copy link
Contributor

Ⅰ. Describe what this PR does

When calculating Allocatable[mid] resources, due to possible oversold, ProdReclaimableMetric will be greater than NodeAllocatable * thresholdRatio, so the calculated Allocatable[mid] value accidentally includes the oversold part.
In particular, when thresholdRatio = 1, Allocatable[mid] = NodeAllocatable, which is not the original intention of the koordinator resource model design.

I suggest taking the current free resources of this node into account when calculating Allocatable[mid], that is, Allocatable[Mid]' := min(Reclaimable[Mid], NodeAllocatable * thresholdRatio, FreeResourceOnNode) + Unallocated[Mid] * midUnallocatedRatio

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

V. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

Copy link

codecov bot commented Dec 9, 2024

Codecov Report

Attention: Patch coverage is 46.51163% with 23 lines in your changes missing coverage. Please review.

Project coverage is 66.03%. Comparing base (7c94119) to head (ef6b1f9).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...g/slo-controller/noderesource/plugins/util/util.go 0.00% 23 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2291      +/-   ##
==========================================
- Coverage   66.04%   66.03%   -0.02%     
==========================================
  Files         457      457              
  Lines       53737    53785      +48     
==========================================
+ Hits        35490    35516      +26     
- Misses      15697    15719      +22     
  Partials     2550     2550              
Flag Coverage Δ
unittests 66.03% <46.51%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lijunxin559 lijunxin559 changed the title Fix mid resource calculate formula slo-controller: fix mid resource calculate formula Dec 9, 2024
@lijunxin559 lijunxin559 force-pushed the fix-midResource-calculate-formula branch 3 times, most recently from c310647 to 51cae0a Compare December 10, 2024 01:50
@saintube
Copy link
Member

saintube commented Dec 10, 2024

@lijunxin559 Thanks for your contribution.

Allocatable[Mid]' := min(Reclaimable[Mid], NodeAllocatable * thresholdRatio, FreeResourceOnNode) + Unallocated[Mid] * midUnallocatedRatio

I suppose the FreeResourceOnNode equals Unallocated[Mid], so what does the formula mean? If the node has no unallocated Prod resources, the allocatable Mid in the new formula will be zero, which obeys the expectation of the Prod overcommitment.
To avoid the Prod allocatable being overcommitted twice (i.e. overcommit the unused overcommitted Prod requests) in the node resource controller, we need to minimize it with the ProdAllocated and Node capacity. And we'd better revise the new peak predictor in the koordlet, e.g. the PriorityPeakPredictor should minimize the node capacity as the limit of the peak. WDYT?

@lijunxin559
Copy link
Contributor Author

lijunxin559 commented Dec 10, 2024

@lijunxin559 Thanks for your contribution.

Allocatable[Mid]' := min(Reclaimable[Mid], NodeAllocatable * thresholdRatio, FreeResourceOnNode) + Unallocated[Mid] * midUnallocatedRatio

I suppose the FreeResourceOnNode equals Unallocated[Mid], so what does the formula mean? If the node has no unallocated Prod resources, the allocatable Mid in the new formula will be zero, which obeys the expectation of the Prod overcommitment. To avoid the Prod allocatable being overcommitted twice (i.e. overcommit the unused overcommitted Prod requests) in the node resource controller, we need to minimize it with the ProdAllocated and Node capacity. And we'd better revise the new peak predictor in the koordlet, e.g. the PriorityPeakPredictor should minimize the node capacity as the limit of the peak. WDYT?

Morning, thanks for your reply ~ (^^)

Allocatable[Mid]' := min(Reclaimable[Mid], NodeAllocatable * thresholdRatio) + Unallocated[Mid] * midUnallocatedRatio is the original code logic. I think Unallocated[Mid] is intended to express the part of resources that are not allocated to prod. I just added a limit on the actual idleness of a node, which does not conflict with the original formula.

Yes, the predictor improvement you mentioned can also avoid secondary overselling to a certain extent, but I think the logic of dividing the predictor and the calculation of mid resources into two parts may be clearer. The predictor can focus on modeling and predicting the idle resources of the pod and ensure the reliability of the prediction, regardless of whether this part of the allocated resources actually exists or not. And the secondary utilization of this part of the idle resources is separated separately (mid and batch were also done before), so that the logic is decoupled and the subsequent use of the predictor function is more scalable. Right?

@saintube
Copy link
Member

Morning, thanks for your reply ~ (^^)

Allocatable[Mid]' := min(Reclaimable[Mid], NodeAllocatable * thresholdRatio) + Unallocated[Mid] * midUnallocatedRatio is the original code logic. I think Unallocated[Mid] is intended to express the part of resources that are not allocated to prod. I just added a limit on the actual idleness of a node, which does not conflict with the original formula.

Yes, the predictor improvement you mentioned can also avoid secondary overselling to a certain extent, but I think the logic of dividing the predictor and the calculation of mid resources into two parts may be clearer. The predictor can focus on modeling and predicting the idle resources of the pod and ensure the reliability of the prediction, regardless of whether this part of the allocated resources actually exists or not. And the secondary utilization of this part of the idle resources is separated separately (mid and batch were also done before), so that the logic is decoupled and the subsequent use of the predictor function is more scalable. Right?

@lijunxin559 OK. So the FreeResourceOnNode means the NodeCapacity - NodeUsage. It seems reasonable.

@lijunxin559 lijunxin559 force-pushed the fix-midResource-calculate-formula branch 3 times, most recently from 6746896 to 65514b4 Compare December 10, 2024 09:03
@lijunxin559 lijunxin559 force-pushed the fix-midResource-calculate-formula branch 3 times, most recently from 1dd0e44 to 2924f9f Compare December 10, 2024 12:06
@lijunxin559 lijunxin559 force-pushed the fix-midResource-calculate-formula branch from 2924f9f to ef6b1f9 Compare December 10, 2024 12:32
Copy link
Member

@saintube saintube left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@saintube saintube added the lgtm label Dec 11, 2024
@saintube
Copy link
Member

PTAL /cc @zwzhang0107 @hormes

@koordinator-bot koordinator-bot bot merged commit 9bb39a2 into koordinator-sh:main Dec 19, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants