Withdraw site resource when VM creation fails #300
Open · kimeunju108 wants to merge 26 commits into CentaurusInfra:master from kimeunju108:pr-289
Changes from all 26 commits:
All commits by kimeunju108:

74f1776 changed klog.Infof to klog.V(4).Infof
207f229 updated code - removed else and added continue
86cde93 Merge branch 'master' of https://github.com/CentaurusInfra/global-res…
c75e438 implemented withdraw reserved resource for pod
0c14cbe implemented resource revokation when vm creation failed
bba5311 implemented resource revokation when vm creation failed
7773eeb implemented resource revokation when vm creation failed
3b5a1c3 applied review
f1e1542 updated according to review
fe2ba14 updated resource data structure
ec4735b applied review
4a49283 pulled from pr 289
80033e6 applied reviews
b6e04d9 performed CICD test
37f0e04 add unit testcase
d05b4f9 add unit testcase
414d752 added unit testcase
0416df5 added unit testcase
504fe0b updated scheduler according to review
baa3a96 updated scheduler according to review
3d3664a updated eventhandlers_test.go
ed4c2b1 updated eventhandlers_test.go
0dfb890 updated eventhandlers_test.go
ea63fd4 updated eventhandlers_test.go
1f6026d updated eventhandlers_test.go
c174688 updated selector
The map is not thread-safe. When multiple events for one pod are triggered, how do we prevent synchronization issues here?
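A minimal Go sketch of one way to address this concern, guarding the map with a sync.Mutex so concurrent event handlers cannot race. The type and field names (PodSiteResourceMap, SiteResource, VCPUs, MemoryMB) are illustrative stand-ins, not the PR's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// SiteResource is an illustrative stand-in for the per-pod
// reserved site resource discussed in this PR.
type SiteResource struct {
	VCPUs    int64
	MemoryMB int64
}

// PodSiteResourceMap wraps the pod -> resource map with a mutex,
// since plain Go maps are not safe for concurrent writes.
type PodSiteResourceMap struct {
	mu sync.Mutex
	m  map[string]SiteResource
}

func NewPodSiteResourceMap() *PodSiteResourceMap {
	return &PodSiteResourceMap{m: make(map[string]SiteResource)}
}

// Set and Get take the lock, so handlers for multiple events of
// one pod cannot corrupt the map concurrently.
func (p *PodSiteResourceMap) Set(pod string, r SiteResource) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.m[pod] = r
}

func (p *PodSiteResourceMap) Get(pod string) (SiteResource, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	r, ok := p.m[pod]
	return r, ok
}

func main() {
	prm := NewPodSiteResourceMap()
	var wg sync.WaitGroup
	// Simulate multiple events for the same pod firing concurrently;
	// without the mutex this would be a data race on the map.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(n int64) {
			defer wg.Done()
			prm.Set("pod-a", SiteResource{VCPUs: n, MemoryMB: n * 1024})
		}(int64(i))
	}
	wg.Wait()
	if _, ok := prm.Get("pod-a"); ok {
		fmt.Println("ok")
	}
}
```

An alternative under the same assumptions would be sync.RWMutex (if reads dominate) or sync.Map (for mostly-disjoint keys); a plain Mutex is the simplest correct choice here.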
My concern is that a scheduled job syncs resources (CPU & memory) from OpenStack. If the scheduled job has already synchronized from OpenStack before the informers receive the failed pods from the apiserver, the resource has already been claimed back. In that case, could the failed pod add the resources back twice?
=> No, it won't happen, because PodSiteResourceMap only covers the 60-second gap. By the time resources (CPU & memory) are synced from OpenStack, PodSiteResourceMap is already empty.
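One hedged way to make the withdrawal robust regardless of this timing, using the same hypothetical map type as above (names are illustrative, not the PR's code): delete the entry in the same locked operation that reads it, so the withdrawal for a pod can be applied at most once even if a failed-pod event and the OpenStack sync race.

```go
package main

import (
	"fmt"
	"sync"
)

// Illustrative sketch, not the PR's actual code.
type SiteResource struct {
	VCPUs    int64
	MemoryMB int64
}

type PodSiteResourceMap struct {
	mu sync.Mutex
	m  map[string]SiteResource
}

func NewPodSiteResourceMap() *PodSiteResourceMap {
	return &PodSiteResourceMap{m: make(map[string]SiteResource)}
}

func (p *PodSiteResourceMap) Set(pod string, r SiteResource) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.m[pod] = r
}

// Take returns the reserved resource and deletes it under the same
// lock; a second Take for the same pod reports ok == false, so the
// caller can never withdraw the same reservation twice.
func (p *PodSiteResourceMap) Take(pod string) (SiteResource, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	r, ok := p.m[pod]
	if ok {
		delete(p.m, pod)
	}
	return r, ok
}

func main() {
	prm := NewPodSiteResourceMap()
	prm.Set("pod-a", SiteResource{VCPUs: 2, MemoryMB: 4096})
	if _, ok := prm.Take("pod-a"); !ok {
		panic("first Take should succeed")
	}
	if _, ok := prm.Take("pod-a"); ok {
		panic("second Take must not succeed")
	}
	fmt.Println("withdrawal applied at most once")
}
```

With this pattern the 60-second emptiness assumption is no longer load-bearing: even if the sync job and the failed-pod handler both fire, only the path that wins the Take applies the withdrawal.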
Please reply to my comments instead of editing them directly.
The assumption is that the informers take fewer than 60 seconds to trigger the events. However, if the resource map happens not to be empty at the moment the scheduled resource-synchronization job fetches resource information from OpenStack, the resources might be wrong.
=> The resource map is empty at that point, so the conflict won't happen. If this function causes too many issues, we should reconsider whether to remove it. It would be better to remove this 60-second gap, or to have the resource collector collect information continuously instead of waiting 60 seconds. The 60-second gap is mainly the resource collector's issue; it has to keep collecting information to close the gap.
Please reply to my comments instead of editing them directly.
For example, when you get a failed-pod event and want to withdraw resources, we assume the resource collector does not update the OpenStack resource information at that moment. But if it does, we withdraw the resources twice. We can receive a failed-pod event just after the resource collector fetches the most recent OpenStack resource information, right?
Please correct me if I am wrong.
=> Right or not, collecting resource information is the resource collector's job. I tried to work around the resource collector's 60-second issue, but if it raises this many issues, it would be better for the resource collector to own and solve it.
Please reply to my comments instead of editing them directly.
The resource collector used to be the only source of resource information (memory & CPU) we have. If you introduce the current code to update resources, please make the two work with each other. Thanks.
=> It would be better for you to check the resource collector's requirements.
Please reply to my comments instead of editing them directly.
Can you tell us: besides the resource collector, is this pull request meant to reduce/withdraw resources in order to update the OpenStack resources? Thanks.
=> No, this PR is for "withdraw", which resolves an issue caused by the resource collector. It would be better to reconsider whether this function should be implemented by the scheduler at all.