-
Notifications
You must be signed in to change notification settings - Fork 12
Automatically approves change requests decreasing quotas for Openshift #207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Automatically approves change requests decreasing quotas for Openshift #207
Conversation
It was discovered that Openshift integration did not function because the Openshift allocation would use the same url to make calls to both the account manager and the Openshift API. This resulted in the Openshift API never actually being called. Due to poor error handling by the integration code (more details below), this big went undetected by the functional tests. Appropriate fixes have been added tests, and the Openshift resource type now requires two URLs, one for the account manager, and one for the Openshift API. Regarding the poor error handling, none of the functional Openshift test cases actually checked if the call to `get_federated_user()` returned the expected output. The `get_federated_user()` function itself only emits a log message if the user is not found. This meant that even though `get_federated_user()` never called the Openshift API and would therefore never find the user, the test cases still passed. Additionally, while `_openshift_user_exists()`, the function which was supposed to call the Openshift API, does catch the `kexc.NotFoundError`, this error is not specific enough, as it could be caused by a 404 response made by ANY server (in the case of our tests, the account manager). It was also found that the `RESOURCE_IDENTITY_NAME` attribute, which identifies the idp, was referenced in integration code but never defined, leading to `_openshift_identity_exists` always failing, since `self.id_provider` would always be `None`
The Openshift allocator will now only make the minimal `resourcequota` object for each namespace, with no support for scopes. Most of the integration code and test cases have been adapted from `openshift-acct-mgt`. Notable exclusions were any code pertaining to the `quota.json`[1] and `limits.json`[2]. [1] https://github.com/CCI-MOC/openshift-acct-mgt/blob/master/k8s/base/quotas.json [2] https://github.com/CCI-MOC/openshift-acct-mgt/blob/master/k8s/base/limits.json
…cr) only decreases quotas An edge case is added to prevent approval when quota is requested to be 0. This would normally happen when the user wants to delete their allocation, which is a seperate case from when they only want to decrease quotas. Implementing this feature required several changes: - Several functions in `utils.py` and `tasks.py` to perform automatic approval - New signal handler for when a cr is created. For now, this handler will only perform automatic checks for Openshift allocations - A new function in the Openshift allocator to get resources used (`get_quota_used`) - Slight refactoring on how the Openshift quota is parsed - Unit tests for the functions performing the various checks have been added
def get_quota_used(self, project_id): | ||
resourcequotas = self._openshift_get_resourcequotas(project_id) | ||
moc_quota_used = {} | ||
# TODO Any concerns about this being a list? Can a project have multiple resourcequotas? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the code account for cases where a user's project has more than one ResourceQuota?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked and it's possible to create multiple resourcequotas for a namespace but in practice that should be avoided. I'd say if there's more than 1 resourcequota you should just raise an error.
Just posting this for completeness sake. When you have multiple resourcequotas your pods need to meet the limits specified by all
➜ ~ oc get resourcequota
NAME AGE REQUEST LIMIT
naved-project-rq-1 38m persistentvolumeclaims: 0/20 limits.cpu: 1/1, limits.memory: 4Gi/8Gi
naved-project-rq-2 38m persistentvolumeclaims: 0/20 limits.cpu: 1/2, limits.memory: 4Gi/4Gi
rq-1 allows 1 CPU and 8 Gi of memory and rq-2 allows 2 CPU and 4 Gi memory, in effect this meant I was restricted to 1 CPU and 4 Gi of memory.
@@ -52,3 +56,20 @@ def activate_allocation_user_receiver(sender, **kwargs): | |||
def allocation_remove_user_receiver(sender, **kwargs): | |||
allocation_user_pk = kwargs.get('allocation_user_pk') | |||
remove_user_from_allocation(allocation_user_pk) | |||
|
|||
# TODO (Quan): How to/should we do the functional test for this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have a functional test case for this feature? If so, how should I go about manipulating openshift resource usage for the test?
for allocation_attr_cr in allocation_attr_cr_list: | ||
attr_name = allocation_attr_cr.allocation_attribute.allocation_attribute_type.name | ||
if check_if_quota_attr(attr_name): | ||
# TODO (Quan) Copied from `validate_allocations`, I feel like this is very messy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this part of the code, I wanted to invert the quota key mapping so that I could get a mapping from Openshift quota names (i.e limits.cpu
) to their Coldfront counterparts (i.e 'OpenShift Limit on CPU Quota'
). I ended up copying the code from validate_allocations
and was thinking this may be a bit messy.
Not something blocking this PR, but I felt compelled to mention this, since it felt a bit messy.
This PR is currently on hold since it introduces several independent features. The validation feature has been placed in another PR (#208), and the automatic approval feature will likely to placed in a different PR in the near future. I'll close this after all the relevant PRs have been created |
Partially closes #194. This PR consists of the last commit, and allows automated approval for Openshift allocations that only decrease the usage quota. This PR is built upon #205, since I believe both of these PRs will be approved within the same timeframe. The CI is breaking because this PR also depends on nerc-project/coldfront-nerc#137 for the newly added signal.
I have left some questions down below about a few concerns.
More details in the commit message.