Skip to content

Automatically approves change requests decreasing quotas for Openshift #207

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

QuanMPhm
Copy link
Contributor

@QuanMPhm QuanMPhm commented Apr 7, 2025

Partially closes #194. This PR consists of the last commit, and allows automated approval for Openshift allocations that only decrease the usage quota. This PR is built upon #205, since I believe both of these PRs will be approved within the same timeframe. The CI is breaking because this PR also depends on nerc-project/coldfront-nerc#137 for the newly added signal.

I have left some questions down below about a few concerns.

More details in the commit message.

QuanMPhm added 3 commits April 6, 2025 10:07
It was discovered that Openshift integration did not
function because the Openshift allocation would use
the same url to make calls to both the account
manager and the Openshift API. This resulted in the
Openshift API never actually being called. Due to
poor error handling by the integration code (more details below),
this big went undetected by the functional tests.
Appropriate fixes have been added tests, and the
Openshift resource type now requires two URLs, one
for the account manager, and one for the Openshift API.

Regarding the poor error handling, none of the functional
Openshift test cases actually checked if the call to
`get_federated_user()` returned the expected output.
The `get_federated_user()` function itself only emits a
log message if the user is not found. This meant that
even though `get_federated_user()` never called the
Openshift API and would therefore never find the user, the test cases still passed.

Additionally, while `_openshift_user_exists()`, the function
which was supposed to call the Openshift API, does
catch the `kexc.NotFoundError`, this error is not specific
enough, as it could be caused by a 404 response made
by ANY server (in the case of our tests, the account manager).

It was also found that the `RESOURCE_IDENTITY_NAME`
attribute, which identifies the idp, was referenced in
integration code but never defined, leading to
`_openshift_identity_exists` always failing, since
`self.id_provider` would always be `None`
The Openshift allocator will now only make the minimal
`resourcequota` object for each namespace, with no
support for scopes.

Most of the integration code and test cases have been
adapted from `openshift-acct-mgt`. Notable exclusions
were any code pertaining to the `quota.json`[1] and
`limits.json`[2].

[1] https://github.com/CCI-MOC/openshift-acct-mgt/blob/master/k8s/base/quotas.json
[2] https://github.com/CCI-MOC/openshift-acct-mgt/blob/master/k8s/base/limits.json
…cr) only decreases quotas

An edge case is added to prevent approval when quota is
requested to be 0. This would normally happen when
the user wants to delete their allocation, which is a
seperate case from when they only want to decrease quotas.

Implementing this feature required several changes:
- Several functions in `utils.py` and `tasks.py` to perform automatic approval
- New signal handler for when a cr is created. For now, this
handler will only perform automatic checks for Openshift allocations
- A new function in the Openshift allocator to get resources
used (`get_quota_used`)
- Slight refactoring on how the Openshift quota is parsed
- Unit tests for the functions performing the various checks
have been added
def get_quota_used(self, project_id):
resourcequotas = self._openshift_get_resourcequotas(project_id)
moc_quota_used = {}
# TODO Any concerns about this being a list? Can a project have multiple resourcequotas?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the code account for cases where a user's project has more than one ResourceQuota?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked and it's possible to create multiple resourcequotas for a namespace but in practice that should be avoided. I'd say if there's more than 1 resourcequota you should just raise an error.

Just posting this for completeness sake. When you have multiple resourcequotas your pods need to meet the limits specified by all

➜  ~ oc get resourcequota
NAME                 AGE   REQUEST                        LIMIT
naved-project-rq-1   38m   persistentvolumeclaims: 0/20   limits.cpu: 1/1, limits.memory: 4Gi/8Gi
naved-project-rq-2   38m   persistentvolumeclaims: 0/20   limits.cpu: 1/2, limits.memory: 4Gi/4Gi

rq-1 allows 1 CPU and 8 Gi of memory and rq-2 allows 2 CPU and 4 Gi memory, in effect this meant I was restricted to 1 CPU and 4 Gi of memory.

@@ -52,3 +56,20 @@ def activate_allocation_user_receiver(sender, **kwargs):
def allocation_remove_user_receiver(sender, **kwargs):
allocation_user_pk = kwargs.get('allocation_user_pk')
remove_user_from_allocation(allocation_user_pk)

# TODO (Quan): How to/should we do the functional test for this?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a functional test case for this feature? If so, how should I go about manipulating openshift resource usage for the test?

for allocation_attr_cr in allocation_attr_cr_list:
attr_name = allocation_attr_cr.allocation_attribute.allocation_attribute_type.name
if check_if_quota_attr(attr_name):
# TODO (Quan) Copied from `validate_allocations`, I feel like this is very messy
Copy link
Contributor Author

@QuanMPhm QuanMPhm Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this part of the code, I wanted to invert the quota key mapping so that I could get a mapping from Openshift quota names (i.e limits.cpu) to their Coldfront counterparts (i.e 'OpenShift Limit on CPU Quota'). I ended up copying the code from validate_allocations and was thinking this may be a bit messy.

Not something blocking this PR, but I felt compelled to mention this, since it felt a bit messy.

@QuanMPhm QuanMPhm changed the title 194/auto approve change request Automatically approves change requests decreasing quotas for Openshift Apr 8, 2025
@QuanMPhm
Copy link
Contributor Author

QuanMPhm commented May 6, 2025

This PR is currently on hold since it introduces several independent features. The validation feature has been placed in another PR (#208), and the automatic approval feature will likely to placed in a different PR in the near future. I'll close this after all the relevant PRs have been created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatically approve a change request that is only decreasing quota values
2 participants