[WIP] feat: add managed model registry prometheus job, metrics, and alerting rules, fixes RHOAIENG-4273 #318
base: main
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: dhirajsb. The full list of commands accepted by this bot can be found here.
This is a draft PR with Prometheus config changes to the YAML file. I'm creating another PR to enable loading this config in ODH incubation. This PR should be merged after the model registry component itself is merged into rhds. Please review and let me know if any changes are needed or if anything else is missing.
- alert: Model Registry Operator Probe Success Burn Rate
  annotations:
    message: 'High error budget burn for {{ $labels.instance }} (current value: {{ $value }}).'
    triage: "https://gitlab.cee.redhat.com/service/managed-tenants-sops/-/blob/main/RHODS/Model-Serving/rhods-model-registry-operator-probe-success-burn-rate.md"
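For context, here is a minimal sketch of how a probe-success burn-rate alert like the one above is commonly completed in a Prometheus rules file. It is not taken from this PR. The group name, the `model-registry-operator` instance label, the recording-rule names, the 98% availability target, the 14.4 burn-rate factor, the `for` duration, and the severity are all assumptions made for illustration. Only the alert name and the annotations come from the diff.

```yaml
groups:
  - name: model-registry-operator-probe-success  # hypothetical group name
    rules:
      # Recording rules: fraction of failed probes over a short and a long window.
      # The instance label value is an assumption, not taken from the PR.
      - record: probe_success:burnrate5m
        expr: |
          1 - min(avg_over_time(probe_success{instance="model-registry-operator"}[5m]))
        labels:
          instance: model-registry-operator
      - record: probe_success:burnrate1h
        expr: |
          1 - min(avg_over_time(probe_success{instance="model-registry-operator"}[1h]))
        labels:
          instance: model-registry-operator

      # Multi-window alert: fire only when both windows exceed the threshold.
      # Assumed SLO target of 98%, so the error budget is (1 - 0.98) = 0.02;
      # the 14.4 factor is an assumed fast-burn multiplier.
      - alert: Model Registry Operator Probe Success Burn Rate
        expr: |
          sum(probe_success:burnrate5m{instance="model-registry-operator"}) by (instance) > (14.4 * (1 - 0.98))
          and
          sum(probe_success:burnrate1h{instance="model-registry-operator"}) by (instance) > (14.4 * (1 - 0.98))
        for: 2m  # assumed hold duration
        labels:
          severity: critical  # assumed severity
        annotations:
          message: 'High error budget burn for {{ $labels.instance }} (current value: {{ $value }}).'
          triage: "https://gitlab.cee.redhat.com/service/managed-tenants-sops/-/blob/main/RHODS/Model-Serving/rhods-model-registry-operator-probe-success-burn-rate.md"
```

Pairing a short window with a long one means the alert fires quickly on a sustained failure and clears soon after the probe recovers. The actual thresholds and windows in this PR may differ.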
One thing to do before merging this PR is to get https://gitlab.cee.redhat.com/service/managed-tenants-sops/-/blob/main/RHODS/Model-Serving/rhods-model-registry-operator-probe-success-burn-rate.md created,
preferably with the file named rhoai-model-registry-operator-probe-success-burn-rate.md instead.
Thanks, I noticed that SOP link. Is there a process for contacting someone to work on it, or can I just open a PR and go from there?
Related to that, I used the simple name "Model Registry Operator" in the config names and descriptions. I noticed that some components use an ODH prefix. Does model registry need to use "RHOAI Model Registry Operator" instead?
I think you can open a PR and ask someone in the project to review and approve it.
There should be an owners file somewhere with a list of names.
To be honest, I do not know the answer. I would say that if other components already use the ODH prefix, you can use it too; otherwise, just skip ODH or any downstream name. It is better to check with the project whether there are any hard or soft rules.
Description
Adds managed model registry prometheus config for the following:
Fixes RHOAIENG-4273
https://issues.redhat.com/browse/RHOAIENG-4273
How Has This Been Tested?
WIP
Screenshot or short clip
Merge criteria