-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: simplify token audiences config #947
base: incubation
Are you sure you want to change the base?
feat: simplify token audiences config #947
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is great improvement.
I shared a few things which we should think through before merging it.
And on the related note: would it be difficult to write a test for it?
@@ -67,6 +67,7 @@ type DSCInitializationReconciler struct { //nolint:golint,revive // Readability | |||
Log logr.Logger | |||
Recorder record.EventRecorder | |||
ApplicationsNamespace string | |||
Audiences []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if there's some other way of passing this config down the chain. Somewhat does not feel like this should belong here, it's unrelated to the reconcile at this level. Let's think about possible ways and sync tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved it! here
c99d6d7
Co-authored-by: Bartosz Majsak <[email protected]>
Co-authored-by: Bartosz Majsak <[email protected]>
@bartoszmajsak added some basic unit tests, not mocking the full functionality here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should stick to the correct terminology. We are providing canned responses, so this is stubbing, not mocking.
A bit more about it in this great blog post https://martinfowler.com/articles/mocksArentStubs.html
Co-authored-by: Bartosz Majsak <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: bartoszmajsak The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
New changes are detected. LGTM label has been removed. |
I do not have too many opinions on this PR, @israel-hdez if you have time do a review on it? |
/retest-required |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving some questions.
I haven't tested. I'd test after having answers. This is, anyway, going to be OK for me as long as auth is still working (and I do see that it was already tested).
}, | ||
} | ||
|
||
if err = cli.Create(context.Background(), tokenReview, &client.CreateOptions{}); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to use the ServiceAccount mounted by the cluster, as a pivot to find the audientes.
I wonder if it isn't enough to parse the ServiceAccount token?
Just by parsing the one from CRC, I see the audiences claim in the payload:
"aud": [
"https://kubernetes.default.svc"
],
I'm not sure how it would be for ROSA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, originally we were using opaque token for the bash "workaround" and this solution is based on this, so perhaps, using SA (jwt) token aud is simpler and more efficient. We could replace this impl and use https://github.com/golang-jwt/jwt instead to fetch standard claims without verification (so passing nil
as the key func ). @israel-hdez @aslakknutsen do you think it's better to rely on the fact that SA token is JWT?
I just checked in ROSA and we can also parse JWT instead. For opaque (openshift oauth) tokens it's the same audience. We can get it by using the exact call to TokenReview API.
This behavior is only invoked when you do not specify anything in DSCI.ServiceMesh.Auth part. If you later do, it will update underlying config map which is now used by odh-model-controller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@israel-hdez @aslakknutsen do you think it's better to rely on the fact that SA token is JWT?
AFAIK, core Kubernetes only supports JWT (read 1st sentence of this). So, I'd expect any SA token to be JWT. I understand we have control about the mounted/projected token so, as long as the setup does not change, we should be safe to assume it is a JWT.
About the opaque tokens, my understanding is that in core Kubernetes these are only supported when a 3rd party issues the token and the API Server would validate the token via a Webhook to the 3rd party (i.e. Kubernetes does not validate by its own). OpenShift's OAuth server may fall in this category.
return defaultAudiences | ||
} | ||
|
||
return tokenReview.Status.Audiences |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading these docs: https://kubernetes.io/docs/reference/access-authn-authz/authentication/.
There is a note that the audiences field is optional and that, if not present, it may still mean the token is valid for the cluster. So, I'm not sure if this is safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Underlying AuthConfig will be defaulted if not configured specifically.
// If omitted, Authorino will review tokens expecting the host name of the requested protected service amongst the audiences.
Current implementation defaults it to "https://kubernetes.default.svc"
if not specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the docs that I linked, there is this text over status.audiences
of the TokenReview:
# If this is omitted, the token is considered to be valid to authenticate to the Kubernetes API server.
Underlying AuthConfig will be defaulted if not configured specifically.
Current implementation defaults it to "https://kubernetes.default.svc" if not specified.
IIRC, this default is on odh-model-controller side. This is OK.
My concern is that my understanding of the kubernetes docs is that the cluster may choose to omit the status.audiences
field as an indication that the token is valid for the API Server. Yet, in that case, I see no statement that the audience would be https://kubernetes.default.svc
in case the audiences field is omitted. Thus, we may wrongly assume it is https://kubernetes.default.svc
.
The writing is somewhat confusing. So, I may be understanding things badly.
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Let's keep it on hold until we figure out the right path how reliably fetch audiences or fall back if they're not present. |
@cam-garrison: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@cam-garrison @bartoszmajsak We saw an instance of this, and we'd prefer to have this PR merged soon, if possible. |
@israel-hdez can you elaborate more on what you mean by "instance" and what is the requirement that this PR can be part of? |
We saw in ROSA that the audience is not configured correctly. Thus, in the automated setup the inference requests do not go through for protected models. We, anyway, need docs for the manual setup, but it is a weird experience that for the automated setup users need to be careful of setting the right audience. |
@israel-hdez do we expect this PR for downstream 2.11? |
AFAIK, we don't have a target. Also, I'm not sure how much we support ROSA. If we do support it, we should try to have this in 2.11 if it fits... But I see this as best effort for 2.11. BTW, @Jooho provided some feedback in the Jira. |
Understood, but this is a known issue and the reason why |
https://issues.redhat.com/browse/RHOAIENG-4855
Description
This PR brings automation to populating the
audiences
field for token review. Previously, if the audiences needed were different than the default kubernetes.default.svc, the user had to configure it manually thru the AuthSpec in the DSCI.Now, if the user does not specify
audiences
in their DSCI, we will use anaudiences
value fetched while bootstrapping the operator. If the user specifies some non-default audiences, that value will be used over what is fetched while bootstrapping the operator.This specifically should make testing/deploying on ROSA easier, as previously we needed to manually fetch the audiences and set this in the DSCI, and now if you leave it as default the operator will fetch it for you.
How Has This Been Tested?
Tested on ROSA created thru clusterbot - created a DSCI without specifying audiences under
.spec.serviceMesh.auth.audiences
. Then, checked created ConfigMapauth-refs
to ensure thatAUTH_AUDIENCE
was populated with the correct audience for ROSA cluster, not the default. Then, did this basic test https://gist.github.com/israel-hdez/af374562ef9e5b9d80890aa6f0bce20d to ensure kserve functionality. (note - used odh-model-controller release v.0.12.0 in operator build to test this aspect).Also, tested with a specified audience to ensure that if the user specifies some
audiences
in the DSCI authspec, they are not overwritten.Merge criteria: