-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add oauth-proxy to rawdeployments if odh auth label is present #419
base: master
Are you sure you want to change the base?
add oauth-proxy to rawdeployments if odh auth label is present #419
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: VedantMahabaleshwarkar The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
Fixed
Show fixed
Hide fixed
360560d
to
67041cc
Compare
tests passing locally, the failures look like test infra failures /retest-required |
67041cc
to
2cce5bc
Compare
44ba660
to
da8dc3a
Compare
/retest-required |
not sure what's happening with the tests, they are passing locally :( |
pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler.go
Outdated
Show resolved
Hide resolved
pkg/controller/v1beta1/inferenceservice/reconcilers/service/service_reconciler.go
Show resolved
Hide resolved
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
Outdated
Show resolved
Hide resolved
pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler.go
Outdated
Show resolved
Hide resolved
`--cookie-secret=SECRET`, | ||
`--openshift-delegate-urls={"/": {"namespace": "$isvcNamespace", "resource": "services", "verb": "get"}}`, | ||
`--openshift-sar={"namespace": "$isvcNamespace", "resource": "services", "verb": "get"}`, | ||
`--skip-auth-regex="(^/metrics|^/apis/v1beta1/healthz)"`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the liveness and readiness probes, instead of being fixed, you should inspect the ISVC and extract the configured paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get why we should do this. For the modelmesh oauth proxy injection the oauth proxy probes are fixed. Even here the kserve container and the oauth proxy container will report statuses independently so I'm not seeing why we should change the probes here
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
Outdated
Show resolved
Hide resolved
the e2e raw test failure is because https://github.com/opendatahub-io/kserve/blob/master/test/scripts/openshift-ci/run-e2e-tests.sh needs to be updated with the new behavior. Imo it can be ignored now can will be fixed later in https://issues.redhat.com/browse/RHOAIENG-14604 |
pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler.go
Outdated
Show resolved
Hide resolved
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
Show resolved
Hide resolved
`--cookie-secret=SECRET`, | ||
`--openshift-delegate-urls={"/": {"namespace": "` + namespace + `", "resource": "inferenceservices", "group": "serving.kserve.io", "name": "` + isvc + `", "verb": "get"}}`, | ||
`--openshift-sar={"namespace": "` + namespace + `", "resource": "inferenceservices", "group": "serving.kserve.io", "name": "` + isvc + `", "verb": "get"}`, | ||
`--skip-auth-regex="(^/metrics|^/apis/v1beta1/healthz)"`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up #419 (comment).
These are not the probes, but paths that would be left unprotected.
Now... I'm thinking that you are right on your comment that Kubernetes is going to do the probing on the container. With that in mind, I'm not sure why we should have unprotected paths. So, maybe, remove the --skip-auth-regex
argument?
`--upstream=http://localhost:` + upstreamPort, | ||
`--tls-cert=/etc/tls/private/tls.crt`, | ||
`--tls-key=/etc/tls/private/tls.key`, | ||
`--cookie-secret=SECRET`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, follow-up my comment of the previous review round: #419 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get this comment. The tls cert and key is mounted from the secret which is created by the annotation for the service. That should be unique to each deployment right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TLS args are good. My comment is for the --cookie-secret=SECRET
argument.
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
Show resolved
Hide resolved
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
Outdated
Show resolved
Hide resolved
pkg/controller/v1beta1/inferenceservice/reconcilers/service/service_reconciler.go
Outdated
Show resolved
Hide resolved
ports := service.Spec.Ports | ||
replaced := false | ||
for i, port := range ports { | ||
if port.Port == constants.CommonDefaultHttpPort { | ||
ports[i] = httpsPort | ||
replaced = true | ||
} | ||
} | ||
if !replaced { | ||
ports = append(ports, httpsPort) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From
kserve/pkg/controller/v1beta1/inferenceservice/reconcilers/service/service_reconciler.go
Lines 70 to 110 in 4add27b
if len(container.Ports) > 0 { | |
var servicePort corev1.ServicePort | |
servicePort = corev1.ServicePort{ | |
Name: container.Ports[0].Name, | |
Port: constants.CommonDefaultHttpPort, | |
TargetPort: intstr.IntOrString{ | |
Type: intstr.Int, | |
IntVal: container.Ports[0].ContainerPort, | |
}, | |
Protocol: container.Ports[0].Protocol, | |
} | |
servicePorts = append(servicePorts, servicePort) | |
for i := 1; i < len(container.Ports); i++ { | |
port := container.Ports[i] | |
if port.Protocol == "" { | |
port.Protocol = corev1.ProtocolTCP | |
} | |
servicePort = corev1.ServicePort{ | |
Name: port.Name, | |
Port: port.ContainerPort, | |
TargetPort: intstr.IntOrString{ | |
Type: intstr.Int, | |
IntVal: port.ContainerPort, | |
}, | |
Protocol: port.Protocol, | |
} | |
servicePorts = append(servicePorts, servicePort) | |
} | |
} else { | |
port, _ := strconv.Atoi(constants.InferenceServiceDefaultHttpPort) | |
servicePorts = append(servicePorts, corev1.ServicePort{ | |
Name: componentMeta.Name, | |
Port: constants.CommonDefaultHttpPort, | |
TargetPort: intstr.IntOrString{ | |
Type: intstr.Int, | |
IntVal: int32(port), // #nosec G109 | |
}, | |
Protocol: corev1.ProtocolTCP, | |
}) | |
} |
CommonDefaultHttpPort
entry should be present always, and it is going to be the first entry always. So, I think you can simply mutate service.Spec.Ports[0]
and change the target port...
...sorry, I know that I said about being defensive in my previous comment.
pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler.go
Show resolved
Hide resolved
@@ -1264,6 +1270,109 @@ var _ = Describe("v1beta1 inference service controller", func() { | |||
Eventually(func() error { return k8sClient.Get(context.TODO(), predictorHPAKey, actualHPA) }, timeout). | |||
Should(HaveOccurred()) | |||
}) | |||
It("Should have no ingress created if labeled as cluster-local", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, contribute this one to upstream?
pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go
Outdated
Show resolved
Hide resolved
expectedService := &v1.Service{ | ||
ObjectMeta: metav1.ObjectMeta{ | ||
Name: predictorServiceKey.Name, | ||
Namespace: predictorServiceKey.Namespace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add unit tests for the routes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The route logic is checking if the route is admitted
and we will only use the route URL if that is the case. Since ODH Model Controller is the one creating the route here shouldn't this be a integ test more than a unit test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, perhaps you should write an E2E test under test/e2e.
The E2Es are currently failing because of a bad openshift-ci setup. But once the setup is finished (we do have Jiras in the backlog for fixing it), the E2Es will run with odh-model-controller deployed on the testing cluster. In the meanwhile, we have been running E2Es on our machines.
ports := service.Spec.Ports | ||
replaced := false | ||
for i, port := range ports { | ||
if port.Port == constants.CommonDefaultHttpPort { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the CommonDefaultHttpPort
is going away, I think you should reflect in the Status of the ISVC the right port to reach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On this review round, I don't remember seeing a change that would solve this comment.
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]> 1 Signed-off-by: Vedant Mahabaleshwarkar <[email protected]> 2 Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
693027b
to
8028a3b
Compare
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
pkg/controller/v1beta1/inferenceservice/reconcilers/raw/raw_kube_reconciler.go
Outdated
Show resolved
Hide resolved
Co-authored-by: Filippe Spolti <[email protected]> Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
@VedantMahabaleshwarkar: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
oauthMemoryRequest = str | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback values are fine, but:
- All other code in KServe returns an error if there is a failure fetching the ConfigMap.
- Also, all other code in KServe returns an error if there is an error unmarshalling the JSON string.
So, I'm not seeing a good reason for not following the same pattern as the rest of the KServe code. IMO, we should propagate the errors to the user.
`--upstream=http://localhost:` + upstreamPort, | ||
`--tls-cert=/etc/tls/private/tls.crt`, | ||
`--tls-key=/etc/tls/private/tls.key`, | ||
`--cookie-secret=SECRET`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TLS args are good. My comment is for the --cookie-secret=SECRET
argument.
@@ -19,10 +19,10 @@ package ingress | |||
import ( | |||
"context" | |||
"fmt" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The removal of the blank line may be causing the linter issue.
routeUrl, err := getRouteURLIfExists(r.client, isvc) | ||
if err == nil && routeUrl != nil { | ||
isvc.Status.URL = routeUrl | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only error that is tolerable to ignore is the not found
error, which is taken in account inside getRouteURLIfExists
.
For any other error we are expecting to properly report the endpont of the Route. So, I don't see a good reason for ignoring errors, as that can lead to assigning a bad endpoint.
@@ -305,6 +357,12 @@ func (r *RawIngressReconciler) Reconcile(isvc *v1beta1.InferenceService) error { | |||
} | |||
} | |||
isvc.Status.URL, err = createRawURL(isvc, r.ingressConfig) | |||
if val, ok := isvc.Labels[constants.NetworkVisibility]; ok && val == constants.ODHRouteEnabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code inside the if
is what should be coded in a deterministic way.
The odh-model-controller
would be creating the Route, so the order is not guaranteed. There is a chance that kserve-controller finishes reconciliation while odh-model-controller hasn't had the opportunity of creating the Route. That can lead to a not found
error when fetching the route, leading to the ISVC reporting the wrong endpoint.
ports := service.Spec.Ports | ||
replaced := false | ||
for i, port := range ports { | ||
if port.Port == constants.CommonDefaultHttpPort { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On this review round, I don't remember seeing a change that would solve this comment.
expectedService := &v1.Service{ | ||
ObjectMeta: metav1.ObjectMeta{ | ||
Name: predictorServiceKey.Name, | ||
Namespace: predictorServiceKey.Namespace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, perhaps you should write an E2E test under test/e2e.
The E2Es are currently failing because of a bad openshift-ci setup. But once the setup is finished (we do have Jiras in the backlog for fixing it), the E2Es will run with odh-model-controller deployed on the testing cluster. In the meanwhile, we have been running E2Es on our machines.
What this PR does / why we need it:
This PR adds the following :
"security.opendatahub.io/enable-auth" = "true"
-- an oauth proxy container is added to the deployment
-- http port is replaced by https port in the service
--
"service.beta.openshift.io/serving-cert-secret-name"
is added to the service to allow creation of the tls secretWhich issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes # https://issues.redhat.com/browse/RHOAIENG-10291, https://issues.redhat.com/browse/RHOAIENG-13444
TEST WITH: opendatahub-io/odh-model-controller#274
Type of changes
Please delete options that are not relevant.
Feature/Issue validation/testing:
To test enabling route for a particular ISVC
-- Create any kserve isvc+SR
-- isvc should have annotation :
"serving.kserve.io/deploymentMode" : RawDeployment
-- isvc should have label :
"networking.kserve.io/visibility": "enable-route"
To test route with auth:
-- Create any kserve isvc+SR
-- isvc should have annotation :
"serving.kserve.io/deploymentMode" : RawDeployment
-- isvc should have label:
"security.opendatahub.io/enable-auth" = "true"
-- isvc should have label:
"networking.kserve.io/visibility": "enable-route"
To test Inference without auth:
-- remove isvc label
"security.opendatahub.io/enable-auth" = "true"
Now inference should work without token
Checklist:
Release note:
Re-running failed tests
/rerun-all
- rerun all failed workflows./rerun-workflow <workflow name>
- rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.