Test ISVC in a service mesh with Istio CNI plugin installed #354

Closed
DnPlas opened this issue Nov 28, 2023 · 3 comments

Comments

DnPlas commented Nov 28, 2023

What needs to get done

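Because of limitations in the Istio CNI plugin, KServe InferenceServices (ISVCs) may be affected by the mesh network configuration: each ISVC has a storage-initializer init container whose code may require network connectivity. With the CNI plugin, traffic redirection is set up before init containers run, while the sidecar proxy has not started yet, so their outbound traffic can fail.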

NOTE: this scenario is possible for other workloads, not just ISVCs, so the actual solution should be generic enough to cover all.

This task requires us to create a KServe InferenceService inside an Istio mesh with the Istio CNI plugin enabled. Since this will most likely produce an error, we need to provide a solution and document it. One option is to add annotations, as described here, to all workload Pods, which would require changing the controllers or mutatingwebhookconfigs of several components.

Relevant logs

# Not working init-container on a namespace with sidecar injection
ubuntu@charm-dev-jammy:~$ kubectl logs -nkubeflow-user-example-com sklearn-iris-predictor-00001-deployment-6f554fbd99-v6zzh -c storage-initializer
INFO:root:Initializing, args: src_uri [gs://kfserving-examples/models/sklearn/1.0/model] dest_path[ [/mnt/models]
INFO:root:Copying contents of gs://kfserving-examples/models/sklearn/1.0/model to local
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 1 of 3. Reason: [Errno 111] Connection refused
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 2 of 3. Reason: [Errno 111] Connection refused
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 3 of 3. Reason: [Errno 111] Connection refused
WARNING:google.auth._default:Authentication failed using Compute Engine authentication due to unavailable metadata server.

# Working init-container on a namespace w/o istio sidecar injection
ubuntu@charm-dev-jammy:~$ kubectl logs -nkserve-test sklearn-iris-predictor-00001-deployment-9fc88cc4f-bhv89 -c storage-initializer
INFO:root:Initializing, args: src_uri [gs://kfserving-examples/models/sklearn/1.0/model] dest_path[ [/mnt/models]
INFO:root:Copying contents of gs://kfserving-examples/models/sklearn/1.0/model to local
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 1 of 3. Reason: timed out
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 2 of 3. Reason: [Errno 113] No route to host
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 3 of 3. Reason: timed out
WARNING:google.auth._default:Authentication failed using Compute Engine authentication due to unavailable metadata server.
INFO:root:Downloading: /mnt/models/model.joblib
INFO:root:Successfully copied gs://kfserving-examples/models/sklearn/1.0/model to /mnt/models

DOD:

  • Documentation with a workaround for the issue

Why it needs to get done

To avoid potential issues when we enable the Istio CNI plugin.

DnPlas added the enhancement label on Nov 28, 2023

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5050.

This message was autogenerated

DnPlas commented Nov 29, 2023

This issue can be worked around by adding the right annotations to the Pod that eventually gets created from the InferenceService definition. Something like this should help:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris-workaround"
  annotations:
    traffic.sidecar.istio.io/excludeOutboundIPRanges: "0.0.0.0/0"
spec:
...

This workaround is described in the official documentation and has also been tested upstream in kubeflow/manifests#2014 (comment).
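
For manifests that are generated or patched programmatically, the same annotation could be applied with a small helper. The following is a minimal Python sketch, not part of KServe or Istio: the function name `with_excluded_outbound_ranges` and the sample manifest are hypothetical, and it only sets the annotation on the top-level `metadata`, as in the example above.

```python
import copy

# Annotation key taken from the workaround above.
EXCLUDE_OUTBOUND_ANNOTATION = "traffic.sidecar.istio.io/excludeOutboundIPRanges"


def with_excluded_outbound_ranges(manifest, ip_ranges="0.0.0.0/0"):
    """Return a copy of `manifest` with the exclude-outbound annotation set.

    Hypothetical helper for illustration: `manifest` is a plain dict, as it
    would be loaded from a YAML file. The input dict is left unmodified.
    """
    patched = copy.deepcopy(manifest)
    # `metadata` and `annotations` may be missing or explicitly null.
    metadata = patched.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    annotations[EXCLUDE_OUTBOUND_ANNOTATION] = ip_ranges
    metadata["annotations"] = annotations
    patched["metadata"] = metadata
    return patched


if __name__ == "__main__":
    # Sample manifest mirroring the InferenceService header shown above.
    isvc = {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": "sklearn-iris-workaround"},
    }
    print(with_excluded_outbound_ranges(isvc)["metadata"]["annotations"])
```

Note that `0.0.0.0/0` excludes all outbound traffic from sidecar redirection, so a narrower range may be preferable when the destination addresses are known.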

Please note that this is only a workaround; a real solution should apply to any workload. That effort will be tracked in #356.

@kimwnasptd

Closing this issue, since we have tested the ISVC. As mentioned above, we can proceed with evaluating charming Kyverno in the future.

DnPlas added the level 1, good first issue, and enhancement labels and removed the enhancement, good first issue, and level 1 labels on Mar 21, 2024