new: Deploy and monitor ML models with GPUs on Amazon EKS #1020

Open: wants to merge 10 commits into base: main
2 changes: 1 addition & 1 deletion cluster/eksctl/cluster.yaml
@@ -21,7 +21,7 @@ vpc:
publicAccess: true
addons:
- name: vpc-cni
-version: 1.16.0
+version: 1.18.3
configurationValues: '{"env":{"ENABLE_PREFIX_DELEGATION":"true", "ENABLE_POD_ENI":"true", "POD_SECURITY_GROUP_ENFORCING_MODE":"standard"},"enableNetworkPolicy": "true"}'
resolveConflicts: overwrite
managedNodeGroups:
3 changes: 2 additions & 1 deletion manifests/.workshop/terraform/outputs.tf
@@ -5,4 +5,5 @@ output "environment" {
export ${k}='${v}'
%{endfor}
EOF
-}
+  sensitive = true
+}
1 change: 1 addition & 0 deletions manifests/manifests
@@ -0,0 +1,15 @@
#!/bin/bash

kubectl delete ingress dogbooth -n dogbooth --ignore-not-found
kubectl delete rayservice dogbooth -n dogbooth --ignore-not-found
kubectl delete ns dogbooth --ignore-not-found

helm uninstall jupyterhub -n jupyterhub
helm uninstall nginx-ingress
helm uninstall kuberay-operator
kubectl delete ns jupyterhub

# Uninstall gpu-operator
GPU_OPERATOR_RELEASE_NAME=$(helm list -n gpu-operator -q)
helm uninstall $GPU_OPERATOR_RELEASE_NAME -n gpu-operator
kubectl delete ns gpu-operator
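
The GPU operator step above passes the output of `helm list -q` straight to `helm uninstall`, so it fails if the namespace holds no release and may misbehave if it holds more than one. A more defensive variant could look like the sketch below. It is not part of the PR. The function name `uninstall_releases_in_namespace` is hypothetical, and the sketch assumes the release lives in the `gpu-operator` namespace, as in the script.

```shell
#!/bin/bash
# Sketch only, not part of the PR: a more defensive GPU operator cleanup.
# The function name is hypothetical; the namespace is passed in by the caller.

# Uninstall every Helm release found in the namespace, then delete the namespace.
uninstall_releases_in_namespace() {
  local namespace="$1"
  local releases

  # `helm list -q` prints one release name per line; empty output means
  # there is nothing to uninstall.
  releases="$(helm list -n "$namespace" -q)"

  if [ -z "$releases" ]; then
    echo "No Helm releases found in namespace ${namespace}; skipping uninstall"
  else
    # Handle each release separately so names are always quoted.
    while IFS= read -r release; do
      helm uninstall "$release" -n "$namespace"
    done <<< "$releases"
  fi

  # --ignore-not-found keeps the script re-runnable after a partial cleanup.
  kubectl delete ns "$namespace" --ignore-not-found
}
```

Calling `uninstall_releases_in_namespace gpu-operator` at the end of the cleanup script would stand in for the last three commands. The same helper could also cover the `jupyterhub` namespace, assuming the release there is removed the same way.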