new: Deploy and monitor ML models with GPUs on Amazon EKS #1020

Open
wants to merge 10 commits into
base: main
2 changes: 1 addition & 1 deletion cluster/eksctl/cluster.yaml
@@ -21,7 +21,7 @@ vpc:
publicAccess: true
addons:
- name: vpc-cni
-version: 1.16.0
+version: 1.18.3
configurationValues: '{"env":{"ENABLE_PREFIX_DELEGATION":"true", "ENABLE_POD_ENI":"true", "POD_SECURITY_GROUP_ENFORCING_MODE":"standard"},"enableNetworkPolicy": "true", "nodeAgent": {"enablePolicyEventLogs": "true"}}'
resolveConflicts: overwrite
managedNodeGroups:
1 change: 1 addition & 0 deletions lab/iam/policies/iam.yaml
@@ -83,6 +83,7 @@ Statement:
- eks.amazonaws.com
- eks-nodegroup.amazonaws.com
- eks-fargate.amazonaws.com
- scraper.aps.amazonaws.com
- guardduty.amazonaws.com
- spot.amazonaws.com
- fis.amazonaws.com
7 changes: 7 additions & 0 deletions lab/iam/policies/labs1.yaml
@@ -12,14 +12,21 @@ Statement:
- Effect: Allow
Action:
- aps:DeleteWorkspace
- aps:DeleteScraper
- aps:Describe*
- aps:List*
- aps:QueryMetrics
- aps:CreateScraper
- aps:TagResource
Resource: ["*"]
Condition:
StringLike:
aws:ResourceTag/env:
- ${Env}*
- Effect: Allow
Action:
- aps:DescribeScraper
Resource: ["*"]
- Effect: Allow
Action:
- dynamodb:ListTables
4 changes: 2 additions & 2 deletions manifests/.workshop/terraform/base.tf
@@ -15,8 +15,8 @@ terraform {
version = "2.15.0"
}
kubectl = {
-source = "gavinbunney/kubectl"
-version = "1.14.0"
+source = "alekc/kubectl"
+version = "2.0.4"
}
local = {
version = "2.5.1"
3 changes: 2 additions & 1 deletion manifests/.workshop/terraform/outputs.tf
@@ -5,4 +5,5 @@ output "environment" {
export ${k}='${v}'
%{endfor}
EOF
-}
+sensitive = true
+}
@@ -0,0 +1,14 @@
#!/bin/bash

kubectl delete ns dogbooth --ignore-not-found

uninstall-helm-chart jupyterhub jupyterhub
uninstall-helm-chart nginx-ingress default
uninstall-helm-chart kuberay-operator kuberay

kubectl delete ns jupyterhub --ignore-not-found

# Uninstall gpu-operator
uninstall-helm-chart gpu-operator gpu-operator

kubectl delete ns gpu-operator --ignore-not-found
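
The cleanup script above relies on an `uninstall-helm-chart` helper whose definition is not part of this diff. As a rough illustration of how such a helper could stay idempotent (so the cleanup can be re-run safely), here is a minimal sketch. The function name `uninstall_chart_if_present` and its behavior are assumptions made for illustration and may not match the workshop's actual helper; it only uses the standard `helm status` and `helm uninstall` subcommands.

```shell
#!/bin/bash

# Hypothetical sketch, NOT the workshop's real uninstall-helm-chart helper.
# Usage: uninstall_chart_if_present <release> <namespace>
uninstall_chart_if_present() {
  local release="$1"
  local namespace="$2"

  # `helm status` exits non-zero when the release does not exist,
  # so a missing release is skipped instead of failing the cleanup.
  if helm status "$release" --namespace "$namespace" >/dev/null 2>&1; then
    helm uninstall "$release" --namespace "$namespace" --wait
  else
    echo "release $release not found in $namespace, skipping"
  fi
}
```

With a helper shaped like this, a call such as `uninstall_chart_if_present gpu-operator gpu-operator` would mirror the argument order (release, then namespace) used in the script above.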