Skip to content

Commit

Permalink
fix(scripts/gcp): latest GPU drivers for L4 support and Pytorch>=2.1
Browse files Browse the repository at this point in the history
  • Loading branch information
nkemnitz committed Jan 3, 2024
1 parent ad74ad9 commit 9949c72
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion scripts/gcp/create_corgie_igneous_cluster.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@

ADD_WORKLOAD_IDENTITY_TMPL = 'gcloud container clusters get-credentials {CLUSTER_NAME} --region {REGION} --project {PROJECT_NAME} && gcloud iam service-accounts create {CLUSTER_NAME}-worker --project={PROJECT_NAME} && gcloud projects add-iam-policy-binding {PROJECT_NAME} --member "serviceAccount:{CLUSTER_NAME}-worker@{PROJECT_NAME}.iam.gserviceaccount.com" --role "roles/storage.objectUser" && gcloud projects add-iam-policy-binding {PROJECT_NAME} --member "serviceAccount:{CLUSTER_NAME}-worker@{PROJECT_NAME}.iam.gserviceaccount.com" --role "roles/artifactregistry.reader" && gcloud iam service-accounts add-iam-policy-binding {CLUSTER_NAME}-worker@{PROJECT_NAME}.iam.gserviceaccount.com --role roles/iam.workloadIdentityUser --member "serviceAccount:{PROJECT_NAME}.svc.id.goog[default/default]" --project {PROJECT_NAME} && kubectl annotate serviceaccount default --namespace default iam.gke.io/gcp-service-account={CLUSTER_NAME}-worker@{PROJECT_NAME}.iam.gserviceaccount.com'

CONFIGURE_DRIVERS_COMMAND_TMPL = "gcloud container clusters get-credentials {CLUSTER_NAME} --region {REGION} --project {PROJECT_NAME}; kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml"
CONFIGURE_DRIVERS_COMMAND_TMPL = "gcloud container clusters get-credentials {CLUSTER_NAME} --region {REGION} --project {PROJECT_NAME}; kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded-latest.yaml"


def main(): # pylint: disable=too-many-statements
Expand Down

0 comments on commit 9949c72

Please sign in to comment.