This repository has been archived by the owner on Aug 28, 2024. It is now read-only.

Optimize for BYO cluster: create outside process to manage the creation of our KSA-bound cloud IAM principals (sci-${provider}) #178

Open
brandonjbjelland opened this issue Aug 10, 2023 · 2 comments

Comments

@brandonjbjelland
Contributor

brandonjbjelland commented Aug 10, 2023

What's the issue?

We want to unbundle some parts of the AWS and GCP install process so that a user bringing their own cluster can identify the right entrypoint and set up the rest of their environment to get started using substratus. The IAM principal and role bindings are a good place to start. The bucket and registry also fit in this camp. Daemonsets and other cluster-wide dependencies are a final category.

Why make this change?

Discussed here.

This change would allow us to optimize for users who bring a well-configured cluster, unlikely as that might be (node pools/groups, daemonsets, an appropriate storage driver).

As a larger point, I think this makes a case for unbundling many parts of the provider spin-up process so that those bits become reusable to folks entering at different points. The helm installs of nvidia-device-plugin and karpenter on AWS are already well-positioned to be broken out as independent install scripts and called by aws-up.sh.
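To make the unbundling idea concrete, here is a rough sketch of how the install could be modeled as independent steps that a BYO-cluster user skips selectively. Everything in it is hypothetical: the step names, the `Step` type, and the `plan` helper are made up for illustration and do not reflect how aws-up.sh is actually organized.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Step:
    """One independently runnable piece of the provider install (hypothetical)."""
    name: str
    run: Callable[[], None]


def plan(steps: Iterable[Step], already_provisioned: set[str]) -> list[Step]:
    """Return the steps still needed, preserving their original order."""
    return [step for step in steps if step.name not in already_provisioned]


def _placeholder() -> None:
    # Stand-in for a real install script; intentionally does nothing.
    pass


# Assumed breakdown of the AWS path, loosely following the categories above.
AWS_STEPS = [
    Step("cluster", _placeholder),
    Step("bucket", _placeholder),
    Step("registry", _placeholder),
    Step("karpenter", _placeholder),
    Step("nvidia-device-plugin", _placeholder),
    Step("iam-principal", _placeholder),
]

if __name__ == "__main__":
    # A user who already has a cluster, bucket, karpenter and the device plugin.
    existing = {"cluster", "bucket", "karpenter", "nvidia-device-plugin"}
    for step in plan(AWS_STEPS, existing):
        print(f"would run: {step.name}")
        step.run()
```

With a breakdown like this, each entrypoint is just a different `already_provisioned` set, and the full install is the case where the set is empty.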

Related: #112

@BOsterbuhr

Let's use me as one extreme example: I have an EKS cluster, S3 bucket, RDS instance, Karpenter, and nvidia-device-plugin installed. Is there any more infrastructure I need to provision, or is it only IAM-related configuration left before I can apply the config namespace and system yaml files?

@samos123
Contributor

That's the end goal here, but note that we're not there yet. You're spot on about what you would need, except for the RDS instance: we don't use RDS and aren't planning to in the short term. You would indeed only need to create an IAM role with enough permissions and allow the K8s SA to assume that role. Brandon and I are working on implementing this proposal for both AWS and GCP: https://github.com/substratusai/substratus/blob/main/docs/proposals/operator-managed-infra.md
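
As a rough illustration of "allow the K8s SA to assume that role", the sketch below builds the trust policy that EKS IAM roles for service accounts typically expect, plus the member string used for a GKE Workload Identity binding. The account ID, OIDC issuer, project, namespace, and service account name in the usage are placeholders, not values taken from substratus, and the permissions policy attached to the role is out of scope here.

```python
import json


def irsa_trust_policy(account_id: str, oidc_issuer: str,
                      namespace: str, service_account: str) -> dict:
    """Build an IAM trust policy letting one Kubernetes service account
    assume the role through the cluster's OIDC provider (EKS IRSA)."""
    # IAM refers to the provider by the issuer URL without its scheme.
    provider = oidc_issuer.removeprefix("https://")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{provider}:aud": "sts.amazonaws.com",
                        f"{provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                    }
                },
            }
        ],
    }


def gcp_workload_identity_member(project_id: str, namespace: str,
                                 service_account: str) -> str:
    """Member string to grant roles/iam.workloadIdentityUser on the
    Google service account that the KSA should impersonate."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{service_account}]"


if __name__ == "__main__":
    # All values below are placeholders chosen for the example.
    policy = irsa_trust_policy(
        "123456789012",
        "https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE",
        "substratus",
        "sci",
    )
    print(json.dumps(policy, indent=2))
    print(gcp_workload_identity_member("my-project", "substratus", "sci"))
```

On the Kubernetes side, the service account usually also needs an annotation pointing at the role (`eks.amazonaws.com/role-arn` on EKS, `iam.gke.io/gcp-service-account` on GKE) for the binding to take effect.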
