
adds AWS infra for instantiating and destroying all baseline substratus components #170

Merged
merged 22 commits on Aug 10, 2023

Conversation

brandonjbjelland
Contributor

@brandonjbjelland brandonjbjelland commented Aug 6, 2023

What is this change?

This change adds a script that we (and users) can use to create a complete substratus environment on a new AWS account.

Why make this change?

#12

Caveats, questions, TBD

Q: Are the local/ephemeral SSDs important for any sort of workload we support? Trying to understand if that's critical here too. A: Size is important, but local SSDs probably aren't. Leaving them out for now.

Unknown: I added Karpenter but haven't thoroughly tested how well it works at auto-provisioning GPU-backed nodes.

Caveat: The features we get enabled on a GKE cluster through simple flags are incredibly fussy on EKS, and I don't have them working. I'll dig further (maybe I've missed some events), but I think we might need to manage these on our own. For example, I've never seen these fail on GKE:

~Equivalent features baked into EKS configuration fail consistently. I've seen timeouts across coredns, vpc-cni, ebs-csi regardless of the order of deployment or how I do it. So far I've tried:

  1. using the EKS module flags
  2. breaking out into eks_add_on resources (docs) and
  3. using a purpose-built module.

They all fail. We need to attach some of the IRSA roles created in this code to those resources however we instantiate them, so if that's out of scope for Terraform, some additional outputs will be necessary (i.e., the ARNs of the roles).~

With time, I'll add to this...

  1. an equivalent to gcp-up.sh and gcp-down.sh ✅
  2. an aws-up Dockerfile ✅
  3. make targets ✅
  4. docker hub push actions ✅ - not needed; rolled into the same container image

@BOsterbuhr

Hey @brandonjbjelland and team, I had a quick thought I wanted to pass along so I don't forget about it.

Need to explore further whether this configuration will actually work or whether its node auto-provisioning is smart enough in this case. I suspect EKS is not going to make good instance type/accelerator choices on our behalf.

Karpenter may be a good option to get the intelligent auto-provisioning you are looking for. Karpenter docs on provisioners.
The install should be easy if you use the blueprints addons module, but as you mentioned, the addons aren't always so straightforward. You could always do a helm install of Karpenter if the addon path doesn't work.

I'll double-check what I ended up doing to get coredns, vpc-cni, and ebs-csi working consistently in my terraform.

I am looking forward to testing Substratus out on Monday.

@samos123
Contributor

samos123 commented Aug 6, 2023

Are the local/ephemeral SSDs important for any sort of workload we support? Trying to understand if that's critical here too.

You got a point there, I don't see a critical need yet, so we can skip them for now.

@samos123
Contributor

samos123 commented Aug 6, 2023

Spinning up EKS clusters with Terraform is such a pain, maybe by design? I wasn't expecting it to be so complex. The eksctl tool does seem to make it easier:
https://eksctl.io/usage/gpu-support/
https://eksctl.io/usage/eksctl-karpenter/

Not sure if it will fit all our needs or take away too much flexibility. I personally prefer Terraform, but seeing the struggle and complexity of EKS, it might be fine to consider something like eksctl for development. Note I have never tried eksctl myself, so it could be trash.
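That said, from skimming those docs, a minimal cluster config with a GPU node group would look roughly like the sketch below (untested on my end; the cluster name, region, and instance type are placeholders I made up):

```yaml
# cluster.yaml -- hypothetical sketch, not something we've validated
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: substratus-dev    # placeholder name
  region: us-west-2       # placeholder region
managedNodeGroups:
  - name: gpu-t4
    instanceType: g4dn.xlarge   # NVIDIA T4; eksctl should pick the accelerated AMI for GPU instance types
    minSize: 0
    desiredCapacity: 1
    maxSize: 3
```

If I'm reading the docs right, eksctl create cluster -f cluster.yaml would stand this up; the GPU-specific bits (accelerated AMI, NVIDIA device plugin) are what I'm hoping eksctl handles for us.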

In the end, we expect most users to already have a K8s cluster when they want to use Substratus, and those end users will choose their own tooling to manage EKS + nodegroups. So the main purpose of the bundled installer/EKS cluster creator is the development and PoC phase: getting rolling quickly with minimal issues.

@brandonjbjelland
Contributor Author

brandonjbjelland commented Aug 6, 2023

@BOsterbuhr thanks for the pointer! Karpenter looks viable here. I'd hoped it could be a common autoscaler we could use across other providers, but it's not useful outside of AWS: kubernetes-sigs/karpenter#741

Sidebar: if you're hoping to use substratus on AWS, we're just getting started on adding support. Running on GCP is the paved path we have today.

@BOsterbuhr

In my experience, if you are just trying to get dev/PoC support for AWS quickly, then eksctl is a good choice.

As for Karpenter, yeah unfortunately AWS doesn't seem to be in a rush to support other cloud providers.

Sidebar: I'll test on GCP first to get a better understanding of everything and will watch for AWS support.

@brandonjbjelland marked this pull request as draft on August 7, 2023 09:28
@brandonjbjelland changed the title from "adds an AWS module for instantiating all substratus components + an optional VPC" to "WIP: adds an AWS module for instantiating all substratus components + an optional VPC" on Aug 7, 2023
@brandonjbjelland
Contributor Author

In my experience, if you are just trying to get dev/PoC support for AWS quickly, then eksctl is a good choice.

We discussed internally today and arrived at the same consensus. In short, delivering a cluster is not our value add, not where we should spend time, and not something we should put a lot of code into (code which will inevitably rot, accumulate feature requests of its own, etc.). We just need the simplest possible way to get a minimal cluster up in each supported provider for someone starting at zero, and that may or may not be through Terraform. Here it seems eksctl is the right-sized approach. 👍

As for Karpenter, yeah unfortunately AWS doesn't seem to be in a rush to support other cloud providers.

Though we're being very careful with the dependencies we take on, I don't know that it changes our calculation here. Karpenter definitely presents itself as the simplest way to auto-scale EKS on heterogeneous hardware with minimal overhead. We very much feel incentivized to avoid having a long list of node groups for the different flavors of GPU-supported instances if that's our alternative.

Sidebar: I'll test on GCP first to get a better understanding of everything and will watch for AWS support.

Thank you! 🙏 Any feedback is highly appreciated, @BOsterbuhr ! ❤️

@brandonjbjelland changed the title from "WIP: adds an AWS module for instantiating all substratus components + an optional VPC" to "adds a getting-started AWS script for instantiating and destroying all substratus components + a VPC" on Aug 8, 2023
@brandonjbjelland force-pushed the feat/add-aws-infra branch 4 times, most recently from 6c0458a to b90e48f on August 8, 2023 22:48
Resolved review comments on: install/Dockerfile, Makefile, install/scripts/aws-up.sh, install/kubernetes/aws/eks-cluster.yaml.tpl
nstogner
nstogner previously approved these changes Aug 9, 2023
Contributor

@nstogner nstogner left a comment


Good stuff! Added some comments, but none of them are big deals

samos123
samos123 previously approved these changes Aug 10, 2023
@samos123
Contributor

Karpenter seems to have Karpenter-specific node labels you have to use in your pod spec. This might require some more design discussion, and might require making use_karpenter a flag or exposing it on Substratus resources somehow?

@BOsterbuhr

Karpenter seems to have Karpenter-specific node labels you have to use in your pod spec.

Can you explain what you mean? Your pods shouldn't even have to know Karpenter exists.

@samos123
Contributor

samos123 commented Aug 10, 2023

Let's say you want to run a pod on an A100 GPU; how would you ensure the pod gets scheduled on a node that has an A100 GPU with Karpenter vs. without Karpenter? You might have nodes with T4, V100, and A100 GPUs in the same cluster.

Note I might be totally wrong since I haven't used Karpenter myself. I was reading this: https://karpenter.sh/preview/concepts/scheduling/

That doc made me believe that in order for Karpenter to create a node with an A100, I would have to set a nodeSelector on the pod with karpenter.k8s.aws/instance-gpu-name = a100 OR use affinity rules.
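In other words, something like this sketch is what I understood the doc to be suggesting (untested on my end; the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test               # placeholder name
spec:
  nodeSelector:
    # Karpenter's well-known label; only present on Karpenter-provisioned nodes
    karpenter.k8s.aws/instance-gpu-name: a100
  containers:
    - name: cuda
      image: nvidia/cuda:12.2.0-base-ubuntu22.04   # placeholder image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: "1"
```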

I have a GCP background, so this is my first time seriously looking into Karpenter. For reference, I'm hoping there is a label in EKS like cloud.google.com/gke-accelerator that works for both Karpenter and non-Karpenter: https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#multiple_gpus

@BOsterbuhr

Let's say you want to run a pod on an A100 GPU; how would you ensure the pod gets scheduled on a node that has an A100 GPU with Karpenter vs. without Karpenter? You might have nodes with T4, V100, and A100 GPUs in the same cluster.

Oh ok, that makes sense. So instead of making the end user understand that you are using Karpenter, you could just create a provisioner with a User-Defined Label as a requirement and then have your end user use that label as a node selector.
Another requirement you would then have in that provisioner would be karpenter.k8s.aws/instance-gpu-name = a100.
I am far from an expert in Karpenter, but I believe you ideally want to keep most things at the provisioner level.
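Roughly like this, against Karpenter's v1alpha5 API (a sketch I haven't run; the provisioner name, the user-defined label key, and the limits are placeholders I made up):

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: gpu-a100                              # placeholder name
spec:
  requirements:
    # user-defined label (placeholder name); end users select on this
    - key: substratus.ai/accelerator
      operator: In
      values: ["nvidia-a100"]
    # Karpenter's own label constrains which instance types get launched
    - key: karpenter.k8s.aws/instance-gpu-name
      operator: In
      values: ["a100"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
  limits:
    resources:
      nvidia.com/gpu: "8"                     # placeholder cap on total GPUs
  providerRef:
    name: default                             # assumes an AWSNodeTemplate named "default" exists
```

Pods would then just set a nodeSelector on substratus.ai/accelerator (or whatever label name you choose) and never reference Karpenter directly.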

@samos123
Contributor

The issue is that there doesn't seem to be a node label that exposes the GPU type on AWS unless you use Karpenter; at the same time, we don't want to depend on Karpenter, and we want to ensure Substratus works well without it. A key principle of Substratus is to minimize dependencies so it's easier to get Substratus running in any EKS cluster.

Actually, I might be wrong altogether and should just get a GPU node on EKS to verify myself. It seems there is in fact a label that would have the info we're looking for: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider.go#L38C28-L38C31

@BOsterbuhr

Actually, I might be wrong altogether and should just get a GPU node on EKS to verify myself. It seems there is in fact a label that would have the info we're looking for

Yeah, you're right. It looks like that is where the Google label you mentioned comes from as well: https://github.com/kubernetes/autoscaler/blob/fc5870f8eaf850dd1e18a5884a7491168dc5d8a0/cluster-autoscaler/cloudprovider/gce/gce_cloud_provider.go#L37

The only issue I could see is one I think you all brought up previously: when using the Kubernetes cluster autoscaler, you have to manage a separate node group for each different instance type. But that may be worth it if you don't want any dependencies.
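For example, with eksctl that might look something like the sketch below (untested; names and instance types are placeholders), using what I believe is the label from that cluster-autoscaler code (k8s.amazonaws.com/accelerator) on each group:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: substratus-dev            # placeholder name
  region: us-west-2               # placeholder region
managedNodeGroups:
  - name: gpu-t4
    instanceType: g4dn.xlarge     # NVIDIA T4
    minSize: 0
    desiredCapacity: 0
    maxSize: 3
    labels:
      k8s.amazonaws.com/accelerator: nvidia-tesla-t4
  - name: gpu-a100
    instanceType: p4d.24xlarge    # NVIDIA A100
    minSize: 0
    desiredCapacity: 0
    maxSize: 1
    labels:
      k8s.amazonaws.com/accelerator: nvidia-tesla-a100
```

One group per GPU flavor is the bookkeeping overhead I mentioned, but at least there's no extra controller to run.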
