can not install HAMi on AWS EKS #744

Open
cataglyphis opened this issue Dec 26, 2024 · 5 comments

Comments

@cataglyphis

Please provide an in-depth description of the question you have:

What do you think about this question?:

Environment:

  • HAMi version: 2.4.1
  • Kubernetes version: v1.29.11-eks-56e63d8
  • Others:
helm install hami hami-charts/hami --set scheduler.kubeScheduler.imageTag=v1.29.11 -n hami
Error: INSTALLATION FAILED: chart requires kubeVersion: >= 1.16.0 which is incompatible with Kubernetes v1.29.11-eks-56e63d8
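
The error above is easier to read once the reported version is split into its numeric core and its suffix. The following is a minimal sketch; the function name split_kube_version is hypothetical and only illustrates how the string decomposes. Semantic-version parsers generally treat everything after the first hyphen as a pre-release tag.

```shell
# Hypothetical helper: split a Kubernetes version string into its
# numeric core and the part after the first hyphen.
split_kube_version() {
  v="${1#v}"            # drop a leading "v" if present
  core="${v%%-*}"       # everything before the first hyphen
  case "$v" in
    *-*) suffix="${v#*-}" ;;   # everything after the first hyphen
    *)   suffix="" ;;
  esac
  printf 'core=%s suffix=%s\n' "$core" "$suffix"
}

split_kube_version "v1.29.11-eks-56e63d8"
# prints: core=1.29.11 suffix=eks-56e63d8
```

The core 1.29.11 on its own would satisfy >= 1.16.0; it is the eks-56e63d8 suffix that changes how the comparison is evaluated.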
@elrondwong
Contributor

The version v1.29.11-eks-56e63d8 carries an -eks-56e63d8 suffix, which Helm's semantic-version comparison treats as a pre-release tag, so it does not satisfy the chart's >= 1.16.0 constraint. To resolve this issue, there are two approaches:

  • Directly remove or comment out the kubeVersion field in charts/hami/Chart.yaml
# kubeVersion: ">= 1.16.0"
  • Use --set kubeVersionOverride=true when installing the Helm Chart to bypass the Kubernetes version check.
helm install hami hami-charts/hami --set kubeVersionOverride=true
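
As a variation on the first approach, the constraint can be relaxed instead of removed. This is a minimal sketch, assuming the chart has been unpacked locally; the function name relax_kube_version is hypothetical. Appending -0 to the lower bound is the usual way to let a Helm version constraint accept versions that carry a pre-release-style suffix.

```shell
# Hypothetical helper: rewrite the kubeVersion line of a local Chart.yaml
# so that the constraint also accepts suffixed versions.
relax_kube_version() {
  chart_yaml="$1"
  # -i.bak edits in place and keeps a backup (works with GNU and BSD sed)
  sed -i.bak 's/^kubeVersion:.*$/kubeVersion: ">= 1.16.0-0"/' "$chart_yaml"
}

# Possible usage (not run here; assumes the hami-charts repo is already added):
#   helm pull hami-charts/hami --untar
#   relax_kube_version hami/Chart.yaml
#   helm install hami ./hami -n hami
```

Keeping a constraint in place, rather than deleting the field, still guards against installing on clusters older than 1.16.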

@cataglyphis
Author

Since there is no user-managed kube-scheduler component in AWS EKS clusters, should I install the HAMi kube-scheduler or only the vgpu-scheduler-extender?

@elrondwong
Contributor

I have not installed HAMi on AWS EKS before. Normally, both components need to be installed:

  • kube-scheduler will register a scheduler extender with the cluster.
  • vgpu-scheduler-extender handles the actual scheduling requests.
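
To illustrate how the two pieces relate, here is a rough sketch of a scheduler configuration that points a kube-scheduler at an extender. The helper name write_extender_config, the scheduler name, the extender address, and the resource name are assumptions for illustration only; they are not taken from the HAMi chart, whose generated configuration may differ.

```shell
# Hypothetical helper: write an illustrative KubeSchedulerConfiguration
# that registers one scheduler extender.
write_extender_config() {
  out="$1"
  cat > "$out" <<'EOF'
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: hami-scheduler        # assumed scheduler name
extenders:
  - urlPrefix: "https://127.0.0.1:443"   # assumed extender address
    filterVerb: filter                   # extender endpoint called to filter nodes
    bindVerb: bind                       # extender endpoint called to bind the pod
    nodeCacheCapable: true
    weight: 1
    enableHTTPS: true
    tlsConfig:
      insecure: true                     # assumption: self-signed certificate
    managedResources:
      - name: nvidia.com/gpu             # assumed resource name
        ignoredByScheduler: true
EOF
}

write_extender_config ./scheduler-config.yaml
```

Because this second kube-scheduler runs as an ordinary pod with its own scheduler name, it does not require access to the scheduler in the managed EKS control plane.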

@cataglyphis
Author

❯ helm install hami ./hami -n hami --set scheduler.kubeScheduler.imageTag=v1.29.11 --set kubeVersionOverride=true
Error: INSTALLATION FAILED: chart requires kubeVersion: >= 1.16.0 which is incompatible with Kubernetes v1.29.11-eks-56e63d8

@cataglyphis
Author

cataglyphis commented Dec 27, 2024

It looks like HAMi does not work well with AWS EKS clusters.

NAME                              READY   STATUS             RESTARTS      AGE
hami-admission-patch-4shvn        0/1     CrashLoopBackOff   4 (22s ago)   113s
hami-scheduler-69fbc99df7-6x87p   1/2     Error              4 (64s ago)   119s

  Type     Reason     Age                     From               Message
  ----     ------     ----                    ----               -------
  Normal   Scheduled  9m1s                    default-scheduler  Successfully assigned hami/hami-scheduler-69fbc99df7-6x87p to ip-10-170-85-172.us-west-2.compute.internal
  Normal   Pulling    9m1s                    kubelet            Pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.29.11"
  Normal   Pulled     8m50s                   kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.29.11" in 10.5s (10.5s including waiting)
  Normal   Pulled     8m50s                   kubelet            Container image "projecthami/hami:v2.4.1" already present on machine
  Normal   Created    8m50s                   kubelet            Created container vgpu-scheduler-extender
  Normal   Started    8m50s                   kubelet            Started container vgpu-scheduler-extender
  Normal   Created    8m6s (x4 over 8m50s)    kubelet            Created container kube-scheduler
  Normal   Started    8m6s (x4 over 8m50s)    kubelet            Started container kube-scheduler
  Normal   Pulled     7m12s (x4 over 8m49s)   kubelet            Container image "registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.29.11" already present on machine
  Warning  BackOff    3m59s (x23 over 8m48s)  kubelet            Back-off restarting failed container kube-scheduler in pod hami-scheduler-69fbc99df7-6x87p_hami(93fb46b5-13a4-4cd6-9df4-124c311a45fb)
Defaulted container "kube-scheduler" out of: kube-scheduler, vgpu-scheduler-extender
Usage:
  kube-scheduler [flags]

Misc flags:

      --config string            The path to the configuration file.
      --master string            The address of the Kubernetes API server (overrides any value in kubeconfig)
      --write-config-to string   If set, write the configuration values to this file and exit.
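
The usage text in the log excerpt suggests that kube-scheduler exited while parsing its flags or configuration, but the excerpt does not include the specific error line. The following is a minimal sketch for collecting it; the function name scheduler_debug and the KUBECTL override variable are hypothetical conveniences, while the kubectl flags themselves are standard.

```shell
# Hypothetical helper: gather the details needed to see why the
# kube-scheduler container keeps restarting.
# KUBECTL can be overridden (for example with "echo") for a dry run.
scheduler_debug() {
  ns="$1"
  pod="$2"
  kubectl_bin="${KUBECTL:-kubectl}"

  # Last lines of the previous (crashed) kube-scheduler container
  "$kubectl_bin" logs -n "$ns" "$pod" -c kube-scheduler --previous --tail=20

  # Command and arguments the container was started with
  "$kubectl_bin" get pod -n "$ns" "$pod" \
    -o jsonpath='{.spec.containers[?(@.name=="kube-scheduler")].command}'
  "$kubectl_bin" get pod -n "$ns" "$pod" \
    -o jsonpath='{.spec.containers[?(@.name=="kube-scheduler")].args}'
}

# Possible usage with the pod from this thread (not run here):
#   scheduler_debug hami hami-scheduler-69fbc99df7-6x87p
```

Comparing the flags the container was started with against the error line in the previous container's log should show which flag or configuration value kube-scheduler v1.29.11 rejected.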
