Duplicate of #199. We are currently rolling out this support for node auto provisioning. Self-hosted Karpenter already supports this change. Thank you for your patience!
Version
az aks create --name pix-nap-sg03 -l southeastasia \
  --resource-group pix-nap \
  --node-provisioning-mode Auto \
  --network-plugin-mode overlay \
  --network-dataplane cilium \
  --max-pods 110 --node-count 3 \
  --service-cidr 10.97.96.0/20 --dns-service-ip 10.97.96.10 \
  --network-plugin azure --enable-managed-identity \
  --node-osdisk-size 512 --node-vm-size Standard_D2s_v4 \
  --vnet-subnet-id /subscriptions/2a5e7fa2-d528-4c66-9e6a-3a3a0f290e98/resourcegroups/pix-nap/providers/Microsoft.Network/virtualNetworks/aks-nap-vnet-junfeng/subnets/aks \
  --ssh-key-value ~/.ssh/id_rsa.pub \
  --tier premium --k8s-support-plan AKSLongTermSupport --kubernetes-version 1.27 --yes
Expected Behavior
A NodeClaim should be created successfully for the pending GPU pod.
Actual Behavior
No NodeClaim is created, and the pod stays Pending.
$ kubectl get nodeclaims.karpenter.sh
No resources found
$ kubectl get events -A --field-selector source=karpenter
No resources found
$ kubectl get pod
NAME                  READY   STATUS    RESTARTS   AGE
pix-88d6475c8-kt8gb   0/1     Pending   0          7m54s
$ kubectl describe pod pix-88d6475c8-kt8gb
Name:             pix-88d6475c8-kt8gb
Namespace:        default
Priority:         0
Service Account:  default
Node:
Labels:           app=samples-tf-mnist-demo
                  pod-template-hash=88d6475c8
Annotations:
Status:           Pending
IP:
IPs:
Controlled By:    ReplicaSet/pix-88d6475c8
Containers:
  samples-tf-mnist-demo:
    Image:      mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
    Port:
    Host Port:
    Args:
      --max_steps
      50000
    Limits:
      nvidia.com/gpu:  1
    Requests:
      nvidia.com/gpu:  1
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-f7tj8 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  kube-api-access-f7tj8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                             nvidia.com/gpu:NoSchedule op=Exists
                             pool-type=t4:NoSchedule
                             sku=gpu:NoSchedule
Events:
  Type     Reason            Age                    From               Message
  Warning  FailedScheduling  2m37s (x4 over 8m7s)   default-scheduler  0/3 nodes are available: 3 Insufficient nvidia.com/gpu. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..
Steps to Reproduce the Problem
az aks create --name pix-nap-sg03 -l southeastasia \
  --resource-group pix-nap \
  --node-provisioning-mode Auto \
  --network-plugin-mode overlay \
  --network-dataplane cilium \
  --max-pods 110 --node-count 3 \
  --service-cidr 10.97.96.0/20 --dns-service-ip 10.97.96.10 \
  --network-plugin azure --enable-managed-identity \
  --node-osdisk-size 512 --node-vm-size Standard_D2s_v4 \
  --vnet-subnet-id /subscriptions/2a5e7fa2-d528-4c66-9e6a-3a3a0f290e98/resourcegroups/pix-nap/providers/Microsoft.Network/virtualNetworks/aks-nap-vnet-junfeng/subnets/aks \
  --ssh-key-value ~/.ssh/id_rsa.pub \
  --tier premium --k8s-support-plan AKSLongTermSupport --kubernetes-version 1.27 --yes
Resource Specs and Logs
There are no Karpenter logs to attach.
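Since no Karpenter logs are available, the cluster-side state may be the only evidence to attach. The following is a minimal sketch of a helper that gathers it in one pass. The function name `collect_nap_diagnostics` is hypothetical, and the resource names `nodepools.karpenter.sh` and `aksnodeclasses.karpenter.azure.com` are assumptions: they presume the cluster exposes the same Karpenter CRD family as the `nodeclaims.karpenter.sh` resource queried in this report.

```shell
# Sketch only: print the Karpenter-related state for a pod that stays Pending.
# Assumes kubectl is already pointed at the affected cluster.
collect_nap_diagnostics() {
  ns="${1:-default}"   # namespace of the pending pod (defaults to "default")

  echo "== NodePools =="
  # Assumption: the NodePool CRD is installed alongside NodeClaim.
  kubectl get nodepools.karpenter.sh -o wide || true

  echo "== AKSNodeClasses =="
  # Assumption: the Azure provider's node class CRD is present.
  kubectl get aksnodeclasses.karpenter.azure.com || true

  echo "== NodeClaims =="
  kubectl get nodeclaims.karpenter.sh -o wide || true

  echo "== Pending pods in ${ns} =="
  kubectl get pods -n "$ns" --field-selector=status.phase=Pending || true

  echo "== FailedScheduling events in ${ns} =="
  kubectl get events -n "$ns" --field-selector reason=FailedScheduling || true
}
```

Running `collect_nap_diagnostics default` after the pod has been Pending for a few minutes would show whether any NodePool exists at all and whether the scheduler is still reporting `Insufficient nvidia.com/gpu`. Each command is followed by `|| true` so that a missing CRD does not stop the remaining checks.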