Paralyzed "sveltos-agent-manager" by "kubecost" deployment #1048

Open
Josca opened this issue Feb 6, 2025 · 0 comments
Labels
bug Something isn't working

Josca commented Feb 6, 2025

Describe the bug
I managed to "paralyze" my AWS managed cluster while testing a "kubecost" deployment. Here are my setup and the result:

To Reproduce

  1. Cluster deployment (based on t3.medium and aws-standalone-cp-0-1-0):
apiVersion: k0rdent.mirantis.com/v1alpha1
kind: ClusterDeployment
metadata:
  name: aws-${CLUSTER_NAME_SUFFIX}
  namespace: ${NAMESPACE}
spec:
  template: aws-standalone-cp-0-1-0
  credential: aws-cluster-identity-cred
  config:
    clusterLabels:
      k0rdent: demo
    controlPlane:
      instanceType: t3.medium
    controlPlaneNumber: 1
    publicIP: false
    region: ${AWS_REGION}
    worker:
      instanceType: t3.medium
    workersNumber: 1
  2. Install the kubecost service template from the k0rdent catalog.

  3. Create a MultiClusterService:

apiVersion: k0rdent.mirantis.com/v1alpha1
kind: MultiClusterService
metadata:
  name: demo
spec:
  clusterSelector:
    matchLabels:
      k0rdent: demo
  serviceSpec:
    services:
      - template: ingress-nginx-4-11-3
        name: ingress-nginx
        namespace: ingress-nginx
        values: |
          ingress-nginx:
            controller:
              hostPort:
                enable: true
      - template: kubecost-2-5-3
        name: kubecost
        namespace: kubecost
        values: |
          cost-analyzer:
            kubecostToken: "<obtain-free-plan-kubecost-token>"
            ingress:
              enabled: true
              className: nginx
              hosts: ['*']
  • Get free plan kubecost token here.
  4. Check the managed cluster:
# export the managed cluster kubeconfig (jsonpath quoted so the shell leaves the braces alone):
kubectl get secret aws-${CLUSTER_NAME_SUFFIX}-kubeconfig -o=jsonpath='{.data.value}' | base64 -d > kubeconfig1
# list the managed cluster pods:
KUBECONFIG="./kubeconfig1" kubectl get pod -A

Output: a wall of Evicted and ContainerStatusUnknown pods across several namespaces, with management services impacted as well:

ingress-nginx    ingress-nginx-controller-cbcf8bf58-rtfh8    1/1     Running                  0             25m
ingress-nginx    ingress-nginx-controller-cbcf8bf58-tjp4v    0/1     ContainerStatusUnknown   1             27m
kube-system      aws-cloud-controller-manager-6g5q5          1/1     Running                  0             31m
kube-system      calico-kube-controllers-6cd7d8cc9f-4wmlp    1/1     Running                  0             32m
kube-system      calico-node-fpdr6                           1/1     Running                  0             32m
kube-system      calico-node-mp29q                           1/1     Running                  0             31m
kube-system      coredns-645c5d6f5b-2dkkl                    0/1     ContainerStatusUnknown   1             31m
kube-system      coredns-645c5d6f5b-bxf6t                    1/1     Running                  0             24m
kube-system      coredns-645c5d6f5b-nr5kr                    1/1     Running                  0             31m
kube-system      ebs-csi-controller-977d5cc56-22q85          5/5     Running                  0             32m
kube-system      ebs-csi-controller-977d5cc56-zmcvq          5/5     Running                  0             32m
kube-system      ebs-csi-node-bqlfd                          3/3     Running                  0             32m
kube-system      ebs-csi-node-flfnc                          3/3     Running                  0             31m
kube-system      kube-proxy-bwvx4                            1/1     Running                  0             32m
kube-system      kube-proxy-tjs9n                            1/1     Running                  0             31m
kube-system      metrics-server-78c4ccbc7f-wv6x9             1/1     Running                  0             32m
kubecost         kubecost-cost-analyzer-6dd7745d98-42r2q     0/4     Pending                  0             61s
kubecost         kubecost-cost-analyzer-6dd7745d98-ffcbh     0/4     Evicted                  0             12m
kubecost         kubecost-cost-analyzer-6dd7745d98-fpgn9     0/4     ContainerStatusUnknown   4 (18m ago)   25m
kubecost         kubecost-cost-analyzer-6dd7745d98-klggs     0/4     Evicted                  0             12m
kubecost         kubecost-cost-analyzer-6dd7745d98-ktzkk     0/4     Evicted                  0             12m
kubecost         kubecost-forecasting-65775fc4d5-67994       1/1     Running                  0             24m
kubecost         kubecost-forecasting-65775fc4d5-zg7q8       0/1     ContainerStatusUnknown   1             26m
kubecost         kubecost-grafana-84c4b4bb4c-2fm7k           0/2     Evicted                  0             6m31s
kubecost         kubecost-grafana-84c4b4bb4c-dx9g4           0/2     Completed                0             6m30s
kubecost         kubecost-grafana-84c4b4bb4c-k5mr5           0/2     Evicted                  0             6m31s
kubecost         kubecost-grafana-84c4b4bb4c-kmbx7           0/2     Evicted                  0             17m
kubecost         kubecost-grafana-84c4b4bb4c-knvvx           0/2     ContainerStatusUnknown   2             26m
kubecost         kubecost-grafana-84c4b4bb4c-ktlvl           0/2     Evicted                  0             6m31s
kubecost         kubecost-grafana-84c4b4bb4c-r4wmm           0/2     Evicted                  0             17m
kubecost         kubecost-grafana-84c4b4bb4c-rlk65           0/2     ContainerStatusUnknown   2             17m
kubecost         kubecost-grafana-84c4b4bb4c-smv9r           0/2     Evicted                  0             17m
kubecost         kubecost-grafana-84c4b4bb4c-vzfgm           0/2     Evicted                  0             6m31s
...
kubecost         kubecost-grafana-84c4b4bb4c-zvgjc           0/2     Pending                  0             63s
kubecost         kubecost-prometheus-server-db457d69-bj2sh   1/1     Running                  0             25m
kubecost         kubecost-prometheus-server-db457d69-x9kpn   0/1     ContainerStatusUnknown   1             26m
projectsveltos   sveltos-agent-manager-7977cd77f-2l4n9       0/1     Evicted                  0             25m
projectsveltos   sveltos-agent-manager-7977cd77f-2r87z       0/1     Evicted                  0             25m
projectsveltos   sveltos-agent-manager-7977cd77f-2tmnx       0/1     ContainerStatusUnknown   1             31m
projectsveltos   sveltos-agent-manager-7977cd77f-457n4       0/1     Evicted                  0             25m
projectsveltos   sveltos-agent-manager-7977cd77f-4m6ls       0/1     Evicted                  0             25m
...
projectsveltos   sveltos-agent-manager-7977cd77f-xsg2j       1/1     Running                  0             25m
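
A listing like the one above is easier to read once it is tallied per namespace and status. The following is a minimal sketch, assuming the default `kubectl get pod -A` column order (NAMESPACE, NAME, READY, STATUS, ...); `summarize_pods` is a hypothetical helper name, not part of kubectl.

```shell
# Tally pods per namespace and status.
# Input: `kubectl get pod -A --no-headers` output on stdin.
# Assumes STATUS is the 4th column; shorter lines (e.g. "...") are skipped.
summarize_pods() {
  awk 'NF >= 4 { count[$1 " " $4]++ }
       END { for (k in count) print k, count[k] }' | LC_ALL=C sort
}

# Usage against the managed cluster (kubeconfig1 as exported in the step above):
# KUBECONFIG="./kubeconfig1" kubectl get pod -A --no-headers | summarize_pods
```

Each output line has the form `<namespace> <status> <count>`, which makes it quick to see how many Evicted pods sit outside the kubecost namespace.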

Expected behavior
Only the kubecost deployment should show errors due to the lack of resources. The Sveltos agent and the kube-system pods should not be impacted.
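
To check whether the evictions really come from node resource pressure, the node conditions and the eviction events can be inspected. This is only a sketch of one possible check, not something I ran for this report; `node_pressure` and `evicted_events` are hypothetical helper names, and the kubeconfig path is passed as the first argument.

```shell
# Print every node with its conditions (MemoryPressure, DiskPressure,
# PIDPressure, Ready, ...) as type=status pairs.
node_pressure() {
  kubectl --kubeconfig "$1" get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[*]}{.type}={.status}{" "}{end}{"\n"}{end}'
}

# List eviction events in all namespaces; the kubelet records them
# with reason "Evicted", and the message names the scarce resource.
evicted_events() {
  kubectl --kubeconfig "$1" get events -A --field-selector reason=Evicted
}

# Usage:
# node_pressure ./kubeconfig1
# evicted_events ./kubeconfig1
```

A node reporting `MemoryPressure=True` or `DiskPressure=True` together with Evicted events in `projectsveltos` and `kube-system` would be consistent with the kubelet evicting pods node-wide rather than only the kubecost ones.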

Note: Tested with k0rdent 0.1.0 release
