My setup is the following:
```
$ kubectl logs -n anteon alaz-daemonset-sskqx
{"level":"info","tag":"v0.11.3","time":1723187890,"message":"alaz tag"}
{"level":"info","time":1723187890,"message":"k8sCollector initializing..."}
{"level":"info","time":1723187890,"message":"Connected successfully to CRI using endpoint unix:///proc/1/root/run/containerd/containerd.sock"}
panic: runtime error: integer divide by zero

goroutine 47 [running]:
github.com/ddosify/alaz/aggregator.(*ClusterInfo).handleSocketMapCreation(0xc0002dc5b0)
	/app/aggregator/cluster.go:89 +0x33d
created by github.com/ddosify/alaz/aggregator.newClusterInfo in goroutine 1
	/app/aggregator/cluster.go:59 +0x1a9
```
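For context on the panic message itself: in Go, an integer division or modulo with a zero denominator is a runtime panic rather than a recoverable error value. A minimal sketch (the `shards` and `pid` names are invented for illustration; this is not alaz's actual code) that produces the same `integer divide by zero` panic:

```go
package main

// Minimal sketch of the panic class seen in the log above: in Go, a
// division or modulo with a zero denominator panics at runtime with
// "integer divide by zero" instead of returning an error.
func main() {
	shards := []int{} // imagine a socket-map shard slice that is still empty
	pid := 1234

	// len(shards) is 0 here, so the modulo below panics with:
	//   panic: runtime error: integer divide by zero
	_ = shards[pid%len(shards)]
}
```

So whatever `cluster.go:89` divides or takes a modulo by is evidently still 0 at the moment `handleSocketMapCreation` runs.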
```
$ kubectl describe pod -n anteon alaz-daemonset-sskqx
Name:             alaz-daemonset-sskqx
Namespace:        anteon
Priority:         0
Service Account:  alaz-serviceaccount
Node:             thinkpad/192.168.1.38
Start Time:       Fri, 09 Aug 2024 10:01:44 +0300
Labels:           app=alaz
                  controller-revision-hash=6f9d87bfc4
                  pod-template-generation=1
Annotations:      cni.projectcalico.org/containerID: 003a6554ea84ff581daee5b353ccf9b6619a8febdb6302ce34a566764f0e45f3
                  cni.projectcalico.org/podIP: 10.1.19.183/32
                  cni.projectcalico.org/podIPs: 10.1.19.183/32
Status:           Running
IP:               10.1.19.183
IPs:
  IP:  10.1.19.183
Controlled By:  DaemonSet/alaz-daemonset
Containers:
  alaz-pod:
    Container ID:  containerd://c6c904add2264b0016798d11550f2ff05e683fe713c681c3f3a415e31de9f07c
    Image:         ddosify/alaz:v0.11.3
    Image ID:      docker.io/ddosify/alaz@sha256:08dbbb8ba337ce340a8ba8800e710ff5a2df9612ea258cdc472867ea0bb97224
    Port:          8181/TCP
    Host Port:     0/TCP
    Args:
      --no-collector.wifi
      --no-collector.hwmon
      --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
      --collector.netclass.ignored-devices=^(veth.*)$
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 09 Aug 2024 10:18:10 +0300
      Finished:     Fri, 09 Aug 2024 10:18:11 +0300
    Ready:          False
    Restart Count:  8
    Limits:
      memory:  1Gi
    Requests:
      cpu:     1
      memory:  400Mi
    Environment:
      TRACING_ENABLED:             true
      METRICS_ENABLED:             true
      LOGS_ENABLED:                false
      BACKEND_HOST:                http://bore.pub:39548/api-alaz
      LOG_LEVEL:                   1
      MONITORING_ID:               7c6a484a-ec47-46a6-946d-4071ff6cf883
      SEND_ALIVE_TCP_CONNECTIONS:  false
      NODE_NAME:                   (v1:spec.nodeName)
    Mounts:
      /sys/kernel/debug from debugfs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-df6xh (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  debugfs:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/kernel/debug
    HostPathType:
  kube-api-access-df6xh:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Warning  BackOff  3m54s (x68 over 18m)  kubelet  Back-off restarting failed container alaz-pod in pod alaz-daemonset-sskqx_anteon(a3d74951-574e-4149-8db3-9749a627f5fd)
```
alaz.yaml:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alaz-serviceaccount
  namespace: anteon
---
# For alaz to keep track of changes in cluster
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: alaz-role
  namespace: anteon
rules:
  - apiGroups:
      - "*"
    resources:
      - pods
      - services
      - endpoints
      - replicasets
      - deployments
      - daemonsets
      - statefulsets
    verbs:
      - "get"
      - "list"
      - "watch"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alaz-role-binding
  namespace: anteon
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alaz-role
subjects:
  - kind: ServiceAccount
    name: alaz-serviceaccount
    namespace: anteon
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: alaz-daemonset
  namespace: anteon
spec:
  selector:
    matchLabels:
      app: alaz
  template:
    metadata:
      labels:
        app: alaz
    spec:
      hostPID: true
      containers:
        - env:
            - name: TRACING_ENABLED
              value: "true"
            - name: METRICS_ENABLED
              value: "true"
            - name: LOGS_ENABLED
              value: "false"
            - name: BACKEND_HOST
              value: http://bore.pub:39548/api-alaz
            - name: LOG_LEVEL
              value: "1"
            # - name: EXCLUDE_NAMESPACES
            #   value: "^anteon.*"
            - name: MONITORING_ID
              value: 7c6a484a-ec47-46a6-946d-4071ff6cf883
            - name: SEND_ALIVE_TCP_CONNECTIONS  # Send undetected protocol connections (unknown connections)
              value: "false"
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          args:
            - --no-collector.wifi
            - --no-collector.hwmon
            - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
            - --collector.netclass.ignored-devices=^(veth.*)$
          image: ddosify/alaz:v0.11.3
          imagePullPolicy: IfNotPresent
          name: alaz-pod
          ports:
            - containerPort: 8181
              protocol: TCP
          resources:
            limits:
              memory: 1Gi
            requests:
              cpu: "1"
              memory: 400Mi
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          # needed for linking ebpf trace programs
          volumeMounts:
            - mountPath: /sys/kernel/debug
              name: debugfs
              readOnly: false
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: alaz-serviceaccount
      serviceAccountName: alaz-serviceaccount
      terminationGracePeriodSeconds: 30
      # needed for linking ebpf trace programs
      volumes:
        - name: debugfs
          hostPath:
            path: /sys/kernel/debug
```
The only thing I did differently compared to the documentation was using bore.pub instead of ngrok, which I don't think should be a problem.
I'm running Arch Linux with kernel 6.10.1-arch1-1.
I'm getting the same issue when I deploy via the Helm chart as well:
{"level":"info","tag":"v0.12.0","time":1723477886,"message":"alaz tag"} {"level":"info","time":1723477886,"message":"k8sCollector initializing..."} {"level":"info","time":1723477886,"message":"Connected successfully to CRI using endpoint unix:///proc/1/root/run/containerd/containerd.sock"} {"level":"error","time":1723477887,"message":"error creating gpu collector: failed to load nvidia driver: <nil>"} {"level":"error","time":1723477887,"message":"error exporting gpu metrics: failed to load nvidia driver: <nil>"} panic: runtime error: integer divide by zero goroutine 85 [running]: github.com/ddosify/alaz/aggregator.(*ClusterInfo).handleSocketMapCreation(0xc0002fcd90) /app/aggregator/cluster.go:89 +0x33d created by github.com/ddosify/alaz/aggregator.newClusterInfo in goroutine 1 /app/aggregator/cluster.go:59 +0x1a9
I guess there is a race condition on this line: alaz/aggregator/cluster.go, line 89 (commit 2f383f1).
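If that guess is right, the denominator at that line is being read while it is still at its zero value. Below is a minimal sketch of one possible guard, assuming a pattern like `pid % shardCount`; all names here are hypothetical and not alaz's real fields:

```go
package main

import "fmt"

// Hypothetical guard for the suspected race. The assumption (invented
// for illustration, not taken from alaz's source) is that cluster.go:89
// computes something like `pid % shardCount`, and that shardCount can
// still be 0 when handleSocketMapCreation's goroutine starts running.
// Checking the denominator first turns the panic into an error.
type clusterInfo struct {
	shardCount int // set during initialization; 0 until then
}

func (c *clusterInfo) shardFor(pid int) (int, error) {
	if c.shardCount == 0 {
		return 0, fmt.Errorf("shard count not initialized yet")
	}
	return pid % c.shardCount, nil
}

func main() {
	c := &clusterInfo{} // shardCount is still 0, as in the crash scenario
	if _, err := c.shardFor(1234); err != nil {
		fmt.Println("skipping socket map creation:", err) // no panic
	}
}
```

Synchronizing the initialization in `newClusterInfo` so the denominator is set before the goroutine starts would be the other obvious fix direction.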