Skip to content

Latest commit

 

History

History
89 lines (74 loc) · 3.95 KB

File metadata and controls

89 lines (74 loc) · 3.95 KB

Node Problem Detector Add-on

node-problem-detector makes common Kubernetes node problems visible to the cluster management stack. It is a daemon which runs on each node, detects problems, and reports them to the apiserver.

For more details on the Kubernetes node-problem-detector, see the GitHub project.

The following is a sample API definition with the node-problem-detector addon enabled.

{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes",
      "kubernetesConfig": {
        "addons": [
          {
            "name": "node-problem-detector",
            "enabled": true
          }
        ]
      }
    },
    "masterProfile": {
      "count": 1,
      "dnsPrefix": "",
      "vmSize": "Standard_DS2_v2"
    },
    "agentPoolProfiles": [
      {
        "name": "agentpool",
        "count": 3,
        "vmSize": "Standard_DS2_v2",
        "availabilityProfile": "AvailabilitySet"
      }
    ],
    "linuxProfile": {
      "adminUsername": "azureuser",
      "ssh": {
        "publicKeys": [
          {
            "keyData": ""
          }
        ]
      }
    },
    "servicePrincipalProfile": {
      "clientId": "",
      "secret": ""
    }
  }
}

You can validate that the add-on is running as expected with the following command. You should see a node-problem-detector pod running for each agent node in the cluster.

kubectl get pods -n kube-system

To test node-problem-detector in a running cluster, you can inject messages into the logs it is watching. See Try It Out in the node-problem-detector documentation for details.

Configuration

Name Required Description Default Value
customPluginMonitor no Comma-separated list of custom plugin monitor config files /config/kernel-monitor-counter.json,/config/systemd-monitor-counter.json
systemLogMonitor no Comma-separated list of system log monitor config files /config/kernel-monitor.json,/config/docker-monitor.json,/config/systemd-monitor.json
systemStatsMonitor no Comma-separated list of system stats monitor config files /config/system-stats-monitor.json
versionLabel no Version label used as DaemonSet selector v0.8.4

Node Problem Detector

Name Required Description Default Value
name no container name "node-problem-detector"
image no image "k8s.gcr.io/node-problem-detector:v0.8.1"
cpuRequests no cpu requests for the container "20m"
memoryRequests no memory requests for the container "20Mi"
cpuLimits no cpu limits for the container "200m"
memoryLimits no memory limits for the container "100Mi"

Supported Orchestrators

Kubernetes

Contact