Sak Karpenter provisioner module

Terraform module that creates Karpenter Provisioner and AWSNodeTemplate resources.

External Documentation

Karpenter Provisioner
Karpenter Node Templates

Usage

ArgoCd enabled

Creates an ArgoCD Application and writes the manifest files to the ArgoCD path using the local_file Terraform resource.

module "provisioner" {
  source         = "github.com/provectus/sak-karpenter-provisioner"
  cluster_name   = var.cluster_name
  argocd_enabled = true
  argocd         = module.argocd.state
  provisioners = {
    default = {
      requirements = [
        {
          key      = "karpenter.k8s.aws/instance-family"
          operator = "In"
          values   = ["m5"]
        },
        {
          key      = "karpenter.sh/capacity-type"
          operator = "In"
          values   = ["spot", "on-demand"]
        },
        {
          key      = "karpenter.k8s.aws/instance-size"
          operator = "In"
          values   = ["nano", "micro", "small", "large", "medium"]
        },
      ]
      labels = {
        test = "true"
      }
      container_runtime     = "containerd"
      consolidation_enabled = true
    },
  }
  depends_on = [
    helm_release.karpenter
  ]
}

ArgoCd disabled

Creates the manifests and applies them using the kubectl_manifest Terraform resource.

module "provisioner" {
  source         = "github.com/provectus/sak-karpenter-provisioner"
  cluster_name   = var.cluster_name
  argocd_enabled = false
  provisioners = {
    default = {
      requirements = [
        {
          key      = "karpenter.k8s.aws/instance-family"
          operator = "In"
          values   = ["m5"]
        },
        {
          key      = "karpenter.sh/capacity-type"
          operator = "In"
          values   = ["spot", "on-demand"]
        },
        {
          key      = "karpenter.k8s.aws/instance-size"
          operator = "In"
          values   = ["nano", "micro", "small", "large", "medium"]
        },
      ]
      labels = {
        test = "true"
      }
      container_runtime     = "containerd"
      consolidation_enabled = true
    },
  }
  depends_on = [
    helm_release.karpenter
  ]
}

Basic examples with different NodeGroup types

The following examples will help you set the configuration needed for your environment:

General purpose instances

General purpose instances provide a balance of compute, memory, and networking resources, and can be used for a wide range of workloads. You can use any instance family from this list, provided it is also among the instance types supported by Karpenter.

Example

provisioners = {
  general = {
    requirements = [
      {
        key      = "karpenter.k8s.aws/instance-family"
        operator = "In"
        values   = ["m5", "m4", "t3"]
      },
    ]
    labels = {
      workflow-type = "general"
    }
    container_runtime = "containerd"
    consolidation_enabled = true
  },
}

Compute optimized instances

Compute optimized instances are ideal for compute-bound applications that benefit from high-performance processors. You can use any instance family from this list, provided it is also among the instance types supported by Karpenter.

Example

provisioners = {
  cpu-optimized = {
    requirements = [
      {
        key      = "karpenter.k8s.aws/instance-family"
        operator = "In"
        values   = ["c5", "c6i"]
      },
    ]
    labels = {
      workflow-type = "cpu-optimized"
    }
    container_runtime = "containerd"
    consolidation_enabled = true
  },
}

Memory optimized instances

Memory optimized instances are designed to deliver fast performance for workloads that process large data sets in memory. You can use any instance family from this list, provided it is also among the instance types supported by Karpenter.

Example

provisioners = {
  memory-optimized = {
    requirements = [
      {
        key      = "karpenter.k8s.aws/instance-family"
        operator = "In"
        values   = ["r5", "r6a"]
      },
    ]
    labels = {
      workflow-type = "memory-optimized"
    }
    container_runtime = "containerd"
    consolidation_enabled = true
  },
}

Accelerated computing instances (GPU optimized)

If you require high processing capability, you'll benefit from using accelerated computing instances, which provide access to hardware-based compute accelerators such as Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs), or AWS Inferentia.

GPU optimized instance types

An instance with an attached NVIDIA GPU, such as a P3 or G4dn instance, must have the appropriate NVIDIA driver installed. Depending on the instance type, you can either download a public NVIDIA driver, download a driver from Amazon S3 that is available only to AWS customers, or use an AMI with the driver pre-installed.

Available drivers by instance type
AMIs with the NVIDIA drivers installed
Public NVIDIA drivers
GRID drivers (G5, G4dn, and G3 instances)
NVIDIA gaming drivers (G5 and G4dn instances)

An instance with an attached AMD GPU, such as a G4ad instance, must have the appropriate AMD driver installed. Depending on your requirements, you can either use an AMI with the driver preinstalled or download a driver from Amazon S3.

AMIs with the AMD driver installed
AMD driver download and install

You also need to deploy the appropriate GPU device plugin DaemonSet for those nodes. Without the DaemonSet running, Karpenter will not see those nodes as initialized. Refer to the general Kubernetes GPU docs and the following device-specific docs:

nvidia.com/gpu: NVIDIA device plugin for Kubernetes
amd.com/gpu: AMD GPU device plugin for Kubernetes
aws.amazon.com/neuron: Kubernetes environment setup for Neuron
habana.ai/gaudi: Habana device plugin for Kubernetes

Example

provisioners = {
  gpu-optimized = {
    requirements = [
      {
        key      = "karpenter.k8s.aws/instance-family"
        operator = "In"
        values   = ["g5"]
      },
    ]
    labels = {
      workflow-type = "gpu-optimized"
    }
    taints = [
      {
        key    = "nvidia.com/gpu"
        value  = "true"
        effect = "NoSchedule"
      }
    ]
    resources_limits = {
      cpu              = 1000
      memory           = "1000Gi"
      "nvidia.com/gpu" = 2
    }
    container_runtime = "containerd"
    consolidation_enabled = true
  },
}

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0 |
| local | >= 2.2.3 |
| gavinbunney/kubectl | >= 1.14 |

Providers

| Name | Version |
|------|---------|
| local | >= 2.10 |
| kubectl | >= 1.14 |

Modules

| Name | Source | Version |
|------|--------|---------|
| provisioners | ./modules/provisioners | n/a |

Resources

| Name | Type |
|------|------|
| local_file.provisioner_app | resource |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| cluster_name | Name of the Amazon EKS cluster | `string` | `null` | yes |
| argocd_enabled | Whether to deploy the provisioners through ArgoCD instead of applying them with kubectl_manifest | `bool` | `true` | no |
| application_name | Name of the ArgoCD Application resource | `string` | `"provisioners"` | yes (if argocd_enabled = true) |
| argocd | A set of values for enabling deployment through ArgoCD | `map(string)` | `null` | yes (if argocd_enabled = true) |
| provisioners | Map of provisioner definitions to create | `any` | `{}` | yes |

Provisioner Inputs

requirements
- Karpenter supports AWS-specific labels and Kubernetes well-known labels for more advanced scheduling. These well-known labels may be specified at the provisioner level or in a workload definition (e.g., nodeSelector on a pod.spec). Nodes are chosen using both the provisioner's and the pod's requirements. If there is no overlap, nodes will not be launched. In other words, a pod's requirements must be within the provisioner's requirements. If a requirement is not defined for a well-known label, any value available to the cloud provider may be chosen.
```
requirements = [
  {
    key      = "karpenter.k8s.aws/instance-family"
    operator = "In"
    values   = ["m5"]
  },
  {
    key      = "karpenter.sh/capacity-type"
    operator = "In"
    values   = ["spot", "on-demand"]
  }
]
```
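As a further sketch, requirements can also target Kubernetes well-known labels such as the availability zone or CPU architecture. The zone names below are placeholders rather than values taken from this module; substitute the zones your subnets actually cover.

```hcl
requirements = [
  {
    # Well-known Kubernetes label: restrict nodes to specific availability zones.
    # The zone names are illustrative placeholders.
    key      = "topology.kubernetes.io/zone"
    operator = "In"
    values   = ["us-west-2a", "us-west-2b"]
  },
  {
    # Well-known Kubernetes label: only launch x86_64 nodes.
    key      = "kubernetes.io/arch"
    operator = "In"
    values   = ["amd64"]
  },
]
```

A pod whose nodeSelector asks for a zone outside this list does not overlap with the provisioner, so no node would be launched for it.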
taints
- Provisioned nodes will have these taints. Taints may prevent pods from scheduling if they are not tolerated by the pod.
```
taints = [
  {
    key = "example.com/special-taint"
    effect = "NoSchedule"
  },
]
```
startup_taints
- StartupTaints are taints that are applied to nodes upon startup which are expected to be removed automatically within a short period of time, typically by a DaemonSet that tolerates the taint. These are commonly used by daemonsets to allow initialization and enforce startup ordering. StartupTaints are ignored for provisioning purposes in that pods are not required to tolerate a StartupTaint in order to have nodes provisioned for them.
```
startup_taints = [
  {
    key = "example.com/special-taint"
    effect = "NoSchedule"
  },
]
```
labels
- Labels are arbitrary key-values that are applied to all nodes.
```
labels = {
  billing-team = "my-team"
}
```
annotations
- Annotations are arbitrary key-values that are applied to all nodes.
```
annotations = {
  "example.com/owner" = "my-team"
}
```

resources_limits

Constrains the maximum amount of resources that the provisioner will manage.

resources_limits = {
  cpu = 1000 
  memory = "1000Gi"
  "nvidia.com/gpu" = 2
}

consolidation_enabled
- You can configure Karpenter to deprovision instances through your Provisioner in multiple ways: ttl_seconds_after_empty, ttl_seconds_until_expired, or consolidation_enabled.
```
consolidation_enabled = true
```
ttl_seconds_after_empty
- If omitted, the feature is disabled and nodes will never scale down due to low utilization.
```
ttl_seconds_after_empty = 30
```
ttl_seconds_until_expired
- If omitted, the feature is disabled and nodes will never expire. If set to less time than it requires for a node to become ready, the node may expire before any pods successfully start.
```
ttl_seconds_until_expired = 2592000 # 30 days = 60 * 60 * 24 * 30 seconds
```
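The two TTL settings can be combined in one provisioner as an alternative to consolidation. In upstream Karpenter, ttlSecondsAfterEmpty and consolidation are mutually exclusive, so this sketch assumes the module passes the values through unchanged and leaves consolidation_enabled off. The provisioner name `batch` is hypothetical, and other keys are omitted for brevity.

```hcl
provisioners = {
  # "batch" is a hypothetical provisioner name used for illustration.
  batch = {
    requirements = [
      {
        key      = "karpenter.sh/capacity-type"
        operator = "In"
        values   = ["spot"]
      },
    ]
    # Consolidation is left disabled because upstream Karpenter does not
    # allow it together with ttlSecondsAfterEmpty.
    consolidation_enabled = false
    # Delete a node 30 seconds after it is detected to be empty
    # (no pods scheduled to it, excluding daemonsets).
    ttl_seconds_after_empty = 30
    # Expire every node after 7 days (60 * 60 * 24 * 7 seconds).
    ttl_seconds_until_expired = 604800
  },
}
```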
weight
- Priority given to the provisioner when the scheduler considers which provisioner to select. Higher weights indicate higher priority when comparing provisioners. Specifying no weight is equivalent to specifying a weight of 0.
```
weight = 10
```
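One way to use weights, sketched under the assumption that both provisioners can match the same pods, is to give a spot provisioner a higher weight than an on-demand one so that spot capacity is considered first. The provisioner names are hypothetical and other keys are omitted for brevity.

```hcl
provisioners = {
  # Hypothetical name: considered first because of the higher weight.
  spot-preferred = {
    requirements = [
      {
        key      = "karpenter.sh/capacity-type"
        operator = "In"
        values   = ["spot"]
      },
    ]
    weight = 50
  },
  # Hypothetical name: no weight is set, which is equivalent to weight = 0.
  on-demand-fallback = {
    requirements = [
      {
        key      = "karpenter.sh/capacity-type"
        operator = "In"
        values   = ["on-demand"]
      },
    ]
  },
}
```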
container_runtime
- You can specify the container runtime to be either dockerd or containerd. containerd is the only valid container runtime when using the Bottlerocket AMI family, or when using the AL2 AMI family with Kubernetes version 1.24+.
```
container_runtime = "containerd"
```
cluster_dns
- Overrides the list of DNS server IP addresses that the kubelet hands to pods, equivalent to the kubelet --cluster-dns setting.
```
cluster_dns = ["10.0.1.100"]
```

kubelet_system_reserved

Override the --system-reserved configuration

kubelet_system_reserved = {
    cpu = "100m"
    memory = "100Mi"
    ephemeral-storage = "1Gi"
}

kubelet_kube_reserved

Override the --kube-reserved configuration

kubelet_kube_reserved = {
    cpu = "200m"
    memory = "100Mi"
    ephemeral-storage = "3Gi"
}

kubelet_eviction_hard

A hard eviction threshold has no grace period. When a hard eviction threshold is met, the kubelet kills pods immediately without graceful termination to reclaim the starved resource. See the Kubernetes list of supported eviction signals.

kubelet_eviction_hard = {
  "memory.available" = "500Mi"
  "nodefs.available" = "10%"
  "nodefs.inodesFree" = "10%"
  "imagefs.available" = "5%"
  "imagefs.inodesFree" = "5%"
  "pid.available" = "7%"
}

kubelet_eviction_soft
- A soft eviction threshold pairs an eviction threshold with a required administrator-specified grace period. The kubelet does not evict pods until the grace period is exceeded. The kubelet returns an error on startup if there is no specified grace period. See the Kubernetes list of supported eviction signals.
```
kubelet_eviction_soft = {
  "memory.available" = "500Mi"
  "nodefs.available" = "10%"
  "nodefs.inodesFree" = "10%"
  "imagefs.available" = "5%"
  "imagefs.inodesFree" = "5%"
  "pid.available" = "7%"
}
```

kubelet_eviction_soft_grace_period

A set of eviction grace periods that define how long a soft eviction threshold must hold before triggering a Pod eviction.

kubelet_eviction_soft_grace_period = {
  "memory.available" = "1m"
  "nodefs.available" = "1m30s"
  "nodefs.inodesFree" = "2m"
  "imagefs.available" = "1m30s"
  "imagefs.inodesFree" = "2m"
  "pid.available" = "2m"
}
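
Because the kubelet returns an error on startup when a soft threshold has no grace period, the two maps are best kept in step. A minimal sketch, with illustrative values, in which every signal in kubelet_eviction_soft has a matching entry in kubelet_eviction_soft_grace_period:

```hcl
# Soft thresholds: eviction starts only after the matching grace period.
kubelet_eviction_soft = {
  "memory.available" = "500Mi"
  "nodefs.available" = "10%"
}

# One grace period per signal listed above; the keys must match.
kubelet_eviction_soft_grace_period = {
  "memory.available" = "1m"
  "nodefs.available" = "1m30s"
}
```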

kubelet_eviction_max_pod_grace_period
- The administrator-specified maximum pod termination grace period to use during soft eviction.
```
kubelet_eviction_max_pod_grace_period = "3m"
```
kubelet_max_pods
- This value will be used during Karpenter pod scheduling and passed through to --max-pods on kubelet startup.
```
kubelet_max_pods = "20"
```
kubelet_pods_per_core
- This value will also be passed through to the --pods-per-core value on kubelet startup to configure the number of allocatable pods the kubelet can assign to the node instance.
```
kubelet_pods_per_core = "2"
```
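The two density settings interact. Upstream Karpenter multiplies the pods-per-core value by the vCPU count of the instance type, and the kubelet does not let the result exceed max-pods. The sketch below works through the arithmetic in comments; the 4 vCPU and 16 vCPU instance sizes are only examples, and it assumes the module passes both values through unchanged.

```hcl
kubelet_pods_per_core = "2"
kubelet_max_pods      = "20"

# 4 vCPU instance:  2 * 4  = 8 pods, below the kubelet_max_pods cap of 20.
# 16 vCPU instance: 2 * 16 = 32 pods, capped at 20 by kubelet_max_pods.
```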
ami_family
- Resolves a default ami and userdata. Currently, Karpenter supports amiFamily values AL2, Bottlerocket, Ubuntu and Custom. GPUs are only supported with AL2 and Bottlerocket.
```
ami_family = "AL2"
```

block_device_mappings

Used to control the Elastic Block Store (EBS) volumes that Karpenter attaches to provisioned nodes.

block_device_mappings = [
  {
    deviceName = "/dev/xvda"
    ebs = {
      volumeSize          = "100Gi"
      volumeType          = "gp3"
      iops                = 10000
      encrypted           = true
      kmsKeyID            = "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
      deleteOnTermination = true
      throughput          = 125
      snapshotID          = "snap-0123456789"
    }
  },
]

metadata_options

Control the exposure of Instance Metadata Service on EC2 Instances launched by this provisioner using a generated launch template.

metadata_options = {
  httpEndpoint = "enabled"
  httpProtocolIPv6 = "disabled"
  httpPutResponseHopLimit = 2
  httpTokens = "required"
}

instace_profile
- An InstanceProfile is a way to pass a single IAM role to the EC2 instances launched by the provisioner. A default profile is configured in global settings, but may be overridden here. The AWSNodeTemplate will not create an InstanceProfile automatically. The InstanceProfile must refer to a Role that has permission to connect to the cluster. Overrides the node's identity from global settings.
```
instace_profile = "MyInstanceProfile"
```
detailed_monitoring_enabled
- Enabling detailed monitoring on the node template controls the EC2 detailed monitoring feature. If you enable this option, the Amazon EC2 console displays monitoring graphs with a 1-minute period for the instances that Karpenter launches.
```
detailed_monitoring_enabled = true
```
tags
- Karpenter adds tags to all resources it creates, including EC2 instances, EBS volumes, and launch templates. The tag "karpenter.sh/discovery" = "YOUR_CLUSTER_NAME" is attached by default so that Karpenter is able to launch instances.
```
tags = {
  "InternalAccountingTag" = "1234"
  "dev.corp.net/app" = "Calculator"
  "dev.corp.net/team" = "MyTeam"
}
```
subnet_selector
- Discovers subnets using AWS tags. Subnets may be specified by any AWS tag, including Name. Selecting tag values using wildcards (*) is supported. Subnet IDs may be specified by using the key aws-ids and then passing the IDs as a comma-separated string value. When launching nodes, a subnet is automatically chosen that matches the desired zone. If multiple subnets exist for a zone, the one with the most available IP addresses will be used.
```
subnet_selector = {
  "Name" = "*Public*"
  "MyTag" = "" # matches all resources with the tag
  "aws-ids" = "subnet-09fa4a0a8f233a921,subnet-0471ca205b8a129ae"
}
```

sg_selector

Security groups may be specified by any AWS tag, including “Name”. Selecting tags using wildcards (*) is supported.

sg_selector = {
  "Name" = "*Public*"
  "MyTag" = "" # matches all resources with the tag
  "aws-ids" = "sg-063d7acfb4b06c82c,sg-06e0cf9c198874591"
}
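
A common Karpenter convention is to tag subnets and security groups with karpenter.sh/discovery set to the cluster name and to select on that tag. The sketch below assumes your subnets and security groups already carry this tag and that var.cluster_name holds the cluster name, as in the usage examples above.

```hcl
# Select subnets tagged for this cluster (assumes the tag already exists).
subnet_selector = {
  "karpenter.sh/discovery" = var.cluster_name
}

# Select security groups by the same discovery tag.
sg_selector = {
  "karpenter.sh/discovery" = var.cluster_name
}
```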

ami_selector
- ami_selector is used to configure custom AMIs for Karpenter to use, where the AMIs are discovered through AWS tags, similar to subnetSelector. This field is optional, and Karpenter will use the latest EKS-optimized AMIs if an ami_selector is not specified.
```
ami_selector = {
  "Name" = "*Public*"
  "MyTag" = "" # matches all resources with the tag
  "aws-ids" = "ami-123,ami-456"
}
```

user_data

You can control the UserData that is applied to your worker nodes via this field.

user_data = <<-EOT
  #!/bin/bash
  mkdir -p ~ec2-user/.ssh/
  touch ~ec2-user/.ssh/authorized_keys
  cat >> ~ec2-user/.ssh/authorized_keys <<EOF
  {{ insertFile "../my-authorized_keys" | indent 4  }}
  EOF
  chmod -R go-w ~ec2-user/.ssh/authorized_keys
  chown -R ec2-user ~ec2-user/.ssh
  EOT
