EKS is the chosen implementation of Kubernetes on eLife's AWS account.
- Install `kubectl`
- Do not install `aws-iam-authenticator`: it's no longer necessary in recent versions of the `aws` CLI
Make sure your AWS user is either:
- added to the `KubernetesAdministrators` group
- allowed to execute `eks:DescribeCluster`, and `sts:AssumeRole` on `arn:aws:iam::512686554592:role/kubernetes-aws--test--AmazonEKSUserRole`, where `kubernetes-aws--test` is the name of the stack containing the cluster.
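To check which of the two options applies to you, a couple of read-only calls are usually enough; `my-user` below is a placeholder for your own IAM user name:

$ aws sts get-caller-identity                         # confirms which IAM user the CLI is acting as
$ aws iam list-groups-for-user --user-name my-user    # look for KubernetesAdministrators in the output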
Execute:
$ aws eks update-kubeconfig --name=kubernetes-aws--test --role-arn arn:aws:iam::512686554592:role/kubernetes-aws--test--AmazonEKSUserRole
This will write a new configuration in `~/.kube/config`.
You can now execute:
$ kubectl version
$ kubectl get nodes
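If you work with several clusters, it can be worth confirming which context the new kubeconfig activated before going further; for example:

$ kubectl config current-context   # should reference the kubernetes-aws--test cluster
$ kubectl cluster-info             # prints the EKS API server endpoint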
A project can be configured to create a cluster with the `eks` configuration in `elife.yaml`:
kubernetes-aws:
    description: project managing an EKS cluster
    domain: False
    intdomain: False
    aws:
        ec2: false
    aws-alt:
        flux-prod:
            ec2: false
            eks:
                version: 1.16
                worker:
                    type: t2.large
                    max-size: 2
                    desired-capacity: 2
./bldr launch:kubernetes-aws,flux-prod # to create, note elife issue #5928
./bldr update_infrastructure:kubernetes-aws--flux-prod # to update/change
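Once the stack is up, the control plane version and worker count can be checked against the configuration above; a small sketch, assuming the EKS cluster name matches the stack name as in the earlier `update-kubeconfig` example:

aws eks describe-cluster --name=kubernetes-aws--flux-prod --query 'cluster.version'
kubectl get nodes   # should list as many workers as desired-capacity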
Updating the `version` of `eks` will upgrade the EKS managed control plane, the launch configuration's AMI image and any addons to the latest supported version. Once that value is updated, run an update like so:
./bldr update_infrastructure:kubernetes-aws--flux-prod
./bldr update_infrastructure:kubernetes-aws--flux-prod
Note: the AWS API will not return the correct addon versions until the cluster itself has been upgraded. A full upgrade therefore requires two runs: one to upgrade EKS, and one to then upgrade the addons.
Note: it is useful to run the `update_infrastructure` command once before starting an upgrade, to make sure the cluster AMI and addons are up to date for the previous EKS version. This way, the changes applied when upgrading to the new EKS version are easier to attribute to the cluster version upgrade itself.
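Putting the notes above together, an upgrade roughly looks like this for the `flux-prod` example; the commands are identical each time, only the order and the `elife.yaml` edit matter:

./bldr update_infrastructure:kubernetes-aws--flux-prod   # before the upgrade: AMI and addons current for the old version
# edit elife.yaml and bump eks.version (e.g. 1.16 -> 1.17), then:
./bldr update_infrastructure:kubernetes-aws--flux-prod   # run 1: upgrades the control plane and AMI
./bldr update_infrastructure:kubernetes-aws--flux-prod   # run 2: upgrades the addons, now reported for the new version
kubectl version                                          # confirm the server version has changed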
A cluster cannot be deleted as-is: its operation creates cloud resources that become dependent on the cluster's resources, or that would be left behind if not deleted at the right level of abstraction.
Checklist to go through before destruction:
- delete all Helm releases (should take care of DNS entries from `Service` instances)
- scale worker nodes down to 0 (untested, but should take care of ENIs dependent on security groups)
- delete ELBs that haven't been deleted yet (not sure if necessary)
- delete security groups that haven't been deleted yet (not sure if necessary)
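The checklist roughly corresponds to commands like the following; the release, namespace and autoscaling group names are placeholders that need looking up per cluster, and the security group filter assumes the conventional `kubernetes.io/cluster/<name>=owned` tag:

helm ls --all-namespaces                          # releases that still need deleting
helm uninstall my-release --namespace my-namespace
aws autoscaling update-auto-scaling-group --auto-scaling-group-name my-worker-asg --min-size 0 --desired-capacity 0
aws elb describe-load-balancers                   # any ELBs left behind by Service resources
aws ec2 describe-security-groups --filters Name=tag:kubernetes.io/cluster/kubernetes-aws--test,Values=owned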
`builder` generates Terraform templates that describe the set of EKS and EC2 resources created inside an `eks`-enabled stack.
./bldr update_infrastructure:kubernetes-aws--test # will generate Terraform templates, they should have no change to apply
cat .cfn/terraform/kubernetes-aws--test.json | jq .
helm ls
kubectl get pods --all-namespaces
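To see at a glance which resource types the stack declares, the generated Terraform JSON can be queried; this assumes the usual Terraform JSON layout with a top-level `resource` key:

cat .cfn/terraform/kubernetes-aws--test.json | jq '.resource | keys'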
Workers are managed through an autoscaling group. When the AMI of the worker is updated, only newly created EC2 instances will use it; existing ones won't be deleted.
The best option to update the AMI is to cordon off workers, drain them and delete them so that they are recreated by the autoscaling group. This is not implemented in `builder`, but can be achieved with the commands:
kubectl get nodes # list the nodes to find the one to replace
kubectl cordon my-node # no new Pods will be scheduled here
kubectl drain my-node # existing Pods will be evicted and sent to another node
aws ec2 terminate-instances --instance-ids=... # terminate a node, a new one will be created
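A sketch of rotating every worker in turn; it assumes `kubectl` and the AWS CLI point at the same cluster, and extra `drain` flags (e.g. for emptyDir volumes) may be needed depending on the workloads and kubectl version:

for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
    kubectl cordon "$node"
    kubectl drain "$node" --ignore-daemonsets
    # spec.providerID looks like aws:///eu-west-1a/i-0123456789abcdef0
    instance_id=$(kubectl get node "$node" -o jsonpath='{.spec.providerID}' | awk -F/ '{print $NF}')
    aws ec2 terminate-instances --instance-ids "$instance_id"
    # crude pause to let the autoscaling group bring up a replacement before draining the next node
    sleep 300
done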