Book by Nigel Poulton, https://github.com/nigelpoulton/TheK8sBook
- 1: Kubernetes primer
- 2: Kubernetes principles of operation
- 3: Getting Kubernetes
- 4: Working with Pods
- 5: Virtual clusters with Namespaces
- 6: Kubernetes Deployments
- 7: Kubernetes Services
- 8: Ingress
- 9: Service discovery deep dive
- 10: Kubernetes storage
- 11: ConfigMaps and Secrets
- 12: StatefulSets
- 13: API security and RBAC
- 14: The Kubernetes API
- 15: Threat modeling Kubernetes
Kubernetes is an application orchestrator - it orchestrates containerized, cloud-native microservices apps.
- orchestrator - a system that deploys and manages applications (dynamically respond to changes - scale up/down, self-heal, perform zero-downtime rolling updates)
- containerized app - an app that runs in a container (1980s-1990s: the physical server era; 2000s-2010s: the virtual machine and virtualization era; now: the cloud-native era)
- cloud-native app - designed to meet cloud-like demands of auto-scaling, self-healing, rolling updates, rollbacks and more, cloud-native is about the way applications behave and react to events
- microservices app - built from lots of small, specialised, independent parts that work together to form a meaningful application
Kubernetes enables 2 things Google and the rest of the industry need:
- It abstracts underlying infrastructure such as AWS
- It makes it easy to move applications on and off clouds
Kubernetes vs Docker Swarm - long story short, Kubernetes won. Docker Swarm is still under active development and is popular with small companies that need a simple alternative to Kubernetes.
Kubernetes as the operating system of the cloud:
- you install a traditional OS on a server, and it abstracts server resources and schedules application processes
- you install Kubernetes on a cloud, and it abstracts cloud resources and schedules application microservices
At a high level, a cloud/datacenter is a pool of compute, network and storage resources. Kubernetes abstracts them. Servers are no longer pets, they are cattle.
Kubernetes is like a courier service - you package the app as a container, give it a Kubernetes manifest, and let Kubernetes take care of deploying it and keeping it running.
Kubernetes is 2 things:
- a cluster to run applications on
- like any cluster - a bunch of machines to host apps
- these machines are called "nodes" (physical servers, VMs, cloud instances, Raspberry PIs, ...)
- cluster is made of:
- control plane (the brains) - exposes the API, has a scheduler for assigning work, records the state of the cluster and apps
- worker nodes (the muscle) - where user apps run
- an orchestrator of cloud-native microservices apps
- a system that takes care of deploying and managing apps
Simple process to run apps on a Kubernetes cluster:
- Design and write the application as small independent microservices
- Package each microservice as its own container
- Wrap each container in a Kubernetes Pod
- Deploy Pods to the cluster via higher-level controllers such as Deployments, DaemonSets, StatefulSets, CronJobs, ...
The Control Plane - runs a collection of system services that make up the control plane of the cluster (nodes also known as Masters, Heads, or Head nodes). Production environments should have multiple control plane nodes - 3 or 5 are recommended - spread across availability zones. The different services making up the control plane:
- The API server - the Grand Central station of Kubernetes, all communication, between all components, must go through the API server. All roads lead to the API Server.
- The Cluster Store - the only stateful part of the Control Plane, stores the configuration and the state. Based on etcd (a popular distributed database).
- The Controller Manager and Controllers - all the background controllers that monitor cluster components and respond to events.
- The Scheduler - watches the API server for new work tasks and assigns them to appropriate healthy worker nodes. Only responsible for picking the nodes to run tasks, it isn't responsible for running them.
- The Cloud Controller Manager - its job is to facilitate integrations with cloud services, such as instances, load-balancers, and storage.
Worker nodes - are where user applications run. At a high-level they do 3 things:
- Watch the API server for new work assignments
- Execute work assignments
- Report back to the control plane (via the API server)
3 major components:
- Kubelet - the main Kubernetes agent, runs on every worker node. Watches the API server for new work tasks, executes them, and maintains a reporting channel back to the control plane.
- Container runtime - kubelet needs it to perform container-related tasks - things like pulling images and starting and stopping containers.
- Kube-proxy - runs on every node and is responsible for local cluster networking.
In order to run on a Kubernetes cluster an application needs to:
- Be packaged as a container
- Be wrapped in a Pod
- Be deployed via a declarative manifest file
The declarative model:
- declare the desired state of an application microservice in a manifest file
- desired state - image, how many replicas, which network ports, how to perform updates
- post it to the API server
  - using the kubectl CLI (which sends it as an HTTP request)
- Kubernetes stores it in the cluster store as the application's desired state
- Kubernetes implements the desired state on the cluster
- A controller makes sure the observed state of the application doesn't vary from the desired state
- background reconciliation loops that constantly monitor the state of the cluster, if desired state != observed state - Kubernetes performs the necessary tasks
Kubernetes Pod - a wrapper that allows a container to run on a Kubernetes cluster. Atomic unit of scheduling. VMware has virtual machines, Docker has containers, Kubernetes has Pods. In Kubernetes, every container must run inside a Pod. "Pod" comes from "a pod of whales" (the term for a group of whales). "Pod" and "container" are often used interchangeably, however it is possible (in some advanced use-cases) to run multiple containers in a single Pod.
Pods don't run applications - applications always run in containers, the Pod is just a sandbox to run one or more containers. Pods are also the minimum unit of scheduling in Kubernetes. If you need to scale an app, you add or remove Pods. You do not scale by adding more containers to existing Pods.
A pod is only ready for service when all its containers are up and running. A single Pod can only be scheduled to a single node.
Pods are immutable. Whenever we talk about updating Pods, we mean - delete and replace it with a new one. Pods are unreliable.
Example controller: Deployments - a high-level Kubernetes object that wraps around a Pod and adds features such as self-healing, scaling, zero-downtime rollouts, and versioned rollbacks.
Services - provide reliable networking for a set of Pods. Services have a stable DNS name, IP address, and port, and they load-balance traffic across a dynamic set of Pods. As Pods come and go, the Service observes this, automatically updates itself, and continues to provide that stable networking endpoint.
Service - a stable network abstraction that provides TCP/UDP load-balancing across a dynamic set of Pods.
Hosted Kubernetes: AWS Elastic Kubernetes Service, Google Kubernetes Engine, Azure Kubernetes Service. Managing your own Kubernetes cluster isn't a good use of time and other resources. However, it is easy to rack up large bills if you forget to turn off infrastructure when not in use.
The hardest way to get a Kubernetes cluster is to build it yourself.
Play with Kubernetes - quick and simple way to get your hands on a development Kubernetes cluster. However, it is time limited and sometimes suffers from capacity and performance issues. Link: https://labs.play-with-k8s.com
Docker Desktop - offers a single-node Kubernetes cluster that you can develop and test with.
kubectl is the main Kubernetes command-line tool. At a high level, kubectl converts user-friendly commands into the HTTP REST requests with JSON content required by the Kubernetes API server.
kubectl get nodes
kubectl config current-context
kubectl config use-context docker-desktop
Controllers - infuse Pods with super-powers such as self-healing, scaling, rollouts and rollbacks. Every controller has a PodTemplate defining the Pods it deploys and manages. You rarely interact with Pods directly.
Pod - the atomic unit of scheduling in Kubernetes. Apps deployed to Kubernetes always run inside Pods. If you deploy an app, you deploy it in a Pod. If you terminate an app, you terminate its Pod. If you scale your app up/down, you add/remove Pods.
Kubernetes doesn't allow containers to run directly on a cluster, they always have to be wrapped in a Pod.
- Pods augment containers
- labels - group Pods and associate them with others
- annotations - add experimental features and integrations with 3rd-party tools
- probes - test the health and status of Pods and the apps they run, this enables advanced scheduling, updates, and more.
- affinity and anti-affinity rules - control over where in the cluster Pods are allowed to run
- termination controls - gracefully terminate Pods and the apps they run
- security policies - enforce security features
- resource requests and limits - min. and max. values for CPU, memory, IO, ...
Despite bringing so many features, Pods are super-lightweight and add very little overhead.
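A minimal sketch of how a few of these augmentations look in a Pod manifest - the names, image, probe, and resource values below are illustrative assumptions, not taken from the book:
apiVersion: v1
kind: Pod
metadata:
  name: augmented-pod            # hypothetical name
  labels:
    app: web                     # labels group this Pod with others
spec:
  containers:
  - name: web-ctr
    image: nginx:1.25            # assumed image
    ports:
    - containerPort: 80
    livenessProbe:               # probe testing the health of the app
      httpGet:
        path: /
        port: 80
    resources:
      requests:                  # minimum resources the scheduler reserves
        cpu: 100m
        memory: 128Mi
      limits:                    # maximum the container is allowed to consume
        cpu: 250m
        memory: 256Mi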
kubectl explain pods --recursive
kubectl explain pod.spec.restartPolicy
- Pods assist in scheduling
Every container in a Pod is guaranteed to be scheduled to the same worker node.
- Pods enable resource sharing
Pods provide a shared execution environment for one or more containers (filesystem, network stack, memory, volumes). So if a Pod has 2 containers, both containers share the Pod's IP address and can access any of the Pod's volumes to share data.
There are 2 ways to deploy a Pod:
- directly via a Pod manifest
- called "Static Pods", no super-powers like self-healing, scaling, or rolling updates
- indirectly via a controller
- have all the benefits of being monitored by a highly-available controller running on the control-plane
Pets vs Cattle paradigm - Pods are cattle, when they die, they get replaced by another. The old one is gone, and a shiny new one (with the same config, but a different IP and UID) magically appears and takes its place.
This is why applications should always store state and data outside the Pod. It is also why you should not rely on individual Pods - they are ephemeral, here today, gone tomorrow.
Deploying Pods:
- Define it in a YAML manifest file
- Post it to the API server
- The API server authenticates and authorizes the request
- The configuration (YAML) is validated
- The scheduler deploys the Pod to a healthy worker node with enough available resources
If you are using Docker or containerd as your container runtime, a Pod is actually a special type of container - a pause container. This means containers running inside of Pods are really containers running inside containers.
The Pod Network is flat, meaning every Pod can talk directly to every other Pod without the need for complex routing and port mappings. To lock this down, you should use Kubernetes Network Policies.
Pod deployment is an atomic operation - all-or-nothing - deployment either succeeds or fails. You will never have a scenario where a partially deployed Pod is servicing requests.
Pod lifecycle: pending -> running (long-lived Pod) | succeeded (short-lived Pod)
- short-lived - batch jobs, designed to only run until a task completes
- long-lived - web servers, remain in the running phase indefinitely; if containers fail, the controller may attempt to restart them
Pods are immutable objects. You can't modify them after they are deployed. You always replace a Pod with a new one (in case of a failure or update).
If you need to scale an app, you add or remove Pods (horizontal scaling). You never scale an app by adding more of the same containers to a Pod. Multi-container Pods are only for co-scheduling and co-locating containers that need tight coupling.
Co-locating multiple containers in the same Pod allows containers to be designed with a single responsibility but co-operate closely with others.
Kubernetes multi-container Pod patterns:
- Sidecar pattern - (most popular) the job of a sidecar is to augment or perform a secondary task for the main application container (see the sketch after this list)
- Adapter pattern - variation of the sidecar pattern where the helper container takes non-standardized output from the main container and rejigs it into a format required by an external system
- Ambassador pattern - variation of the sidecar pattern where the helper container brokers connectivity to an external system, ambassador containers interface with external systems on behalf of the main app container
- Init pattern - runs a special init container that is guaranteed to start and complete before your main app container, it is also guaranteed to run only once
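A minimal sidecar sketch, assuming a hypothetical log-shipping helper container - all names and images below are illustrative assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-pod                         # hypothetical name
spec:
  containers:
  - name: main-app                          # the main application container
    image: registry.example.com/app:1.0     # assumed image
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-shipper                       # sidecar performing a secondary task
    image: registry.example.com/shipper:1.0 # assumed image
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}                            # shared volume both containers can access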
kubectl get pods
Get Pods info with additional detail:
kubectl get pods -o wide
Get pod info, a full copy of the Pod from the cluster:
kubectl get pods -o yaml
Get even more info; spec - desired state, status - observed state:
kubectl get pods hello-pod -o yaml
Pod manifest files (see the example after this list):
- kind - tells Kubernetes the type of object being defined
- apiVersion - defines the schema version to use when creating the object
- metadata - names, labels, annotations, and a Namespace
- spec - defines the containers the Pod will run
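A minimal pod.yml sketch matching the Pod and container names used in the commands below - the image tag and label are assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    app: hello                          # assumed label
spec:
  containers:
  - name: hello-ctr
    image: nigelpoulton/k8sbook:1.0     # assumed image/tag
    ports:
    - containerPort: 8080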
kubectl apply -f pod.yml
kubectl describe - a nicely formatted multi-line overview of an object. You can add the --watch flag to kubectl get to monitor an object and see when its status changes to Running.
kubectl describe pods hello-pod
You can see ordering and names of containers using this command.
kubectl logs - like other Pod-related commands, if you don't specify --container, it executes against the first container in the Pod:
kubectl logs hello-pod
kubectl logs hello-pod --container hello-ctr
kubectl exec
- execute commands inside a running Pod
kubectl exec hello-pod -- pwd
Get shell access:
kubectl exec -it hello-pod -- sh
The -it flag makes the session interactive and connects STDIN and STDOUT on your terminal to STDIN and STDOUT inside the first container in the Pod.
Pod hostname - every container in a Pod inherits its hostname from the name of the Pod (metadata.name). With this in mind, you should always set Pod names as valid DNS names (a-z, 0-9, hyphens, and dots).
The spec.initContainers block defines one or more containers that Kubernetes guarantees will run and complete before the main app container starts.
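A hedged sketch of an initContainers block - the Pod name matches the delete command below, but the images, mount paths, and sync behavior are assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: git-sync
spec:
  initContainers:
  - name: init-sync                          # guaranteed to run and complete before the app container starts
    image: registry.example.com/git-sync:1.0 # assumed image
    volumeMounts:
    - name: html
      mountPath: /tmp/git
  containers:
  - name: web
    image: nginx:1.25                        # assumed main app image
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  volumes:
  - name: html
    emptyDir: {}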
kubectl delete pod git-sync
Namespaces are a native way to divide a single Kubernetes cluster into multiple virtual clusters.
Namespaces partition a Kubernetes cluster and are designed as an easy way to apply quotas and policies to groups of objects.
See all Kubernetes API resources supported in your cluster:
kubectl api-resources
Namespaces are a good way of sharing a single cluster among different departments and environments. For example, a single cluster might have the following namespaces: dev, test, qa. Each one can have its own set of users and permissions, as well as unique resource quotas.
Namespaces are not good for isolating hostile workloads. A compromised container or Pod in one Namespace can wreak havoc in other Namespaces. For example, you shouldn't place competitors, such as Pepsi and Coke, in separate Namespaces on the same shared cluster.
If you need strong workload isolation, the current method is to use multiple clusters. There are some attempts to do something different, but the safest and most common way of isolating workloads is putting them on their own clusters.
Every Kubernetes cluster has a set of pre-created Namespaces (virtual clusters):
kubectl get namespaces
- default is where newly created objects go if you don't specify a Namespace
- kube-system is where DNS, the metrics server, and other control plane components run
- kube-public is for objects that need to be readable by anyone
- kube-node-lease is used for node heartbeats and managing node leases
kubectl describe namespaces default
List service objects in a selected namespace:
kubectl get svc --namespace kube-system
kubectl get svc --all-namespaces
Create a new Namespace (objects don't create Namespaces automatically - the Namespace must exist first):
kubectl create ns kydra
Switch between Namespaces:
kubens shield
There are 2 ways to deploy objects to a specific Namespace:
- imperatively - add the -n or --namespace flag to commands
- declaratively - specify the Namespace in the YAML (see the sketch after this list)
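A minimal sketch of the declarative way - the Pod name and image are assumptions; the shield Namespace matches the one used below:
apiVersion: v1
kind: Pod
metadata:
  name: app-pod            # hypothetical name
  namespace: shield        # deploys the Pod into the shield Namespace
spec:
  containers:
  - name: app-ctr
    image: nginx:1.25      # assumed image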
Delete Pods:
kubectl delete -f shield.app.yml
Delete Namespace:
kubectl delete ns shield
Use Deployments to bring cloud-native features such as self-healing, scaling, rolling updates, and versioned rollbacks to stateless apps on Kubernetes.
Kubernetes offers several controllers that augment Pods with important capabilities. The Deployment controller is designed for stateless apps.
The Deployment spec is a declarative YAML object where you describe the desired state of a stateless app. The controller element operates as a background loop on the control plane, reconciling observed state with desired state.
You start with a stateless application, package it as a container, then define it in a Pod template. At this point you have a static Pod - it doesn't self-heal, doesn't scale, and isn't easy to update. That is why you almost always wrap Pods in a Deployment object.
A Deployment object only manages a single Pod template.
Deployments rely heavily on ReplicaSets. ReplicaSets manage Pods and bring self-healing and scaling. Deployments manage ReplicaSets and add rollouts and rollbacks. It is not recommended to manage ReplicaSets directly. Think of Deployments as managing ReplicaSets, and ReplicaSets as managing Pods.
Deployments:
- if Pods managed by a Deployment fail, they will be replaced (self-healing)
- if Pods managed by a Deployment see increased or decreased load, they can be scaled
3 concepts fundamental to everything about Kubernetes:
- desired state (what you want)
- observed state (what you have)
- reconciliation (if desired state != observed state, a process of reconciliation attempts to bring observed state into sync with desired state)
Declarative model is a method of telling Kubernetes your desired state, while avoiding the detail of how to implement it. You leave the how up to Kubernetes.
Zero-downtime rolling-updates of stateless apps are what Deployments are about. They require a couple of things from your microservice applications in order to work properly:
- loose coupling via APIs
- backwards and forwards compatibility
Each Deployment describes all the following:
- how many Pod replicas
- what images to use for the Pod's containers
- what network ports to expose
- details about how to perform rolling updates
Deploying a new version: update the same Deployment YAML file with the new image version and re-post it to the API server.
Rollback: you wind one of the old ReplicaSets up while you wind the current one down.
Kubernetes gives you fine-grained control over how rollouts and rollbacks proceed - insert delays, control the pace and cadence of releases, you can probe the health and status of updated replicas.
YAML components:
- apiVersion: apps/v1 - Deployments are available in the apps/v1 subgroup
- kind: Deployment - a Deployment object
- metadata.name: hello-deploy - a valid DNS name
- spec - anything nested below spec relates to the Deployment
- spec.template - the Pod template the Deployment uses to stamp out Pod replicas
- spec.replicas - how many Pod replicas the Deployment should create and manage
- spec.selector - a list of labels that Pods must have in order for the Deployment to manage them. This tells Kubernetes which Pods to terminate and replace when performing a rollout.
- spec.revisionHistoryLimit - how many older versions/ReplicaSets to keep
- spec.progressDeadlineSeconds - tells Kubernetes how long to wait during a rollout for each new replica to come online
- spec.strategy - tells the Deployment controller how to upgrade the Pods when a rollout occurs
  - with the RollingUpdate strategy and 10 desired replicas (as in the manifest below):
    - maxUnavailable: 1 - never have more than one Pod below desired state - you will never have fewer than 9 replicas during the update
    - maxSurge: 1 - never have more than one Pod above desired state - you will never have more than 11 replicas during the update
    - net result - update two Pods at a time (the delta between 9 and 11 is 2)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-deploy
spec:
  replicas: 10
  selector:
    matchLabels:
      app: hello-world
  revisionHistoryLimit: 5
  progressDeadlineSeconds: 300
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-pod
        image: nigelpoulton/k8sbook:2.0
        ports:
        - containerPort: 8080
Deploy to the cluster:
kubectl apply -f deploy.yml
kubectl get deploy hello-deploy
kubectl describe deploy hello-deploy
kubectl get replicaset
kubectl describe replicaset hello-deploy-5cd5dcf7d7
In order to access a web app from a stable name or IP address, or even from outside the cluster, you need a Kubernetes Service object. A Service provides reliable networking for a set of Pods.
Scaling the number of replicas manually - edit the YAML and set a different number of replicas or use the command:
kubectl scale deploy hello-deploy --replicas 5
Performing a rolling update (by replacement because Pods are immutable):
kubectl apply -f deploy.yml
kubectl rollout status deployment hello-deploy
Pausing & resuming deployment:
kubectl rollout pause deploy hello-deploy
kubectl rollout resume deploy hello-deploy
Detailed deployment info:
kubectl describe deploy hello-deploy
Kubernetes maintains a documented revision history of rollouts:
kubectl rollout history deployment hello-deploy
Rolling Updates create new ReplicaSets, old ReplicaSets aren't deleted. The fact the old ones still exist makes them ideal for executing rollbacks:
kubectl rollout undo deployment hello-deploy --to-revision=1
Modern versions of Kubernetes use the system-generated pod-template-hash label so that only Pods originally created by the Deployment/ReplicaSet are managed:
kubectl get pods --show-labels
Controllers add self-healing, scaling and rollouts. Despite all of this, Pods are still unreliable, and you should never connect directly to them.
Services provide stable and reliable networking for a set of unreliable Pods. Every Service gets its own stable IP address, its own DNS name, and its own stable port. The Service fronts the Pods with a stable IP, DNS name, and port, and it load-balances traffic to Pods with the right labels.
With a Service in place, the Pods can scale up/down, they can fail, and they can be updated and rolled back. Despite all of this, clients will continue to access them without interruption. The Service is observing the changes and updating its lists of healthy Pods it sends traffic to.
Think of Services as having a static front-end and a dynamic back-end.
Services are loosely coupled with Pods via labels and selectors. This is the same technology that loosely couples Deployments to Pods.
Every time you create a Service, Kubernetes automatically creates an associated Endpoints object. The Endpoints object is used to store a dynamic list of healthy Pods matching the Service's label selector. Any new Pods that match the selector get added to the Endpoints object.
Types of Services:
- accessible from inside the cluster
- ClusterIP - default type, a stable virtual IP, every service you create gets a ClusterIP
- accessible from outside the cluster
- NodePort - built on top of ClusterIP, allows external clients to hit a dedicated port on every cluster node and reach the Service
- LoadBalancer - makes external access even easier by integrating with an internet-facing load-balancer on your underlying cloud platform
Example Service object:
apiVersion: v1
kind: Service
metadata:
  name: svc-test          # name matching the EndpointSlice shown below
spec:
  type: NodePort
  ports:
  - port: 8080            # listen internally on port 8080
    nodePort: 30001       # listen externally on 30001
    targetPort: 8080      # forward traffic to the application Pods on port 8080
    protocol: TCP         # use TCP (default)
  selector:               # send traffic to all healthy Pods on the cluster with the following metadata.labels
    chapter: services
Get the EndpointSlice objects:
kubectl get endpointslices
Get details of the healthy Pods behind a Service:
kubectl describe endpointslice svc-test-xgnsv
If your cluster is on a cloud platform, deploying a Service with type=LoadBalancer will provision one of your cloud's internet-facing load-balancers and configure it to send traffic to your Service.
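A minimal lb.yml sketch of a LoadBalancer Service - the name and port numbers are illustrative assumptions; the selector reuses the label from the NodePort example above:
apiVersion: v1
kind: Service
metadata:
  name: lb-svc             # hypothetical name
spec:
  type: LoadBalancer       # asks the cloud for an internet-facing load-balancer
  ports:
  - port: 9000             # port the cloud load-balancer listens on (assumed)
    targetPort: 8080       # port the application Pods listen on (assumed)
  selector:
    chapter: services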
kubectl get svc --watch
After ~2 minutes the value in the EXTERNAL-IP column will appear.
Delete multiple resources:
kubectl delete -f deploy.yml -f lb.yml -f svc.yml
Ingress is all about accessing multiple web applications through a single LoadBalancer Service.
"Load Balancer" refers to a Kubernetes Service object of type=LoadBalancer; "load-balancer" refers to the internet-facing load-balancer on the underlying cloud.
Ingress exposes multiple Services through a single cloud load-balancer. Cloud load-balancers are expensive.
kubectl get ing
Ingress classes allow you to run multiple Ingress controllers on a single cluster:
- assign each Ingress controller to an Ingress class
- when you create Ingress objects, you assign them to an Ingress class (see the sketch below)
kubectl get ingressclass
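A minimal Ingress sketch showing host-based routing to two backend Services - the class name, hostnames, and Service names are all assumptions:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress          # hypothetical name
spec:
  ingressClassName: nginx        # assigns this Ingress to an Ingress class (assumed class name)
  rules:
  - host: shield.example.com     # assumed hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: svc-shield     # assumed Service name
            port:
              number: 8080
  - host: hydra.example.com      # assumed hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: svc-hydra      # assumed Service name
            port:
              number: 8080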
Ingress is a way to expose multiple applications and Kubernetes Services via a single cloud load-balancer. Ingress is stable in the API but overlaps in features with many service meshes - if you are running a service mesh, you may not need Ingress.
Finding stuff on a crazy-busy platform like Kubernetes is hard. Service discovery makes it simple. Apps need a way to find the other apps they work with.
2 components to service discovery:
- registration - is the process of an application listing its connection details in a service registry so other apps can find it and consume it. Kubernetes uses its internal DNS as a service registry. All Kubernetes Services are automatically registered with DNS.
- discovery - for service discovery to work, apps need to know the name of the Service fronting the apps they want to connect to (the rest is taken care of by Kubernetes)
Get Pods running the cluster DNS:
kubectl get pods -n kube-system -l k8s-app=kube-dns
Service discovery resolution works like typical routing - check your own table first, and if the entry isn't found, pass the request on to the next one.
Domain name format: object-name.namespace.svc.cluster.local, object name has to be unique within a Namespace, but not across Namespaces.
Kubernetes supports lots of types of storage from lots of different places. No matter what type of storage, or where it comes from, when it is exposed on Kubernetes it is called a volume. All that's required is a plugin that allows a provider's storage resources to be surfaced as volumes in Kubernetes.
Container Storage Interface - an open standard aimed at providing a clean storage interface for container orchestrators such as Kubernetes.
Core storage-related API objects:
- Persistent Volumes - are how external storage assets are represented in Kubernetes
- Persistent Volume Claims - like tickets that grant access to a PV
- Storage Classes - make it all dynamic
Storage Providers - AWS Elastic Block Store, Azure File, NFS volumes, ...
The CSI is a vital piece of Kubernetes storage; however, unless you are a developer writing storage plugins, you are unlikely to interact with it very often.
Working with Storage Classes:
- Create one or more StorageClasses on Kubernetes
- Deploy Pods with PVCs that reference those Storage Classes
Other settings (see the sketch after this list):
- Access mode:
  - ReadWriteOnce - a PV that can only be bound as R/W by a single PVC
  - ReadWriteMany - a PV that can be bound as R/W by multiple PVCs
  - ReadOnlyMany - a PV that can be bound as R/O by multiple PVCs
- Reclaim policy - how to deal with a PV when its PVC is released:
  - Delete - deletes the PV and the associated storage resource on the external storage system
  - Retain - keeps the PV object on the cluster, as well as any data stored on the associated external asset
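A hedged sketch of a StorageClass and a PVC that references it - the class name, size, and provisioner (here the AWS EBS CSI driver) are assumptions and vary by platform:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                     # hypothetical class name
provisioner: ebs.csi.aws.com     # assumed CSI driver
reclaimPolicy: Delete            # delete the PV and external asset when the PVC is released
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-fast                 # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce                # R/W by a single PVC
  storageClassName: fast         # binds this claim to the StorageClass above
  resources:
    requests:
      storage: 10Gi              # assumed size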
kubectl get sc
kubectl get pv
kubectl get pvc
Most apps comprise two main parts: the app and the configuration. Coupling the application and the configuration into a single easy-to-deploy unit is an anti-pattern. De-coupling the application and the configuration has the following benefits:
- re-usable application images (you can use the same image on dev, staging, prod)
- simpler development and testing (easier to spot a mistake when the app and the config are decoupled, e.g. app crash after config change)
- simpler and fewer disruptive changes
Kubernetes provides an object called a ConfigMap that lets you store configuration data outside a Pod. It also makes it easy to inject config into Pods at run-time.
You should not use ConfigMaps to store sensitive data such as certificates and passwords. Kubernetes provides a different object, called a Secret, for storing sensitive data.
Behind the scenes, ConfigMaps are a map of key-value pairs, and we call each pair an entry:
- Keys - an arbitrary name that can be created from alphanumerics, dashes, dots, and underscores
- Values - anything, including multiple lines with carriage returns
- Keys and Values are separated by a colon - key:value
Data in a ConfigMap can be injected into containers at run-time via any of the following methods (see the sketch after this list):
- environment variables (static - updates made to the map don't get reflected in running containers, a major reason not to use environment variables)
- arguments to the container's startup command (the most limited method, shares the limitations of environment variables)
- files in a volume (the most flexible method)
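A minimal sketch of the environment-variable method, reading the given entry from the multimap ConfigMap defined below - the Pod name, image, and variable name are assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: env-pod                  # hypothetical name
spec:
  containers:
  - name: app-ctr
    image: busybox               # assumed image
    command: ["sh", "-c", "echo First name: $(FIRSTNAME)"]
    env:
    - name: FIRSTNAME            # hypothetical variable name
      valueFrom:
        configMapKeyRef:
          name: multimap         # the ConfigMap holding the entry
          key: given             # the entry's key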
ConfigMap objects don't have the concept of state (desired/actual) - this is why they have a data block instead of spec and status blocks.
Creating a ConfigMap declaratively:
kind: ConfigMap
apiVersion: v1
metadata:
  name: multimap
data:
  given: Nigel
  family: Poulton
kubectl apply -f multimap.yml
ConfigMaps are extremely flexible and can be used to insert complex configurations, including JSON files and even scripts, into containers at run-time.
View logs from a specific container in a Pod:
kubectl logs startup-pod -c args1
Using ConfigMaps with volumes is the most flexible option. You can reference entire configuration files, as well as make updates to the ConfigMap that will be reflected in running containers (see the sketch after this list).
- Create the ConfigMap
- Create a ConfigMap volume in the Pod template
- Mount the ConfigMap volume into the container
- Entries in the ConfigMap will appear in the container as individual files
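A hedged sketch of the volume method, matching the cmvol Pod name and the /etc/name mount path used in the command below - the volume name and image are assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: cmvol
spec:
  volumes:
  - name: volmap                # hypothetical volume name
    configMap:
      name: multimap            # each ConfigMap entry appears as a file in the volume
  containers:
  - name: ctr
    image: nginx:1.25           # assumed image
    volumeMounts:
    - name: volmap
      mountPath: /etc/name      # entries become /etc/name/given and /etc/name/family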
Update a ConfigMap by re-applying its YAML.
Check the value of an entry exposed as a file inside the container:
kubectl exec cmvol -- cat /etc/name/given
Secrets are almost identical to ConfigMaps - they hold application configuration data that is injected into containers at run-time. Secrets are designed for sensitive data such as passwords, certificates, and OAuth tokens.
Despite being designed for sensitive data, Kubernetes does not encrypt Secrets in the cluster store. Fortunately, it is possible to configure encryption-at-rest with EncryptionConfiguration objects. Despite this, many people opt to use external 3rd-party tools, such as HashiCorp Vault.
A typical workflow for a Secret is as follows:
- The Secret is created and persisted to the cluster store as an un-encrypted object
- A Pod that uses it gets scheduled to a cluster node
- The Secret is transferred over the network, un-encrypted, to the node
- The kubelet on the node starts the Pod and its containers
- The Secret is mounted into the container via an in-memory tmpfs filesystem and decoded from base64 to plain text
- The application consumes it
- When the Pod is deleted, the Secret is deleted from the node
kubectl get secrets
Create a Secret manually:
kubectl create secret generic creds --from-literal user=piotr --from-literal pwd=qwerty
Decode base-64:
echo cGlvdHI= | base64 -d
apiVersion: v1
kind: Secret
metadata:
  name: tkb-secret
  labels:
    chapter: configmaps
type: Opaque
data:                 # use stringData instead when providing plaintext values
  username: bmlnZWxwb3VsdG9u
  password: UGFzc3dvcmQxMjM=
The most flexible way to inject a Secret into a Pod is via a special type of volume called a Secret volume. Secret vols are automatically mounted as read-only to prevent containers and applications accidentally mutating them.
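A minimal sketch of a Secret volume, referencing the tkb-secret defined above - the Pod name, image, and mount path are assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: secret-pod              # hypothetical name
spec:
  volumes:
  - name: secret-vol
    secret:
      secretName: tkb-secret    # the Secret defined earlier
  containers:
  - name: app-ctr
    image: nginx:1.25           # assumed image
    volumeMounts:
    - name: secret-vol
      mountPath: /etc/tkb       # assumed path; values arrive decoded, mounted read-only
      readOnly: true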
Stateful application - application that creates and saves valuable data, for example an app that saves data about client sessions and uses it for future sessions, or a database.
StatefulSets guarantee:
- predictable and persistent Pod names
  - name format: StatefulSetName-Integer
- predictable and persistent DNS hostnames
- predictable and persistent volume bindings
Failed Pods managed by a StatefulSet will be replaced by new Pods with the exact same Pod name, the exact same DNS hostname, and the exact same volumes. This is true even if the replacement is started on a different cluster node. The same is not true of Pods managed by a Deployment.
StatefulSets create one Pod at a time, and always wait for previous Pods to be running and ready before creating the next.
Knowing the order in which Pods will be scaled down, as well as knowing that Pods will not be terminated in parallel, is a game-changer for many stateful apps.
Note: deleting a StatefulSet object does not terminate Pods in order, with this in mind, you may want to scale down a StatefulSet to 0 replicas before deleting it.
Headless Service is a regular Kubernetes Service object without an IP address. It becomes a StatefulSet's governing Service when you list it in the StatefulSet config under spec.serviceName.
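A hedged sketch of a headless Service and a StatefulSet that names it as its governing Service - the names, image, and replica count are assumptions:
apiVersion: v1
kind: Service
metadata:
  name: mongo-prod              # hypothetical name
spec:
  clusterIP: None               # no IP address = headless Service
  selector:
    app: mongo
  ports:
  - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tkb-sts                 # Pods will be named tkb-sts-0, tkb-sts-1, ...
spec:
  replicas: 3
  serviceName: mongo-prod       # makes the headless Service the governing Service
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
      - name: ctr
        image: mongo:7          # assumed image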
StatefulSets are only a framework. Applications need to be written in ways to take advantage of the way StatefulSets behave.
Kubernetes is API-centric and the API is served through the API server.
Authentication (authN, "auth en") is about proving your identity. All requests to the API server have to include credentials, and the authentication layer is responsible for verifying them. The authentication layer in Kubernetes is pluggable, and popular modules include integration with external identity management systems such as AWS Identity and Access Management (IAM).
In fact, Kubernetes forces you to use an external identity management system.
Cluster details and credentials are stored in a kubeconfig file.
kubectl config view
Authorization (authZ, "auth zee") - RBAC (Role-Based Access Control) - happens immediately after successful authentication. It is about three things: users, actions, and resources - which users can perform which actions against which resources.
Admission Control runs immediately after successful Authentication and Authorization and is all about policies. There are 2 types of admission controllers: mutating (check for compliance and can modify requests) and validating (check for policy compliance, without request modification).
Most real-world clusters will have a lot of admission controllers enabled. Example: with a policy requiring the env=prod label, admission control can verify its presence and add the label when it is missing.
Kubernetes is API-centric. This means everything in Kubernetes is about the API, and everything goes through the API and API server. For the most part you will use kubectl to send requests, however you can also craft them in code.
kubectl proxy --port 9000 &
curl http://localhost:9000/api/v1/pods
The Kubernetes API is divided into 2 groups:
- the core group - mature objects that were created in the early days of Kubernetes before the API was divided into groups, located in api/v1
- the named groups - the future of the API, all new resources get added to named groups
kubectl api-resources
Kubernetes has a strict process for adding new resources to the API. They come in as alpha (experimental, can be buggy), progress through beta (pre-release), and eventually reach stable.
It is possible to write your own custom controllers and resources.
Threat modeling is the process of identifying vulnerabilities. The STRIDE model:
- Spoofing - pretending to be somebody else with the aim of gaining extra privileges on a system
- Tampering - the act of changing something in a malicious way, so you can cause one of the following:
  - denial of service - tampering with a resource to make it unusable
  - elevation of privilege - tampering with a resource to gain additional privileges
- Repudiation - creating doubt about something; non-repudiation is proving certain actions were carried out by certain individuals
- Information disclosure - when sensitive data is leaked
- Denial of service - making something unavailable; there are many types of DoS attacks, but a well-known variation is overloading a system to the point it can no longer service requests
- Elevation of privilege - gaining higher access than what is granted, usually in order to cause damage or gain unauthorized access