When a container runtime like Docker, or one of the newer ones we have been working on (podman, CRI-O, and Buildah), creates a container, it picks a random MCS label for the container to run with. An MCS label consists of two random numbers between 0 and 1,023, and each label has to be unique on the host. Each number is prefixed with a c, for category. SELinux also needs a sensitivity level, s0.
So an MCS label looks like s0:c1,c2. Note that s0:c2,c1 is the same label. Also, the two numbers may not be the same; SELinux would translate s0:c1,c1 as s0:c1. This gives us 1024*1023/2 = 523,776 distinct pairs of categories, so about 500,000 unique containers on a host.
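The labeling rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the code any container runtime actually uses; the function names and the `in_use` set are hypothetical. It normalizes the category order, collapses a repeated category, and counts the unordered pairs of distinct categories exactly (1024*1023/2 = 523,776, in line with the rough half-million figure).

```python
import random
from math import comb

CATEGORY_COUNT = 1024  # categories c0..c1023, as described in the text


def format_mcs_label(a: int, b: int, sensitivity: str = "s0") -> str:
    """Return a normalized label string for two category numbers."""
    for n in (a, b):
        if not 0 <= n < CATEGORY_COUNT:
            raise ValueError(f"category {n} is outside 0..{CATEGORY_COUNT - 1}")
    if a == b:
        # s0:c1,c1 collapses to s0:c1
        return f"{sensitivity}:c{a}"
    lo, hi = sorted((a, b))  # s0:c2,c1 and s0:c1,c2 are the same label
    return f"{sensitivity}:c{lo},c{hi}"


def distinct_pair_count() -> int:
    """Number of unordered pairs of distinct categories."""
    return comb(CATEGORY_COUNT, 2)


def pick_unique_label(in_use: set, rng: random.Random) -> str:
    """Pick a random two-category label that is not in `in_use` (hypothetical helper)."""
    if len(in_use) >= distinct_pair_count():
        raise RuntimeError("no free MCS labels left")
    while True:
        a, b = rng.sample(range(CATEGORY_COUNT), 2)  # two distinct numbers
        label = format_mcs_label(a, b)
        if label not in in_use:
            in_use.add(label)
            return label
```

For example, `format_mcs_label(2, 1)` and `format_mcs_label(1, 2)` both give `s0:c1,c2`, which is why order does not add extra labels to the count.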
We originally created MCS labeling back in 2008 for virtual machines, and it was often referred to as sVirt. We figured that running a half-million VMs on a single machine would not happen for a few years. With containers, that limit might actually come within reach. But we could always go to three or more categories for each label, although the algorithm becomes more complicated.
SELinux does more than just MCS labeling. The process and content also get assigned SELinux "Types." Processes usually run with the container_t type, and content is created with the container_file_t type.
Process system_u:system_r:container_t:s0:c1,c2
Content system_u:object_r:container_file_t:s0:c1,c2
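To make the structure of the two contexts above explicit, here is a minimal Python sketch that splits a context string into its user, role, type, and level fields and compares the MCS part of two contexts. The names `SELinuxContext`, `parse_context`, and `same_mcs_label` are made up for this illustration. The equality check is a simplification: real access decisions are made by the SELinux policy in the kernel, and category ranges such as c0.c1023 are not handled here.

```python
from typing import NamedTuple


class SELinuxContext(NamedTuple):
    user: str
    role: str
    setype: str
    level: str  # sensitivity plus categories, e.g. "s0:c1,c2"


def parse_context(context: str) -> SELinuxContext:
    """Split 'user:role:type:level' into fields."""
    # Split at most three times: the level itself contains a ':'.
    parts = context.strip().split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"not a full SELinux context: {context!r}")
    return SELinuxContext(*parts)


def split_level(level: str):
    """Return (sensitivity, frozenset of categories) for a level like 's0:c1,c2'."""
    sensitivity, _, cats = level.partition(":")
    return sensitivity, frozenset(c for c in cats.split(",") if c)


def same_mcs_label(process: str, content: str) -> bool:
    """True if both contexts carry the same sensitivity and category set (simplified)."""
    return split_level(parse_context(process).level) == split_level(
        parse_context(content).level
    )
```

With the two lines above, `parse_context(...)` yields the types container_t and container_file_t, and `same_mcs_label` returns True because both carry s0:c1,c2.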
oc get pod -o jsonpath='{range .items[*]}{@.metadata.name}{" runAsUser: "}{@.spec.containers[*].securityContext.runAsUser}{" fsGroup: "}{@.spec.securityContext.fsGroup}{" seLinuxOptions: "}{@.spec.securityContext.seLinuxOptions.level}{"\n"}{end}'
[...]
As can be seen from the previous output, all the Pods in the same namespace are running with the same UID, GID and SELinux labels. Notice these are unprivileged Pods running with an unprivileged UID & GID.
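The same check can be done mechanically. The sketch below parses lines in the shape produced by the jsonpath template in the oc command above and collects the distinct (runAsUser, fsGroup, level) combinations; a single combination means every Pod shares the same settings. The pod names and numeric values in `SAMPLE_LINES` are invented for illustration and are not the elided output; the helper names are hypothetical as well.

```python
import re

# Matches one line of: NAME runAsUser: UID fsGroup: GID seLinuxOptions: LEVEL
LINE_RE = re.compile(
    r"^(?P<name>\S+) runAsUser: (?P<uid>.*?) fsGroup: (?P<fsgroup>.*?)"
    r" seLinuxOptions: ?(?P<level>.*)$"
)

# Hypothetical sample data, NOT real cluster output.
SAMPLE_LINES = [
    "pod-a runAsUser: 1000680000 fsGroup: 1000680000 seLinuxOptions: s0:c26,c20",
    "pod-b runAsUser: 1000680000 fsGroup: 1000680000 seLinuxOptions: s0:c26,c20",
]


def parse_line(line: str) -> dict:
    """Parse one output line into name, uid, fsgroup, and level strings."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"unexpected line format: {line!r}")
    return {key: value.strip() for key, value in match.groupdict().items()}


def distinct_settings(lines) -> set:
    """Return the set of distinct (uid, fsgroup, level) tuples across all Pods."""
    settings = set()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        pod = parse_line(line)
        settings.add((pod["uid"], pod["fsgroup"], pod["level"]))
    return settings
```

If `len(distinct_settings(lines)) == 1`, all listed Pods run with one UID, GID and SELinux level. Fields that a Pod does not set come out as empty strings, so such a Pod shows up as a separate combination.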
k8s: PodSecurityPolicy -> [simplified] Pod Security Admission
OpenShift: ~PodSecurityPolicy && k8s Pod Security Admission
In OpenShift, there is a dedicated, OpenShift-specific pod admission system called Security Context Constraints. This system resembles the now deprecated PodSecurityPolicy admission, even though it has changed a lot over the years of its existence. Our aim is to keep the Security Context Constraints pod admission system while also allowing users to have access to the Kubernetes Pod Security Admission. The following text describes what we did in order to make it possible in 4.11, and what we plan to do next in 4.12.
support for running Kata Containers as an additional optional runtime. The new runtime supports containers in dedicated virtual machines (VMs), providing improved workload isolation.
While CGroups and Namespaces are a powerful way of defining isolation between applications, faults have been found that allow breaking out of their CGroups jail. Additional measures such as SELinux can assist with keeping applications inside their container, but sometimes your application or workload needs more isolation than CGroups, Namespaces, and SELinux can provide.
There are multiple proposed solutions to this isolation challenge, including Amazon Firecracker, gVisor, and Kata Containers. Google’s gVisor takes one approach to this problem and uses a guest kernel in user space to sandbox containerized applications. Because gVisor re-implements the Linux kernel syscalls, there can be compatibility issues, and not all syscalls have been fully implemented. Alternatively, both Firecracker and Kata Containers use a tried-and-true technology, virtualization, to create a complete sandbox around your containerized application. While Kata can work on a stand-alone machine by integrating directly with containerd or CRI-O, it is mainly used as part of a Kubernetes cluster.
The use of virtualization may seem like abandoning the last 5+ years of progress with containers, but this is not really the case. Kata Containers (Kata) creates a virtual machine instance using one of the four supported hypervisors; however, these are not your traditional virtual machines. Kata Containers creates a VM using a highly optimized Linux guest kernel designed for running containerized workloads, with a highly optimized boot path for quick start times. Boot times for these virtual machine instances can be under 5 seconds as can be seen in the kernel boot log:
https://cloud.redhat.com/blog/pod-security-admission-in-openshift-4.11
https://cloud.redhat.com/blog/red-hat-releases-open-source-stackrox-to-the-community