From ba195b6f134154504bc53ec65587309554e41032 Mon Sep 17 00:00:00 2001 From: Shingo Omura Date: Mon, 17 Oct 2022 12:34:14 +0900 Subject: [PATCH 1/4] First draft of KEP-3169: Fine-grained SupplementalGroups control This KEP roughly introduces belows in Kubernetes API: - 'PodSecurityContext.SupplementalGroupsPolicy' to control which groups are attached to the container process, and - 'ContainerStatus.User' so that user know which identities(uid, gid, supplemental groups) are ACTUALLY attached to the container process. The corresponding changes are also proposed in CRI. Co-authored-by: Sergey Kanzhelev --- .../3619-supplemental-groups-policy/README.md | 1011 +++++++++++++++++ .../3619-supplemental-groups-policy/kep.yaml | 44 + 2 files changed, 1055 insertions(+) create mode 100644 keps/sig-node/3619-supplemental-groups-policy/README.md create mode 100644 keps/sig-node/3619-supplemental-groups-policy/kep.yaml diff --git a/keps/sig-node/3619-supplemental-groups-policy/README.md b/keps/sig-node/3619-supplemental-groups-policy/README.md new file mode 100644 index 00000000000..55e02dbc7a6 --- /dev/null +++ b/keps/sig-node/3619-supplemental-groups-policy/README.md @@ -0,0 +1,1011 @@ +# KEP-3619: Fine-grained SupplementalGroups control + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) + - [The issue](#the-issue) + - [Steps to reproduce](#steps-to-reproduce) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [Kubernetes API](#kubernetes-api) + - [SupplementalGroupsPolicy in PodSecurityContext](#supplementalgroupspolicy-in-podsecuritycontext) + - [User in ContainerStatus](#user-in-containerstatus) + - [CRI](#cri) + - [SupplementalGroupsPolicy in SecurityContext](#supplementalgroupspolicy-in-securitycontext) + - [user in ContainerStatus](#user-in-containerstatus-1) + - [User Stories](#user-stories) + - [Story 1: Deploy a Security Policy to enforce SupplementalGroupsPolicy 
field](#story-1-deploy-a-security-policy-to-enforce--field) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Kubernetes API](#kubernetes-api-1) + - [SupplementalGroupsPolicy in PodSecurityContext](#supplementalgroupspolicy-in-podsecuritycontext-1) + - [User in ContainerStatus](#user-in-containerstatus-2) + - [CRI](#cri-1) + - [SupplementalGroupsPolicy in SecurityContext](#supplementalgroupspolicy-in-securitycontext-1) + - [user in ContainerStatus](#user-in-containerstatus-3) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) + - [Introducing RuntimeClass](#introducing-) + - [Adjusting container image by users](#adjusting-container-image-by-users) + - [Just fixing CRI implementations](#just-fixing-cri-implementations) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+ +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + +This KEP proposes a way to choose how container runtimes (containerd and CRI-O) apply `SupplementalGroups` to the first container process. 
The KEP also describes the work needed in Kubernetes and connected projects to give customers a clear migration path, including detection and safe upgrade, if any of their workflows depend on the current, arguably erroneous, behavior. + +### The issue + +The supplemental groups attached to container processes are defined at two levels in Kubernetes: the OCI image level and the Kubernetes API level. + +In the [OCI image spec](https://github.com/opencontainers/image-spec), [`config.User` OCI image configuration](https://github.com/opencontainers/image-spec/blob/3a7f492d3f1bcada656a7d8c08f3f9bbd05e7406/config.md#:~:text=User%20string%2C%20OPTIONAL) (which mirrors the [`USER` directive in `Dockerfile`](https://docs.docker.com/engine/reference/builder/#user)) is defined as follows: + +> The username or UID which is a platform-specific structure that allows specific control over which user the process run as. This acts as a default value to use when the value is not specified when creating a container. For Linux based systems, all of the following are valid: `user`, `uid`, `user:group`, `uid:gid`, `uid:group`, `user:gid`. If `group`/`gid` is not specified, the default group and supplementary groups of the given `user`/`uid` in `/etc/passwd` from the container are applied. + +At the Kubernetes API level, `PodSecurityContext.{RunAsUser, RunAsGroup, SupplementalGroups}` relate to this. This API was designed to override the `config.User` configuration of OCI images. However, in the current implementation, as described in [kubernetes/kubernetes#112879](https://github.com/kubernetes/kubernetes/issues/112879), even when a manifest defines `RunAsGroup`, __group memberships defined in the container image for the UID__ are attached to the container process (see [the next section](#steps-to-reproduce) for details). 
This behavior clearly diverges from the OCI image configuration specification, in particular from the following sentence of [`config.User` OCI image configuration](https://github.com/opencontainers/image-spec/blob/3a7f492d3f1bcada656a7d8c08f3f9bbd05e7406/config.md#:~:text=User%20string%2C%20OPTIONAL): + +> If `group`/`gid` is not specified, the default group and supplementary groups of the given `user`/`uid` in `/etc/passwd` from the container are applied. + +As described in [kubernetes/kubernetes#112879](https://github.com/kubernetes/kubernetes/issues/112879), the behavior is poorly documented and not widely known among Kubernetes administrators and users. Moreover, it raises security concerns in some cases. + +### Steps to reproduce + +Assume you have an image and a Pod manifest: + +```Dockerfile +# Dockerfile +FROM ubuntu:22.04 +# This generates /etc/group entry --> "group-in-image:x:50000:alice" +RUN groupadd -g 50000 group-in-image \ + && useradd -m -u 1000 alice \ + && gpasswd -a alice group-in-image +USER alice +``` + +```yaml +spec: + # This spec + # - overrides the USER directive in the Dockerfile above with runAsUser and runAsGroup ("1000:1000"), and + # - sets supplementalGroups. + # It expects gids defined in the image (/etc/group) NOT to be attached to the container process + # because the gid is specified explicitly by runAsGroup. 
+ securityContext: { runAsUser: 1000, runAsGroup: 1000, supplementalGroups: [60000] } + containers: + # Expected output: "uid=1000(alice) gid=1000(alice) groups=1000(alice),60000" + # NOTE: "group-in-image" is not included here + # because groups defined in /etc/group should not be attached + # when the gid is specified in runAsGroup + - image: the-image-above + command: ["id"] +``` + +However, with the current combination of Kubernetes and major container runtimes (at least containerd and cri-o), the output includes the "group-in-image" group for the first container process (see [here](https://github.com/pfnet-research/strict-supplementalgroups-container-runtime/tree/reproduce-bypass-supplementalgroups) for more detailed reproduction code): + +```console +uid=1000(alice) gid=1000(alice) groups=1000(alice),50000(group-in-image),60000 +``` + +## Motivation + +As described above, the way supplemental groups are attached to the first container process is complicated and does not comply with the OCI image spec. + +Moreover, this raises the following security concerns. When a cluster enforces a security policy for pods that restricts the values of `RunAsGroup` and `SupplementalGroups`, the enforcement has limited effect: cluster users can easily bypass the policy just by using a custom image. Such a bypass would be unexpected for most cluster administrators because it makes the enforcement almost useless. The bypass also grants unexpected file access permissions, which is a security concern in some use cases. For example, with `hostPath` volumes this could be a severe problem because UIDs/GIDs matter when accessing files and directories in the volumes. + +Kubernetes provides no API surface to prevent this bypass, although it can lead to a security concern. 
Because the behavior is actually implemented in CRI implementations, mitigating it today requires cluster administrators to deploy a custom low-level container runtime (e.g., [pfnet-research/strict-supplementalgroups-container-runtime](https://github.com/pfnet-research/strict-supplementalgroups-container-runtime)) that modifies the OCI container runtime spec (`config.json`) produced by CRI implementations (e.g., containerd, cri-o), and to introduce a custom `RuntimeClass` for it. This is an extra operational burden for cluster administrators. + +Thus, this KEP proposes a new API field named `SupplementalGroupsPolicy` that enables users to control the supplemental groups attached to the first container process, following the "principle of least surprise". The new API allows cluster administrators to deploy security policies that enforce the `SupplementalGroupsPolicy` field in the cluster to avoid the unexpected bypass of `SupplementalGroups` described above. This KEP also proposes a way for users to detect which groups are _actually_ attached to container processes. This helps users/administrators identify which pods have _unexpected_ group permissions and choose the best `SupplementalGroupsPolicy` for them. + +### Goals + + + +- Provide a new API field to control exactly which groups the container process belongs to +- Ensure there are clear steps documented for end users to detect if their workload is affected +- (Optional) Provide helper APIs and/or tooling to simplify the detection + +### Non-Goals + + + +- To provide a cluster-wide control method. +- To change the default behavior (a potentially breaking change). + +## Proposal + + + +This KEP proposes changes at both the Kubernetes API and CRI levels. 
+ +### Kubernetes API + +_See also the [Alternatives](#alternatives) section for rejected alternative plans._ + +#### SupplementalGroupsPolicy in PodSecurityContext + +A new field named `SupplementalGroupsPolicy` will be introduced to `PodSecurityContext`. This field defines how the supplemental groups of the first container process are calculated. + +Allowed values are: + +- `Merge` (_default if not specified_): This policy _always_ merges the provided `SupplementalGroups` (including `FsGroup`) with the groups of the primary user defined in the image (`/etc/group` in the image). + - Note: The primary user is specified with `RunAsUser`. If it is not specified, the user from the image config is used. If neither is specified, the runtime default is used. +- `Strict`: This policy uses _only_ the provided `SupplementalGroups` (including `FsGroup`) as supplemental groups for the first container process. No groups are extracted from the image. + +Note that both policies diverge from the semantics of [`config.User` OCI image configuration](https://github.com/opencontainers/image-spec/blob/3a7f492d3f1bcada656a7d8c08f3f9bbd05e7406/config.md#:~:text=User%20string%2C%20OPTIONAL). The purpose is to follow the "principle of least surprise", as described in the previous section. + +#### User in ContainerStatus + +To let users and administrators know which identities are actually attached to the container process, this KEP proposes a new `User` field in `ContainerStatus`. `User` is an object that consists of `Uid`, `Gid`, and `SupplementalGroups` fields for Linux containers. This helps users identify unexpected identities. The field is derived from the CRI response (see the [user in ContainerStatus](#user-in-containerstatus-1) section). + +### CRI + +#### SupplementalGroupsPolicy in SecurityContext + +Symmetrical changes are needed in CRI. See the [Design Details](#design-details) section. 
+ +#### user in ContainerStatus + +To propagate the identities of the container process to `ContainerStatus` in the Kubernetes API, CRI changes are needed. This KEP proposes to define a `ContainerUser` data type and add a `user` field to `ContainerStatus`, which is used in the response of the `ContainerStatus` method. `ContainerUser` consists of `Uid`, `Gid` and `SupplementalGroups` fields. + +```protobuf +// service RuntimeService { +// rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {} +// ... +// } +// message ContainerStatusResponse { +// ContainerStatus status = 1; +// ... +// } + +message ContainerStatus { + ... + // user information of the container process + ContainerUser user = ?; +} + +message ContainerUser { + // details in "Design Details" section +} +``` + +### User Stories + + + +#### Story 1: Deploy a Security Policy to enforce `SupplementalGroupsPolicy` field + +Assume a multi-tenant Kubernetes cluster with `hostPath` volumes in the following situation: + +- The multi-tenant model is namespace-based (namespace per tenant (user/group) model) + - access to each namespace is controlled by RBAC +- PSP (or another policy engine) is enforced in each namespace and protects + - `runAsUser`, `runAsGroup`, `fsGroup`, `supplementalGroups` values +- A `hostPath` volume (say `/mnt/hostpath`) is maintained on all the nodes by administrators + - with permission `drwxr-xr-x nobody nogroup /mnt/hostpath` + - the directory mounts an NFS volume shared by all the tenants, and UIDs/GIDs are managed by the cluster administrators + - Any tenant CAN create a directory under this directory +- There is a `/mnt/hostpath/private-to-gid-60000` which is fully private to `gid=60000` + - i.e. 
its permission is `drwxrwx--- nobody 60000 /mnt/hostpath/private-to-gid-60000` +- There is a `user-alice` namespace for `alice(uid=1000)`, and `alice` belongs only to `group-a(gid=50000)` +- The cluster administrator enforces a policy for Pods with `/mnt/hostpath` `hostPath` volumes in the `user-alice` namespace such that + - `runAsUser, runAsGroup` must be `1000` + - `supplementalGroups` must be `[60000]` + - `fsGroup` must be one of `1000, 60000` + - i.e. the cluster administrator expects that all the container processes can only have `60000` as supplementary groups in the `user-alice` namespace + +As described in the [Summary](#summary) section, `alice` can bypass the restriction by using a custom image. To mitigate the scenario, cluster administrators can deploy a security policy restricting `supplementalGroupsPolicy` in the `user-alice` namespace such that: + - `runAsUser, runAsGroup` must be `1000` + - `supplementalGroups` must be `[60000]` + - _this is not enough to avoid bypassing supplementary groups for container processes_ + - __`supplementalGroupsPolicy` must be `Strict`__ + - __this is really needed to avoid the bypass completely__ + - `fsGroup` must be one of `1000, 60000` + +Please note that a security policy without `supplementalGroupsPolicy` would lead to unexpected groups for the first process in the containers. + +### Notes/Constraints/Caveats (Optional) + + + +The proposal affects CRI implementations (e.g., containerd, cri-o, gVisor). + +### Risks and Mitigations + + + +- How to track the support status in CRI implementations of this proposal? + - This feature is mainly implemented inside each CRI implementation. +- How to feature-gate this feature in CRI implementations? + +## Design Details + +### Kubernetes API + +#### SupplementalGroupsPolicy in PodSecurityContext + +A new field named `SupplementalGroupsPolicy` will be introduced to `PodSecurityContext`: + +```go +type PodSecurityContext struct { + ... 
+ // A list of groups applied to the first process run in each container. + // supplementalGroupsPolicy can control how groups will be calculated. + // Note that this field cannot be set when spec.os.name is windows. + // +optional + SupplementalGroups []int64 + // supplementalGroupsPolicy defines how supplemental groups of the first + // container processes are calculated. + // Valid values are "Merge" and "Strict". + // If not specified, "Merge" is used. + // Note that this field cannot be set when spec.os.name is windows. + // +optional + SupplementalGroupsPolicy *PodSecurityGroupsPolicy +} + +type PodSecurityGroupsPolicy string +const ( + // SecurityGroupsPolicyMerge policy always merges + // the provided SupplementalGroups (including FsGroup) + // with groups of the primary user from the container image(`/etc/group`). + // Note: The primary user is specified with RunAsUser. + // If not specified, the user from the image config is used. + // If neither is specified, the runtime default is used. + SecurityGroupsPolicyMerge PodSecurityGroupsPolicy = "Merge" + + // SecurityGroupsPolicyStrict policy uses only + // the provided SupplementalGroups (including FsGroup) + // as supplemental groups for the first container process. + // No groups are extracted from the container image. + SecurityGroupsPolicyStrict PodSecurityGroupsPolicy = "Strict" +) +``` + +#### User in ContainerStatus + +```golang +type ContainerStatus struct { +... + // User indicates identities of the container process + User ContainerUser +} +``` + +```golang +type ContainerUser struct { + // Linux holds identity information of the process of the containers in Linux. + // Note that this field cannot be set when spec.os.name is windows. + Linux *LinuxContainerUser + + // Windows holds identity information of the process of the containers in Windows + // This is just reserved for future use. 
+ // Windows *WindowsContainerUser +} + +type LinuxContainerUser struct { + // Uid is the primary uid of the container process + Uid int64 + // Gid is the primary gid of the container process + Gid int64 + // SupplementalGroups are the supplemental groups attached to the container process + SupplementalGroups []int64 +} + +// This is just reserved for future use. +// type WindowsContainerUser struct { +// T.B.D. +// } +``` + +### CRI + +#### SupplementalGroupsPolicy in SecurityContext + +The CRI spec (`v1`) also needs to be updated similarly, as follows. Comments are omitted because they mirror those of the Pod API above. + +```proto +enum SupplementalGroupsPolicy { + Merge = 0; + Strict = 1; +} + +message LinuxContainerSecurityContext { +... + repeated int64 supplemental_groups; + optional SupplementalGroupsPolicy supplemental_groups_policy; +} + +message LinuxSandboxSecurityContext { +... + repeated int64 supplemental_groups; + optional SupplementalGroupsPolicy supplemental_groups_policy; +} +``` + +#### user in ContainerStatus + +```protobuf + +message ContainerStatus { + ... + // User holds user information of the container process + ContainerUser user = ??; +} + +message ContainerUser { + // User information of Linux containers. + LinuxContainerUser linux = 1; + // User information of Windows containers. + // This is just reserved for future use. + // WindowsContainerUser windows = 2; +} + + +message LinuxContainerUser { + // uid is the primary uid of the container process + Int64Value uid = 1; + // gid is the primary gid of the container process + Int64Value gid = 2; + // supplemental_groups are the supplemental groups attached to the container process + repeated int64 supplemental_groups = 3; +} + +// message WindowsContainerUser { +// T.B.D. 
+// } +``` + +### Test Plan + + + +[ ] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +- ``: `` - `` + +##### Integration tests + + + +- : + +##### e2e tests + + + +- : + +### Graduation Criteria + + + +### Upgrade / Downgrade Strategy + + + +### Version Skew Strategy + + + +- CRI must support this feature, especially when using `SupplementalGroupsPolicy=Strict`. +- kubelet must be at least the version of control-plane components. + +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? + + + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled). + +###### Does enabling the feature change any default behavior? + + + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + + + +###### What happens if we reenable the feature if it was previously rolled back? + +###### Are there any tests for feature enablement/disablement? + + + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? 
+ + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + + + +###### Will enabling / using this feature result in introducing new API types? + + + +###### Will enabling / using this feature result in any new calls to the cloud provider? + + + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + + + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? 
+ +## Implementation History + + + +## Drawbacks + + + +## Alternatives + + + +### Introducing `RuntimeClass` + +As described in the [Motivation](#motivation) section, cluster administrators would need to deploy a custom low-level container runtime (e.g., [pfnet-research/strict-supplementalgroups-container-runtime](https://github.com/pfnet-research/strict-supplementalgroups-container-runtime)) that modifies the OCI container runtime spec (`config.json`) produced by CRI implementations (e.g., containerd, cri-o). A custom `RuntimeClass` would be introduced for it. + +### Adjusting container image by users + +Users could modify their container images to control the supplemental groups (i.e., modifying the group memberships of the container's uid). However, this is more work, and users won't always have the option to do so. + +### Just fixing CRI implementations + +We could just fix CRI implementations directly without introducing new APIs. The advantage is that no API changes are needed at either the Kubernetes or the CRI level. However, the main downside of this approach is that it is a breaking change that would confuse users. + +## Infrastructure Needed (Optional) + + + +N/A \ No newline at end of file diff --git a/keps/sig-node/3619-supplemental-groups-policy/kep.yaml b/keps/sig-node/3619-supplemental-groups-policy/kep.yaml new file mode 100644 index 00000000000..d77a08d9dda --- /dev/null +++ b/keps/sig-node/3619-supplemental-groups-policy/kep.yaml @@ -0,0 +1,44 @@ +title: "Fine grained SupplementalGroups control" +kep-number: 3619 +authors: + - "@everpeace" +owning-sig: sig-xyz +participating-sigs: + - sig-node +status: provisional +creation-date: 2022-10-14 +reviewers: + - "@thockin" + - "@mrunalp" + - "@SergeyKanzhelev" +approvers: + - "@mrunalp" + +see-also: [] +replaces: [] + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. 
This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.27" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.27" + beta: "v1.xx" + stable: "v1.yy" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: SupplementalGroupsPolicy + components: + - kube-apiserver + - kubelet +disable-supported: true + +# The following PRR answers are required at beta release +metrics: [] From b598ea516881b0b101c9e2009478d839feb1ba50 Mon Sep 17 00:00:00 2001 From: Shingo Omura Date: Thu, 2 Feb 2023 17:24:04 +0900 Subject: [PATCH 2/4] add PRR approval request file and filled out PRR questionnaire just for alpha relevant sections. --- keps/prod-readiness/sig-node/3619.yaml | 6 +++++ .../3619-supplemental-groups-policy/README.md | 25 ++++++++++++++++--- 2 files changed, 28 insertions(+), 3 deletions(-) create mode 100644 keps/prod-readiness/sig-node/3619.yaml diff --git a/keps/prod-readiness/sig-node/3619.yaml b/keps/prod-readiness/sig-node/3619.yaml new file mode 100644 index 00000000000..5498ea6c31b --- /dev/null +++ b/keps/prod-readiness/sig-node/3619.yaml @@ -0,0 +1,6 @@ +# The KEP must have an approver from the +# "prod-readiness-approvers" group +# of http://git.k8s.io/enhancements/OWNERS_ALIASES +kep-number: 3619 +alpha: + approver: "@johnbelamaric" diff --git a/keps/sig-node/3619-supplemental-groups-policy/README.md b/keps/sig-node/3619-supplemental-groups-policy/README.md index 55e02dbc7a6..53bd74fece7 100644 --- a/keps/sig-node/3619-supplemental-groups-policy/README.md +++ b/keps/sig-node/3619-supplemental-groups-policy/README.md @@ -671,9 +671,9 @@ well as the [existing list] of feature gates. 
[existing list]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/ --> -- [ ] Feature gate (also fill in values in `kep.yaml`) - - Feature gate name: - - Components depending on the feature gate: +- [x] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: SupplementalGroupsPolicy + - Components depending on the feature gate: kube-apiserver, kubelet (and CRI implementations, e.g. containerd, cri-o) - [ ] Other - Describe the mechanism: - Will enabling / disabling the feature require downtime of the control @@ -687,6 +687,7 @@ well as the [existing list] of feature gates. Any change of default behavior may be surprising to users or break existing automations, so be extremely careful here. --> +No. This KEP only introduces new API fields in the Pod spec and CRI, which do NOT change the default behavior. ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? @@ -701,8 +702,12 @@ feature. NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`. --> +Yes. It can be disabled after being enabled. However, users should be aware that the gids of container processes in pods with the `Strict` policy would change. This means that disabling the feature might break applications that depend on the resulting group permissions. We plan to provide a way for users to detect which pods are affected. + ###### What happens if we reenable the feature if it was previously rolled back? +Only the `Strict` policy is reenabled. Users should be aware that the gids of container processes in pods with the `Strict` policy would change. This means that reenabling the feature might break applications that depend on the resulting group permissions. We plan to provide a way for users to detect which pods are affected. + ###### Are there any tests for feature enablement/disablement? +Planned for Alpha. + ### Rollout, Upgrade and Rollback Planning +No. This KEP only introduces new API fields in the Pod spec and CRI, which do NOT change the default behavior. 
+ ###### Will enabling / using this feature result in introducing new API types? +No. + ###### Will enabling / using this feature result in any new calls to the cloud provider? +No. + ###### Will enabling / using this feature result in increasing size or count of the existing API objects? +No. + ###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? +No. + ###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? +No. + ### Troubleshooting