From 50089ac4f4a24c9b9c21d209f2647e95f66f3483 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Mon, 9 Jan 2023 17:52:47 -0800 Subject: [PATCH 01/13] KEP 3716: Webhook Predicates --- .../3716-webhook-predicates/README.md | 857 ++++++++++++++++++ .../3716-webhook-predicates/kep.yaml | 51 ++ 2 files changed, 908 insertions(+) create mode 100644 keps/sig-api-machinery/3716-webhook-predicates/README.md create mode 100644 keps/sig-api-machinery/3716-webhook-predicates/kep.yaml diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md new file mode 100644 index 00000000000..c0f1be3b554 --- /dev/null +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -0,0 +1,857 @@ +# KEP-3716: Webhook Predicates + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [User Stories](#user-stories) + - [Exclude resources from a wildcard rule](#exclude-resources-from-a-wildcard-rule) + - [Exempt system users from security policy](#exempt-system-users-from-security-policy) + - [Scope an NFS access management webhook to Pods mounting NFS volumes](#scope-an-nfs-access-management-webhook-to-pods-mounting-nfs-volumes) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [API](#api) + - [Risks and Mitigations](#risks-and-mitigations) + - [Security](#security) + - [Debugability](#debugability) + - [Performance](#performance) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) + - [Resource Exclusions](#resource-exclusions) + - [CEL Admission Control](#cel-admission-control) + + +## Release Signoff Checklist + +Items marked with (R) are required *prior to targeting to a milestone / release*. + +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests for meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + +This KEP proposes adding "predicate" expressions to admission webhooks, as an extension to the +existing `rules` to define the scope of a webhook. A `predicate` is a +[CEL](https://github.com/google/cel-spec) expression that must evaluate to true for the admission +request to be sent to the webhook. If a `predicate` evaluates to false, the webhook is skipped for +that request (implicitly allowed). + +<<[UNRESOLVED naming ]>> +I'm not satisfied with the name `predicates` for this feature. Other ideas considered: +- filters, prefilters +- match{Predicates, Filters, Criteria, Expressions, Constraints, ...} +<<[/UNRESOLVED]>> + +## Motivation + +**Reliability:** Admission webhooks continue to be an operational sore spot for many Kubernetes +users. Webhooks that target cluster critical resources put the admission controller backing the +webhook in the critical path of cluster stability. Even if tools like namespace scoping are used to +avoid circular-dependencies and exclude critical system resources, a webhook outage can still have a +major impact on cluster availability. This proposal aims to mitigate (but not eliminate) these +issues by allowing webhooks to be more narrowly scoped and targeted. + +**Performance:** Admission webhooks sit in the critical request path for write-requests. Validating +webhooks can be run in parallel, but Mutating webhooks must be run in serial (up to 2 times!). This +makes webhooks extremely latency sensitive, and even a webhook that doesn't do any work still needs +to pay the network round-trip cost. + +**Supportability:** For hosted or managed Kubernetes distributions, webhooks can be a problem when +they interfere with requests by managed components. The existing criteria for filtering out requests +are insufficient for many use cases, and aren't easily appended with provider rules. + +_What about [CEL for Admission Control](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/3488-cel-admission-control)?_ +`ValidatingAdmissionPolicy` is an exciting new feature which we hope will greatly reduce the need +for admission webhooks, but it is intentionally not attempting to cover every possible use case. +This proposal aims to improve the situation for those webhooks that cannot be migrated. + +### User Stories + +#### Exclude resources from a wildcard rule + +> I want to enforce metadata policy through an admission webhook without adding latency & risk to +> high QPS system requests. + +Currently, if a webhook uses wildcard match rules, there is no way to filter out a subset of +resources or requests from matching the wildcard. If the webhook instead enumerates every resource +that should match, it must be kept up-to-date with every CRD that's added. + +With CEL predicates, the webhook could specify wildcard match rules, and add predicates to filter +out the desired resources: + +```yaml +rules: + # Match CREATE & UPDATE on all resources: + - operations: + - CREATE + - UPDATE + apiGroups: '*' + apiVersions: '*' + resources: '*' +predicates: + # Exclude leases from the webhook + - expression: '!(request.resource.group == "coordination.k8s.io" && resource.resource == "leases")' +``` + +#### Exempt system users from security policy + +> As a managed cluster provider, I want to prevent user webhooks from intercepting critical system +> requests. + +System _resources_ can currently be exempted through a namespace or label selector, but requests by +system components against non-system resources cannot be. For example, update pod status requests by +Kubelets cannot be excluded from user webhooks intercepting all pod requests. + +With `predicates`, a managed cluster could append system-exclusion rules to each webhook. For example: + +```yaml +predicates: + # Exclude node requests from the webhook + - expression: '!("system:nodes" in request.userInfo.groups)' +``` + +#### Scope an NFS access management webhook to Pods mounting NFS volumes + +> I want to narrowly scope my webhook to only the relevant requests, in order to reduce load on the +> webhook and reduce latency in irrelevant requests. + +Concrete example: + +> A NFS deployment uses an third-party access management system. I have an admission webhook that +> performs an access check for against the external system for pods that mount NFS volumes. Only +> pods with NFS volumes need to be checked. + +Currently, there is no way to achieve this. Many webhook implementations today start by checking +that the request is within scope, and return early if it's not. This adds latency and an additional +failure point to irrelevant requests. This example requires an external integration, and thus is not +a candidate for migration to CEL `ValidatingAdmissionPolicy`. + +With predicates, the predicate expressions can check whether the request object is in-scope for the +webhook: + +```yaml +rules: + - operations: ['CREATE'] + apiGroups: '' # core + apiVersions: '*' + resources: 'pods' +predicates: + # Only include pods with an NFS volume. + - expression: 'request.object.spec.volumes.exists(v, v.has(nfs))' +``` + +### Goals + +1. Provide a filtering mechanism for excluding requests from an admission webhook +2. Maintain consistency with `ValidatingAdmissionPolicy` + +### Non-Goals + +* Provide a mechanism to exclude requests from all webhooks. + +## Proposal + +### API + +Both `ValidatingWebhook` and `MutatingWebhook` (in `admissionregistration.k8s.io`) will be updated +with a new `Predicates` field: + +```go + +type ValidatingWebhook struct { + // ... + + // Predicates is a list of conditions on the AdmissionRequest ('request') that must be met for a + // request to be sent to this webhook. + // +optional + Predicates []Predicate `json:"predicates,omitempty"` +} + +type MutatingWebhook struct { + // ... + Predicates []Predicate `json:"predicates,omitempty"` +} + +// Predicate represents a condition which must by fulfilled for a request to be sent to a webhook. +type Predicate struct { + // Expression represents the expression which will be evaluated by CEL. + // ref: https://github.com/google/cel-spec + // CEL expressions have access to the contents of the AdmissionRequest, organized into CEL variables: + // + //'request' - Attributes of the admission request([ref](/pkg/apis/admission/types.go#AdmissionRequest)). + // + // The `apiVersion`, `kind`, `metadata.name` and `metadata.generateName` are always accessible from the root of the + // request.object. No other metadata properties are accessible. + // + // Only property names of the form `[a-zA-Z_.-/][a-zA-Z0-9_.-/]*` are accessible. + // Accessible property names are escaped according to the following rules when accessed in the expression: + // - '__' escapes to '__underscores__' + // - '.' escapes to '__dot__' + // - '-' escapes to '__dash__' + // - '/' escapes to '__slash__' + // - Property names that exactly match a CEL RESERVED keyword escape to '__{keyword}__'. The keywords are: + // "true", "false", "null", "in", "as", "break", "const", "continue", "else", "for", "function", "if", + // "import", "let", "loop", "package", "namespace", "return". + // Examples: + // - Expression accessing a property named "namespace": {"Expression": "object.__namespace__ > 0"} + // - Expression accessing a property named "x-p + +``` + +The predicate expression has access to the contents of the `AdmissionRequest` object (exposed as the +`request` variable), but is not given any additional information. Expressions requiring access to +additional information (such as a paramater object) must be performed in the webhook, and are out of +scope for this proposal. + +<<[UNRESOLVED api structure ]>> +Do we need the predicate struct? Or should we just inline it instead? That is: +```go +type ValidatingWebhook struct { + Predicates []string `json:"predicates,omitempty"` +} +``` +<<[/UNRESOLVED]>> + +### Risks and Mitigations + +#### Security + +**Risk: Attacker adds or changes a predicate to weaken an admission policy.** + +This is does not represent a new threat, as doing so would require update access to the admission +registration object, and with that permission an attacker could already disable the policy through +manipulating match rules, namespace selector, or object selector (or reroute the webhook entirely). + +**Risk: Logic error in predicate expression.** + +Currently the predicate conditions must be encoded in the webhook backend itself. Moving the logic +into a CEL expression does not materially increase the risk of a logic bug. + +#### Debugability + +We do not normally log, audit, or emit an event when a webhook is out-of-scope for a request, and +the same will _mostly_ be true for predicates. + +At [log level V(5)](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/logging.md#what-method-to-use), +we will emit a log when a request that would otherwise be in-scope for a webhook is excluded for a +non-matching predicate. + +Short of increasing log verbosity, the recommended debug strategy is to capture or reproduce a +relevant AdmissionRequest (for example, in a non-prod cluster disable all predicates and log the +requests from a webhook). Then, manually test the predicates against the request, and iterate as +necessary. + +#### Performance + +The CEL expression evaluation will leverage the same [Resource Constraints](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/2876-crd-validation-expression-language#resource-constraints) +used by CEL CRD Validation & CEL Admission Control. All the predicates for a given webhook will +share the same resource budget. + +<<[UNRESOLVED resource constraints ]>> +_NON-BLOCKING for Alpha_ +Details TBD. +<<[/UNRESOLVED]>> + +## Design Details + + + +### Test Plan + + + +[ ] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +- ``: `` - `` + +##### Integration tests + + + +- : + +##### e2e tests + + + +- : + +### Graduation Criteria + + + +### Upgrade / Downgrade Strategy + + + +### Version Skew Strategy + + + +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? + + + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled). + +###### Does enabling the feature change any default behavior? + + + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + + + +###### What happens if we reenable the feature if it was previously rolled back? + +###### Are there any tests for feature enablement/disablement? + + + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + + + +###### Will enabling / using this feature result in introducing new API types? + + + +###### Will enabling / using this feature result in any new calls to the cloud provider? + + + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + + + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? + +## Implementation History + + + +## Drawbacks + + + +## Alternatives + +### Resource Exclusions + +https://github.com/kubernetes/enhancements/pull/3694 Proposes an alternative approach using a more +structured format for expressing resource exclusions. This approach may be more approachable to +users who are not comfortable writing CEL expressions, but it is significantly less powerful. +This would address [Exclude resources from a wildcard rule](#exclude-resources-from-a-wildcard-rule), +and could be extended with subject exclusions to address +[Exempt system users from security policy](#exempt-system-users-from-security-policy), +but would not be sufficient to address +[Scope an NFS access management webhook to Pods mounting NFS volumes](#scope-an-nfs-access-management-webhook-to-pods-mounting-nfs-volumes). + +These two approaches are not mutually exclusive. + +### CEL Admission Control + +[KEP-3488: CEL for Admission Control](/keps/sig-api-machinery/3488-cel-admission-control) adds the +ability for admission webhooks to be replaced entirely by CEL expressions, but this is not intended +to cover 100% of webhook use cases. For example, the user story described in +[Scope an NFS access management webhook to Pods mounting NFS volumes](#scope-an-nfs-access-management-webhook-to-pods-mounting-nfs-volumes) +requires integrating with a third-party system, and is not implementable through a CEL +ValidatingAdmissionPolicy. + +With a mutating CEL admission policy (not yet implemented), a combination of mutating & validating +policies could ensure that objects have a designated scoping label applied, which could be filtered +using the `ObjectSelector` on the webhook. However, such an approach adds a lot of overhead and +complexity beyond this proposal. diff --git a/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml new file mode 100644 index 00000000000..daa055f6006 --- /dev/null +++ b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml @@ -0,0 +1,51 @@ +title: KEP Template +kep-number: 3716 +authors: + - "@tallclair" +owning-sig: sig-api-machinery +participating-sigs: +status: provisional +creation-date: 2023-01-09 +reviewers: + - TBD +approvers: + - TBD + +##### WARNING !!! ###### +# prr-approvers has been moved to its own location +# You should create your own in keps/prod-readiness +# Please make a copy of keps/prod-readiness/template/nnnn.yaml +# to keps/prod-readiness/sig-xxxxx/00000.yaml (replace with kep number) +#prr-approvers: + +see-also: + - "/keps/sig-api-machinery/2876-crd-validation-expression-language" + - "/keps/sig-api-machinery/3488-cel-admission-control" + - https://github.com/kubernetes/enhancements/pull/3697 + - https://github.com/kubernetes/enhancements/pull/3694 + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.27" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.27" + beta: "TBD" + stable: "TBD" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: WebhookPredicates + components: + - kube-apiserver +disable-supported: true + +# The following PRR answers are required at beta release +metrics: + - TBD From 2fa817b17ddb0e86b2ee874468cec2ab15319d8e Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Tue, 10 Jan 2023 10:44:57 -0800 Subject: [PATCH 02/13] Add sig-auth as participating SIG --- keps/sig-api-machinery/3716-webhook-predicates/kep.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml index daa055f6006..4660b0bd735 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml +++ b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml @@ -4,6 +4,7 @@ authors: - "@tallclair" owning-sig: sig-api-machinery participating-sigs: + - sig-auth status: provisional creation-date: 2023-01-09 reviewers: From c408bc87ef86818f61ccbaa0ac8d653d04ab44a4 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Thu, 12 Jan 2023 08:55:29 -0800 Subject: [PATCH 03/13] Fix API definition --- .../3716-webhook-predicates/README.md | 23 ++++++++++--------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index c0f1be3b554..1acb4ac9295 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -234,17 +234,18 @@ type Predicate struct { // Examples: // - Expression accessing a property named "namespace": {"Expression": "object.__namespace__ > 0"} // - Expression accessing a property named "x-p - + // - Expression accessing a property named "redact__d": {"Expression": "object.redact__underscores__d > 0"} + // + // Equality on arrays with list type of 'set' or 'map' ignores element order, i.e. [1, 2] == [2, 1]. + // Concatenation on arrays with x-kubernetes-list-type use the semantics of the list type: + // - 'set': `X + Y` performs a union where the array positions of all elements in `X` are preserved and + // non-intersecting elements in `Y` are appended, retaining their partial order. + // - 'map': `X + Y` performs a merge where the array positions of all keys in `X` are preserved but the values + // are overwritten by values in `Y` when the key sets of `X` and `Y` intersect. Elements in `Y` with + // non-intersecting keys are appended, retaining their partial order. + // Required. + Expression string `json:"expression"` +} ``` The predicate expression has access to the contents of the `AdmissionRequest` object (exposed as the From 45cd6fb265e9a7ce0115435bdc139e8157f3569f Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Wed, 18 Jan 2023 16:05:00 -0800 Subject: [PATCH 04/13] Address feedback --- .../3716-webhook-predicates/README.md | 49 ++++++++++++------- 1 file changed, 31 insertions(+), 18 deletions(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index 1acb4ac9295..009ce0ab19f 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -1,4 +1,4 @@ -# KEP-3716: Webhook Predicates +# KEP-3716: Admission Webhook Predicates - [Release Signoff Checklist](#release-signoff-checklist) @@ -35,6 +35,7 @@ - [Implementation History](#implementation-history) - [Drawbacks](#drawbacks) - [Alternatives](#alternatives) + - [Exclusion Expressions](#exclusion-expressions) - [Resource Exclusions](#resource-exclusions) - [CEL Admission Control](#cel-admission-control) @@ -90,6 +91,12 @@ avoid circular-dependencies and exclude critical system resources, a webhook out major impact on cluster availability. This proposal aims to mitigate (but not eliminate) these issues by allowing webhooks to be more narrowly scoped and targeted. + +The same benefits also apply to bootstrapping a new cluster, or in a disaster-recovery scenario +where objects in a cluster need to be recreated from source code. Being able to define and deploy a +webhook before any workload Pods are defined means that admins don't need to bypass security +controls to fix things, and helps to ensure there's never a security gap when restoring service. + **Performance:** Admission webhooks sit in the critical request path for write-requests. Validating webhooks can be run in parallel, but Mutating webhooks must be run in serial (up to 2 times!). This makes webhooks extremely latency sensitive, and even a webhook that doesn't do any work still needs @@ -217,6 +224,8 @@ type Predicate struct { // ref: https://github.com/google/cel-spec // CEL expressions have access to the contents of the AdmissionRequest, organized into CEL variables: // + //'object' - The object from the incoming request. The value is null for DELETE requests. + //'oldObject' - The existing object. The value is null for CREATE requests. //'request' - Attributes of the admission request([ref](/pkg/apis/admission/types.go#AdmissionRequest)). // // The `apiVersion`, `kind`, `metadata.name` and `metadata.generateName` are always accessible from the root of the @@ -233,7 +242,7 @@ type Predicate struct { // "import", "let", "loop", "package", "namespace", "return". // Examples: // - Expression accessing a property named "namespace": {"Expression": "object.__namespace__ > 0"} - // - Expression accessing a property named "x-p + // - Expression accessing a property named "x-prop": {"Expression": "object.x__dash__prop > 0"} // - Expression accessing a property named "redact__d": {"Expression": "object.redact__underscores__d > 0"} // // Equality on arrays with list type of 'set' or 'map' ignores element order, i.e. [1, 2] == [2, 1]. @@ -253,15 +262,6 @@ The predicate expression has access to the contents of the `AdmissionRequest` ob additional information (such as a paramater object) must be performed in the webhook, and are out of scope for this proposal. -<<[UNRESOLVED api structure ]>> -Do we need the predicate struct? Or should we just inline it instead? That is: -```go -type ValidatingWebhook struct { - Predicates []string `json:"predicates,omitempty"` -} -``` -<<[/UNRESOLVED]>> - ### Risks and Mitigations #### Security @@ -830,15 +830,28 @@ Why should this KEP _not_ be implemented? ## Alternatives +### Exclusion Expressions + +The `predicate` expression could be inverted, so that requests that match are excluded rather than +included. In this case, we would probably also want to change from requiring all expressions to +match, to excluding the request if any match. + +Although this approach would simplify some usecases, such as +[excluding resources from a wildcard rule](#exclude-resources-from-a-wildcard-rule) or +[exempting system users from a security policy](#exempt-system-users-from-security-policy), it +means other expressions would become double-negatives, which generally goes against API design +best-practices. + ### Resource Exclusions -https://github.com/kubernetes/enhancements/pull/3694 Proposes an alternative approach using a more -structured format for expressing resource exclusions. This approach may be more approachable to -users who are not comfortable writing CEL expressions, but it is significantly less powerful. -This would address [Exclude resources from a wildcard rule](#exclude-resources-from-a-wildcard-rule), -and could be extended with subject exclusions to address -[Exempt system users from security policy](#exempt-system-users-from-security-policy), -but would not be sufficient to address +[KEP-3693](https://github.com/kubernetes/enhancements/issues/3693) Proposes an alternative approach +using a more structured format for expressing resource exclusions. This approach may be more +approachable to users who are not comfortable writing CEL expressions, but it is significantly less +powerful. This would address +[Exclude resources from a wildcard rule](#exclude-resources-from-a-wildcard-rule), +and could be extended with subject exclusions to +address [Exempt system users from security policy](#exempt-system-users-from-security-policy), but +would not be sufficient to address [Scope an NFS access management webhook to Pods mounting NFS volumes](#scope-an-nfs-access-management-webhook-to-pods-mounting-nfs-volumes). These two approaches are not mutually exclusive. From b501dc4c547078f4b580ab59c1ebc31071a0d551 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Wed, 18 Jan 2023 16:11:05 -0800 Subject: [PATCH 05/13] Rename predicates to matchConditions --- .../3716-webhook-predicates/README.md | 62 +++++++++---------- 1 file changed, 28 insertions(+), 34 deletions(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index 009ce0ab19f..8df199daa05 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -1,4 +1,4 @@ -# KEP-3716: Admission Webhook Predicates +# KEP-3716: Admission Webhook Match Conditions - [Release Signoff Checklist](#release-signoff-checklist) @@ -70,18 +70,12 @@ Items marked with (R) are required *prior to targeting to a milestone / release* ## Summary -This KEP proposes adding "predicate" expressions to admission webhooks, as an extension to the -existing `rules` to define the scope of a webhook. A `predicate` is a +This KEP proposes adding "match conditions" to admission webhooks, as an extension to the +existing `rules` to define the scope of a webhook. A `matchCondition` is a [CEL](https://github.com/google/cel-spec) expression that must evaluate to true for the admission -request to be sent to the webhook. If a `predicate` evaluates to false, the webhook is skipped for +request to be sent to the webhook. If a `matchCondition` evaluates to false, the webhook is skipped for that request (implicitly allowed). -<<[UNRESOLVED naming ]>> -I'm not satisfied with the name `predicates` for this feature. Other ideas considered: -- filters, prefilters -- match{Predicates, Filters, Criteria, Expressions, Constraints, ...} -<<[/UNRESOLVED]>> - ## Motivation **Reliability:** Admission webhooks continue to be an operational sore spot for many Kubernetes @@ -122,8 +116,8 @@ Currently, if a webhook uses wildcard match rules, there is no way to filter out resources or requests from matching the wildcard. If the webhook instead enumerates every resource that should match, it must be kept up-to-date with every CRD that's added. -With CEL predicates, the webhook could specify wildcard match rules, and add predicates to filter -out the desired resources: +With CEL match conditions, the webhook could specify wildcard match rules, and add match conditions +to filter out the desired resources: ```yaml rules: @@ -134,7 +128,7 @@ rules: apiGroups: '*' apiVersions: '*' resources: '*' -predicates: +matchConditions: # Exclude leases from the webhook - expression: '!(request.resource.group == "coordination.k8s.io" && resource.resource == "leases")' ``` @@ -148,10 +142,10 @@ System _resources_ can currently be exempted through a namespace or label select system components against non-system resources cannot be. For example, update pod status requests by Kubelets cannot be excluded from user webhooks intercepting all pod requests. -With `predicates`, a managed cluster could append system-exclusion rules to each webhook. For example: +With `matchConditions`, a managed cluster could append system-exclusion rules to each webhook. For example: ```yaml -predicates: +matchConditions: # Exclude node requests from the webhook - expression: '!("system:nodes" in request.userInfo.groups)' ``` @@ -172,7 +166,7 @@ that the request is within scope, and return early if it's not. This adds latenc failure point to irrelevant requests. This example requires an external integration, and thus is not a candidate for migration to CEL `ValidatingAdmissionPolicy`. -With predicates, the predicate expressions can check whether the request object is in-scope for the +With match conditions, the expressions can check whether the request object is in-scope for the webhook: ```yaml @@ -181,7 +175,7 @@ rules: apiGroups: '' # core apiVersions: '*' resources: 'pods' -predicates: +matchConditions: # Only include pods with an NFS volume. - expression: 'request.object.spec.volumes.exists(v, v.has(nfs))' ``` @@ -200,26 +194,26 @@ predicates: ### API Both `ValidatingWebhook` and `MutatingWebhook` (in `admissionregistration.k8s.io`) will be updated -with a new `Predicates` field: +with a new `MatchConditions` field: ```go type ValidatingWebhook struct { // ... - // Predicates is a list of conditions on the AdmissionRequest ('request') that must be met for a + // MatchConditions is a list of conditions on the AdmissionRequest ('request') that must be met for a // request to be sent to this webhook. // +optional - Predicates []Predicate `json:"predicates,omitempty"` + MatchConditions []MatchCondition `json:"matchConditions,omitempty"` } type MutatingWebhook struct { // ... - Predicates []Predicate `json:"predicates,omitempty"` + MatchConditions []MatchCondition `json:"matchConditions,omitempty"` } -// Predicate represents a condition which must by fulfilled for a request to be sent to a webhook. -type Predicate struct { +// MatchCondition represents a condition which must by fulfilled for a request to be sent to a webhook. +type MatchCondition struct { // Expression represents the expression which will be evaluated by CEL. // ref: https://github.com/google/cel-spec // CEL expressions have access to the contents of the AdmissionRequest, organized into CEL variables: @@ -257,7 +251,7 @@ type Predicate struct { } ``` -The predicate expression has access to the contents of the `AdmissionRequest` object (exposed as the +The match condition expression has access to the contents of the `AdmissionRequest` object (exposed as the `request` variable), but is not given any additional information. Expressions requiring access to additional information (such as a paramater object) must be performed in the webhook, and are out of scope for this proposal. @@ -266,35 +260,35 @@ scope for this proposal. #### Security -**Risk: Attacker adds or changes a predicate to weaken an admission policy.** +**Risk: Attacker adds or changes a match condition to weaken an admission policy.** This is does not represent a new threat, as doing so would require update access to the admission registration object, and with that permission an attacker could already disable the policy through manipulating match rules, namespace selector, or object selector (or reroute the webhook entirely). -**Risk: Logic error in predicate expression.** +**Risk: Logic error in match condition expression.** -Currently the predicate conditions must be encoded in the webhook backend itself. Moving the logic +Currently the match conditions must be encoded in the webhook backend itself. Moving the logic into a CEL expression does not materially increase the risk of a logic bug. #### Debugability We do not normally log, audit, or emit an event when a webhook is out-of-scope for a request, and -the same will _mostly_ be true for predicates. +the same will _mostly_ be true for match conditions. At [log level V(5)](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/logging.md#what-method-to-use), we will emit a log when a request that would otherwise be in-scope for a webhook is excluded for a -non-matching predicate. +non-matching match condition. Short of increasing log verbosity, the recommended debug strategy is to capture or reproduce a -relevant AdmissionRequest (for example, in a non-prod cluster disable all predicates and log the -requests from a webhook). Then, manually test the predicates against the request, and iterate as -necessary. +relevant AdmissionRequest (for example, in a non-prod cluster disable all match conditions and log +the requests from a webhook). Then, manually test the match conditions against the request, and +iterate as necessary. #### Performance The CEL expression evaluation will leverage the same [Resource Constraints](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/2876-crd-validation-expression-language#resource-constraints) -used by CEL CRD Validation & CEL Admission Control. All the predicates for a given webhook will +used by CEL CRD Validation & CEL Admission Control. All the match conditions for a given webhook will share the same resource budget. <<[UNRESOLVED resource constraints ]>> @@ -832,7 +826,7 @@ Why should this KEP _not_ be implemented? ### Exclusion Expressions -The `predicate` expression could be inverted, so that requests that match are excluded rather than +The `matchCondition` expression could be inverted, so that requests that match are excluded rather than included. In this case, we would probably also want to change from requiring all expressions to match, to excluding the request if any match. From 975b6c92d4b409f33e1f51b3dcea4d950eb63922 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Wed, 18 Jan 2023 16:14:14 -0800 Subject: [PATCH 06/13] Update kep.yaml, assign reviewers & approvers --- .../sig-api-machinery/3716-webhook-predicates/kep.yaml | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml index 4660b0bd735..850b4e6edfe 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml +++ b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml @@ -1,4 +1,4 @@ -title: KEP Template +title: Admission Webhook Match Conditions kep-number: 3716 authors: - "@tallclair" @@ -8,9 +8,11 @@ participating-sigs: status: provisional creation-date: 2023-01-09 reviewers: - - TBD + - jpbetz + - cici37 + - maxsmythe approvers: - - TBD + - deads2k ##### WARNING !!! ###### # prr-approvers has been moved to its own location @@ -42,7 +44,7 @@ milestone: # The following PRR answers are required at alpha release # List the feature gate name and the components for which it must be enabled feature-gates: - - name: WebhookPredicates + - name: AdmissionWebhookMatchConditions components: - kube-apiserver disable-supported: true From 9a837c42ee959b38bcebb0fd10f1c2828ee914b2 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Wed, 18 Jan 2023 16:20:12 -0800 Subject: [PATCH 07/13] Clarify expression API --- keps/sig-api-machinery/3716-webhook-predicates/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index 8df199daa05..f78ec4e96c2 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -251,10 +251,10 @@ type MatchCondition struct { } ``` -The match condition expression has access to the contents of the `AdmissionRequest` object (exposed as the -`request` variable), but is not given any additional information. Expressions requiring access to -additional information (such as a paramater object) must be performed in the webhook, and are out of -scope for this proposal. +The match condition expression is evaluated by the same libraries as those used for CEL +ValidatingAdmissionPolicy. The only difference in expressions is the availability of the `params` +variable. Expressions requiring access to additional information outside the AdmissionRequest must +be performed in the webhook, and are out of scope for this proposal. ### Risks and Mitigations From 5c45cc3e3da4a2c7f512be60721d7ead4ebb53e2 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Thu, 19 Jan 2023 11:23:54 -0800 Subject: [PATCH 08/13] Add jpbetz as approver --- keps/sig-api-machinery/3716-webhook-predicates/README.md | 2 +- keps/sig-api-machinery/3716-webhook-predicates/kep.yaml | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index f78ec4e96c2..6c60d3fd30d 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -214,7 +214,7 @@ type MutatingWebhook struct { // MatchCondition represents a condition which must by fulfilled for a request to be sent to a webhook. type MatchCondition struct { - // Expression represents the expression which will be evaluated by CEL. + // Expression represents the expression which will be evaluated by CEL. // ref: https://github.com/google/cel-spec // CEL expressions have access to the contents of the AdmissionRequest, organized into CEL variables: // diff --git a/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml index 850b4e6edfe..c8acaf4aa85 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml +++ b/keps/sig-api-machinery/3716-webhook-predicates/kep.yaml @@ -12,6 +12,7 @@ reviewers: - cici37 - maxsmythe approvers: + - jpbetz - deads2k ##### WARNING !!! ###### From ce389a54afa98fe91629a4f1982162d5c252cec1 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Thu, 26 Jan 2023 17:13:24 -0800 Subject: [PATCH 09/13] Add user story for secondary-authz conditions --- .../3716-webhook-predicates/README.md | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index 6c60d3fd30d..3a0c92de0fd 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -150,6 +150,20 @@ matchConditions: - expression: '!("system:nodes" in request.userInfo.groups)' ``` +Since the expression will be evaluated using a common Kubernetes CEL library, these expressions +should also get automatic access to the secondary authorization check mechanism described in +[KEP-3488: CEL for Admission Control](/keps/sig-api-machinery/3488-cel-admission-control#secondary-authz). +In practice, this means that RBAC bindings can be used to opt-out privileged users from security policy: + +_Note: The secondary authz mechanism has not yet been designed, and the example syntax here is just +to illustrate how it might be used._ + +```yaml +matchConditions: + # Exclude users with the 'breakglass' permission on the 'security-policy' webhook. + - expression: '!authorized(request.userInfo, "breakglass", "validatingwebhookconfigurations.admissionregistration.k8s.io/security-policy")' +``` + #### Scope an NFS access management webhook to Pods mounting NFS volumes > I want to narrowly scope my webhook to only the relevant requests, in order to reduce load on the @@ -223,7 +237,7 @@ type MatchCondition struct { //'request' - Attributes of the admission request([ref](/pkg/apis/admission/types.go#AdmissionRequest)). // // The `apiVersion`, `kind`, `metadata.name` and `metadata.generateName` are always accessible from the root of the - // request.object. No other metadata properties are accessible. + // object. No other metadata properties are accessible. // // Only property names of the form `[a-zA-Z_.-/][a-zA-Z0-9_.-/]*` are accessible. // Accessible property names are escaped according to the following rules when accessed in the expression: From 19321af7b159b835c7942c68015c5fd3e7afcbf9 Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Thu, 26 Jan 2023 17:31:51 -0800 Subject: [PATCH 10/13] Iterate on feedback --- .../3716-webhook-predicates/README.md | 50 ++++++++----------- 1 file changed, 20 insertions(+), 30 deletions(-) diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index 3a0c92de0fd..a433f466e49 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -191,7 +191,7 @@ rules: resources: 'pods' matchConditions: # Only include pods with an NFS volume. - - expression: 'request.object.spec.volumes.exists(v, v.has(nfs))' + - expression: 'object.spec.volumes.exists(v, v.has(nfs))' ``` ### Goals @@ -215,8 +215,9 @@ with a new `MatchConditions` field: type ValidatingWebhook struct { // ... - // MatchConditions is a list of conditions on the AdmissionRequest ('request') that must be met for a - // request to be sent to this webhook. + // MatchConditions is a list of conditions on the AdmissionRequest ('request') that must be met + // for a request to be sent to this webhook. All conditions in the list must evaluate to TRUE for + // the request to be matched. // +optional MatchConditions []MatchCondition `json:"matchConditions,omitempty"` } @@ -228,38 +229,16 @@ type MutatingWebhook struct { // MatchCondition represents a condition which must by fulfilled for a request to be sent to a webhook. type MatchCondition struct { + // NOTE: Placeholder documentation, to be replaced by https://github.com/kubernetes/website/issues/39089. + // // Expression represents the expression which will be evaluated by CEL. // ref: https://github.com/google/cel-spec // CEL expressions have access to the contents of the AdmissionRequest, organized into CEL variables: // - //'object' - The object from the incoming request. The value is null for DELETE requests. - //'oldObject' - The existing object. The value is null for CREATE requests. - //'request' - Attributes of the admission request([ref](/pkg/apis/admission/types.go#AdmissionRequest)). + // 'object' - The object from the incoming request. The value is null for DELETE requests. + // 'oldObject' - The existing object. The value is null for CREATE requests. + // 'request' - Attributes of the admission request([ref](/pkg/apis/admission/types.go#AdmissionRequest)). // - // The `apiVersion`, `kind`, `metadata.name` and `metadata.generateName` are always accessible from the root of the - // object. No other metadata properties are accessible. - // - // Only property names of the form `[a-zA-Z_.-/][a-zA-Z0-9_.-/]*` are accessible. - // Accessible property names are escaped according to the following rules when accessed in the expression: - // - '__' escapes to '__underscores__' - // - '.' escapes to '__dot__' - // - '-' escapes to '__dash__' - // - '/' escapes to '__slash__' - // - Property names that exactly match a CEL RESERVED keyword escape to '__{keyword}__'. The keywords are: - // "true", "false", "null", "in", "as", "break", "const", "continue", "else", "for", "function", "if", - // "import", "let", "loop", "package", "namespace", "return". - // Examples: - // - Expression accessing a property named "namespace": {"Expression": "object.__namespace__ > 0"} - // - Expression accessing a property named "x-prop": {"Expression": "object.x__dash__prop > 0"} - // - Expression accessing a property named "redact__d": {"Expression": "object.redact__underscores__d > 0"} - // - // Equality on arrays with list type of 'set' or 'map' ignores element order, i.e. [1, 2] == [2, 1]. - // Concatenation on arrays with x-kubernetes-list-type use the semantics of the list type: - // - 'set': `X + Y` performs a union where the array positions of all elements in `X` are preserved and - // non-intersecting elements in `Y` are appended, retaining their partial order. - // - 'map': `X + Y` performs a merge where the array positions of all keys in `X` are preserved but the values - // are overwritten by values in `Y` when the key sets of `X` and `Y` intersect. Elements in `Y` with - // non-intersecting keys are appended, retaining their partial order. // Required. Expression string `json:"expression"` } @@ -285,6 +264,17 @@ manipulating match rules, namespace selector, or object selector (or reroute the Currently the match conditions must be encoded in the webhook backend itself. Moving the logic into a CEL expression does not materially increase the risk of a logic bug. +Of particular significance are match conditions tied to non-functional properties of an object, such +as using labels to decide whether to opt an object out of a policy. Without additional admition +controls on who can set those non-functional aspects, exempting the policy based on that could be a +security vulnerability. In contrast, the +[NFS example usecase](#scope-an-nfs-access-management-webhook-to-pods-mounting-nfs-volumes) exempts +the policy on a _functional_ aspect - whether an NFS volume is mounted, and thus whether the policy +is relevant. + +These risks are inherent to the feature being proposed and cannot be mitigated through technical +means, but should be highlighted in the documentation. + #### Debugability We do not normally log, audit, or emit an event when a webhook is out-of-scope for a request, and From b5232768348a2337d30f2667b3a58cd3811c1edd Mon Sep 17 00:00:00 2001 From: Tim Allclair Date: Tue, 31 Jan 2023 15:36:58 -0800 Subject: [PATCH 11/13] Prod Readiness (alpha) --- .../sig-api-machinery/3716.yaml | 6 +++ .../3716-webhook-predicates/README.md | 47 ++++++------------- 2 files changed, 21 insertions(+), 32 deletions(-) create mode 100644 keps/prod-readiness/sig-api-machinery/3716.yaml diff --git a/keps/prod-readiness/sig-api-machinery/3716.yaml b/keps/prod-readiness/sig-api-machinery/3716.yaml new file mode 100644 index 00000000000..f9cb8aeb917 --- /dev/null +++ b/keps/prod-readiness/sig-api-machinery/3716.yaml @@ -0,0 +1,6 @@ +# The KEP must have an approver from the +# "prod-readiness-approvers" group +# of http://git.k8s.io/enhancements/OWNERS_ALIASES +kep-number: 3716 +alpha: + approver: "@deads2k" diff --git a/keps/sig-api-machinery/3716-webhook-predicates/README.md b/keps/sig-api-machinery/3716-webhook-predicates/README.md index a433f466e49..ac475c83938 100644 --- a/keps/sig-api-machinery/3716-webhook-predicates/README.md +++ b/keps/sig-api-machinery/3716-webhook-predicates/README.md @@ -507,47 +507,27 @@ This section must be completed when targeting alpha to a release. ###### How can this feature be enabled / disabled in a live cluster? - - -- [ ] Feature gate (also fill in values in `kep.yaml`) - - Feature gate name: - - Components depending on the feature gate: -- [ ] Other - - Describe the mechanism: - - Will enabling / disabling the feature require downtime of the control - plane? - - Will enabling / disabling the feature require downtime or reprovisioning - of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled). +- [X] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: `AdmissionWebhookMatchConditions` + - Components depending on the feature gate: `kube-apiserver` ###### Does enabling the feature change any default behavior? - +No. If the feature is enabled, but the `matchConditions` field is unset, the default behavior +remains unchanged. ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? - +Any `matchConditions` that were already stored on existing webhooks will be enforced. -###### What happens if we reenable the feature if it was previously rolled back? +Note: enabling `matchConditions` can only reduce the number of requests being sent to a webhook (or +remain unchanged). Enabling it will never increase the number of requests. ###### Are there any tests for feature enablement/disablement? @@ -564,6 +544,9 @@ You can take a look at one potential example of such test in: https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282 --> +[Registry tests](https://github.com/kubernetes/kubernetes/blob/c4ebbeeb747cd3e2b1d83733a14d367a65723a45/pkg/registry/core/pod/strategy_test.go) +will verify the drop disabled fields logic is correctly implemented. + ### Rollout, Upgrade and Rollback Planning +Downgrading in a way that disables match conditions after it is already in use can increase the +scope of requests evaluated by a webhook. See +[Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?](#can-the-feature-be-disabled-once-it-has-been-enabled-ie-can-we-roll-back-the-enablement) +for more details ### Version Skew Strategy - +The new field is only evaluated by the apiserver, so only HA apiserver version skew is relevant. In +this case, if the feature is enabled in one apiserver and not another, a request could +non-deterministically be sent to a webhook. Enabling match conditions without setting +`matchConditions` on an webhooks is a no-op, so the version skew non-determinism is best avoided by +waiting until it has been enabled in all apiservers before starting to use the new field. ## Production Readiness Review Questionnaire @@ -590,6 +579,14 @@ Even if applying deprecation policies, they may still surprise some users. ### Monitoring Requirements +A new per-webhook metric will measure the number of requests excluded by match conditions: + +Metric name: `webhook_admission_match_condition_exclusions_total` +Labels: +- `name`: webhook name +- `type`: `validate` or `admit` +- `operation`: the admission operation + +## Future Work + +### Cross-webhook match conditions + +In the future, we should explore ways to apply common match conditions across multiple webhooks. + +Example use cases: +- Apply a [break-glass exemption](#exempt-system-users-from-security-policy) across many (or all) webhooks. +- Managed cluster provider wants to exempt provider-managed resources from user-managed webhooks. + +Considerations: +- Access by managed cluster provider vs. cluster admin +- Side effects & mutations + ## Alternatives ### Exclusion Expressions