doc(KDP): add policy status KDP
Signed-off-by: Khaled Emara <[email protected]>
KhaledEmaraDev committed Dec 10, 2024
1 parent e00d1c9 commit 3edfd74
170 changes: 170 additions & 0 deletions proposals/policy_status.md
@@ -0,0 +1,170 @@
# Meta

[meta]: #meta

- Name: Kyverno Policy Status Readiness Evaluation
- Start Date: 2024-12-10
- Author(s): @KhaledEmaraDev

# Table of Contents

[table-of-contents]: #table-of-contents

- [Meta](#meta)
- [Table of Contents](#table-of-contents)
- [Overview](#overview)
- [Definitions](#definitions)
- [Motivation](#motivation)
- [Proposal](#proposal)
- [Implementation](#implementation)
- [Migration](#migration)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Unresolved Questions](#unresolved-questions)
- [CRD Changes](#crd-changes)

# Overview

[overview]: #overview

This KDP proposes a comprehensive status reporting mechanism for Kyverno Policies. The status will reflect the operational readiness of a policy, considering factors like webhook configuration, caching, RBAC permissions, and schema validation. This allows users to quickly identify and diagnose issues preventing their policies from functioning correctly.

# Definitions

[definitions]: #definitions

- **Policy**: Custom resource representing a Kyverno policy configuration
- **Webhook**: Kubernetes admission controller mechanism for intercepting and potentially modifying resource requests
- **Validating Webhook**: Kubernetes webhook that validates resource configurations
- **Mutating Webhook**: Kubernetes webhook that can modify resource configurations before admission
- **Policy Cache**: An internal Kyverno mechanism that stores policies for quick reference and processing
- **RBAC**: Role-Based Access Control, determining permissions for Kubernetes resources
- **Policy Status**: Current operational condition of a Kyverno policy (Ready, Partially Ready, Not Ready)
- **Schema Validation**: The process of verifying that a policy definition conforms to the expected schema

# Motivation

[motivation]: #motivation

- The current implementation of Kyverno policies can lead to confusion and frustration when policies are not applied as expected
- Provide clear, granular visibility into policy operational status
- Enable administrators to quickly understand policy deployment challenges
- Create a robust mechanism for tracking policy readiness across complex Kubernetes environments
- Support effective troubleshooting of policy configuration issues

# Proposal

The policy status is determined by four evaluation criteria, each of which has its own status condition:

1. **Webhook Configuration Validation**

- Policies define rules that are configured in either validating or mutating webhooks.
- If an error occurs during webhook configuration, the policy will be Not Ready if it solely relies on the failed webhook type.
- If the other webhook type is successfully configured, the policy can be marked Partially Ready.
- Policies are marked Ready only when all required webhooks are configured without error.

2. **Policy Caching Verification**

- A policy must exist in the Policy cache of at least the leader replica to be considered Ready.
- A policy that is missing from the cache is Not Ready.

3. **RBAC Permission Verification**

- Policies requiring permissions not granted to the Admission Controller will be Not Ready.
- Detailed feedback should be provided about the missing permissions.

4. **Schema Validation**

- If a policy fails schema validation, it is considered Not Ready.
- Specific validation errors must be logged to guide resolution.


# Implementation

## Detailed Readiness Evaluation

### 1. Webhook Configuration Validation

- Separately track validating and mutating webhook configurations
- Rules for status:
  - Both webhooks configured successfully: **Ready**
  - One webhook fails configuration:
    - If policy requires only that webhook type: **Not Ready**
    - If policy can function with alternative webhook: **Partially Ready**
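The webhook rules above can be sketched in Go as follows. This is a minimal illustration, not Kyverno's implementation: the `Status` and `WebhookState` types and the `webhookStatus` function are hypothetical names, and the sketch assumes that a webhook type the policy does not require never affects the result.

```go
package main

import "fmt"

// Status is the readiness contributed by webhook configuration.
type Status string

const (
	Ready          Status = "Ready"
	PartiallyReady Status = "Partially Ready"
	NotReady       Status = "Not Ready"
)

// WebhookState is a hypothetical record for one webhook type: whether the
// policy needs it and whether it was configured without error.
type WebhookState struct {
	Required   bool
	Configured bool
}

// webhookStatus applies the rules above:
//   - every required webhook configured      -> Ready
//   - no required webhook configured         -> Not Ready
//   - some, but not all, required configured -> Partially Ready
//
// A policy that requires no webhook at all is reported as Ready (assumption).
func webhookStatus(validating, mutating WebhookState) Status {
	required, working := 0, 0
	for _, w := range []WebhookState{validating, mutating} {
		if !w.Required {
			continue
		}
		required++
		if w.Configured {
			working++
		}
	}
	switch {
	case working == required:
		return Ready
	case working == 0:
		return NotReady
	default:
		return PartiallyReady
	}
}

func main() {
	validating := WebhookState{Required: true, Configured: true}
	mutating := WebhookState{Required: true, Configured: false}
	fmt.Println(webhookStatus(validating, mutating))
}
```

Counting required and working webhooks keeps the function symmetric in the two webhook types, so the same rule covers a failed validating webhook and a failed mutating webhook.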

### 2. Policy Caching

- Check cache presence on:
  - Leader replica (mandatory)
  - All other replicas (optional)
- Failure to cache on leader results in **Not Ready** status

### 3. RBAC Permission Verification

- Dynamically inspect required vs. available permissions
- Insufficient permissions trigger **Not Ready** status
- Comprehensive permission mapping required
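One way to picture the required-versus-available comparison is a set difference that also produces the detailed feedback. The sketch below is illustrative only: `Permission`, `missingPermissions`, and `rbacMessage` are hypothetical names, and it deliberately ignores API groups, namespaces, and wildcard rules, which a real check would have to handle.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Permission is a hypothetical (verb, resource) pair, e.g. {"create", "configmaps"}.
type Permission struct {
	Verb     string
	Resource string
}

// missingPermissions returns every required permission that is not granted,
// sorted by resource and then verb so that messages are stable.
func missingPermissions(required, granted []Permission) []Permission {
	have := make(map[Permission]struct{}, len(granted))
	for _, p := range granted {
		have[p] = struct{}{}
	}
	var missing []Permission
	for _, p := range required {
		if _, ok := have[p]; !ok {
			missing = append(missing, p)
		}
	}
	sort.Slice(missing, func(i, j int) bool {
		if missing[i].Resource != missing[j].Resource {
			return missing[i].Resource < missing[j].Resource
		}
		return missing[i].Verb < missing[j].Verb
	})
	return missing
}

// rbacMessage builds the text for the condition's message field.
func rbacMessage(missing []Permission) string {
	if len(missing) == 0 {
		return "all required permissions are granted"
	}
	parts := make([]string, len(missing))
	for i, p := range missing {
		parts[i] = p.Verb + " " + p.Resource
	}
	return "missing permissions: " + strings.Join(parts, ", ")
}

func main() {
	required := []Permission{{"create", "configmaps"}, {"get", "secrets"}}
	granted := []Permission{{"get", "secrets"}}
	// Prints: missing permissions: create configmaps
	fmt.Println(rbacMessage(missingPermissions(required, granted)))
}
```

An empty result would map to a passing RBAC condition; any non-empty result would map to **Not Ready**, with the message naming exactly what is missing.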

### 4. Schema Validation

- Perform exhaustive schema validation during policy admission
- Any schema validation error results in **Not Ready** status
- Provide detailed error messages for troubleshooting

## Proposed Status Transition Matrix

| Condition                 | Validating WH | Mutating WH | Cache | RBAC | Schema | Status                    |
| ------------------------- | ------------- | ----------- | ----- | ---- | ------ | ------------------------- |
| All Conditions Successful | Pass          | Pass        | Pass  | Pass | Pass   | Ready                     |
| Validating WH Fails       | Fail          | Pass        | Pass  | Pass | Pass   | Not Ready                 |
| Mutating WH Fails         | Pass          | Fail        | Pass  | Pass | Pass   | Not Ready/Partially Ready |
| Cache Missing             | Pass          | Pass        | Fail  | Pass | Pass   | Not Ready                 |
| RBAC Insufficient         | Pass          | Pass        | Pass  | Fail | Pass   | Not Ready                 |
| Schema Invalid            | Pass          | Pass        | Pass  | Pass | Fail   | Not Ready                 |
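The matrix can be read as a worst-of aggregation: any failed cache, RBAC, or schema check forces **Not Ready**, and otherwise the webhook result decides. A minimal Go sketch under that reading follows; `Checks` and `overallStatus` are hypothetical names, and the webhook result is assumed to have been computed separately.

```go
package main

import "fmt"

// Status is the overall readiness of a policy.
type Status string

const (
	Ready          Status = "Ready"
	PartiallyReady Status = "Partially Ready"
	NotReady       Status = "Not Ready"
)

// Checks is a hypothetical summary of the four evaluation criteria.
type Checks struct {
	Webhook     Status // result of the webhook evaluation
	Cached      bool   // policy present in the leader's policy cache
	RBACGranted bool   // all required permissions are granted
	SchemaValid bool   // policy passed schema validation
}

// overallStatus returns Not Ready if any boolean check failed; otherwise the
// webhook result (Ready, Partially Ready, or Not Ready) is the overall status.
func overallStatus(c Checks) Status {
	if !c.Cached || !c.RBACGranted || !c.SchemaValid {
		return NotReady
	}
	return c.Webhook
}

func main() {
	c := Checks{Webhook: PartiallyReady, Cached: true, RBACGranted: true, SchemaValid: true}
	fmt.Println(overallStatus(c))
}
```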

# Migration

Automation that tracks policy readiness will need to inspect four separate conditions, rather than a single one, to determine whether a policy is ready.
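As a rough illustration of what such automation might look like, the Go sketch below checks the four condition types named in the CRD Changes section. The `Condition` struct and `failingConditions` function are hypothetical, and the sketch assumes that a condition is satisfied only when its status is the string `"True"`; a missing condition is treated as failing.

```go
package main

import "fmt"

// Condition mirrors the fields proposed for status.conditions (hypothetical type).
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// requiredConditions lists the condition types proposed in this document.
var requiredConditions = []string{
	"WebhookConfigured",
	"CachePresence",
	"RBACPermission",
	"SchemaValid",
}

// failingConditions returns the required condition types that are either
// absent or not "True". An empty result means the policy is ready.
func failingConditions(conds []Condition) []string {
	status := make(map[string]string, len(conds))
	for _, c := range conds {
		status[c.Type] = c.Status
	}
	var failing []string
	for _, t := range requiredConditions {
		if status[t] != "True" {
			failing = append(failing, t)
		}
	}
	return failing
}

func main() {
	conds := []Condition{
		{Type: "WebhookConfigured", Status: "True"},
		{Type: "CachePresence", Status: "True"},
		{Type: "RBACPermission", Status: "False", Reason: "MissingPermissions"},
	}
	// RBACPermission is False and SchemaValid is absent, so both are reported.
	fmt.Println(failingConditions(conds))
}
```

Returning the failing types, rather than a single boolean, lets the caller report which criterion blocks readiness.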

# Drawbacks

- Increased complexity in policy status tracking
- Potential performance overhead from comprehensive validation

# Alternatives

- Simplified status tracking with fewer criteria
- Binary (Ready/Not Ready) instead of three-state status
- Consolidated status tracking for all validation dimensions

# Unresolved Questions

- Should additional conditions (e.g., external dependency checks) be included in readiness evaluation?
- What is the acceptable delay for updating readiness status under high load?

# CRD Changes

Updates to Policy CRD to include:

- New `status.conditions` fields for tracking webhook, cache, RBAC, and schema validation statuses

```yaml
status:
  conditions:
    - type: WebhookConfigured
      status: True|False
      reason: <Reason>
      message: <Detailed message>
    - type: CachePresence
      status: True|False
      reason: <Reason>
      message: <Detailed message>
    - type: RBACPermission
      status: True|False
      reason: <Reason>
      message: <Detailed message>
    - type: SchemaValid
      status: True|False
      reason: <Reason>
      message: <Detailed message>
```
