doc(KDP): add policy status KDP
Signed-off-by: Khaled Emara <[email protected]>
KhaledEmaraDev committed Dec 17, 2024
1 parent e00d1c9 commit a00a3e2
Showing 1 changed file with 136 additions and 0 deletions.
136 changes: 136 additions & 0 deletions proposals/policy_status.md
@@ -0,0 +1,136 @@
**Meta**

- **Name:** Kyverno Policy Status Readiness Evaluation
- **Start Date:** 2024-12-10
- **Author(s):** @KhaledEmaraDev

**Table of Contents**

- [Meta](#meta)
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [What's the Problem?](#problem)
- [Proposed Solution](#solution)
- [How It Works](#how-it-works)
- [Status Conditions: Single vs. Multiple](#status-conditions)
  - [Option 1: Single Overall Status](#option1)
  - [Option 2: Multiple Granular Statuses](#option2)
- [Benefits](#benefits)
- [What's Changed?](#changes)
- [Things to Consider](#consider)
- [Next Steps](#next-steps)
- [CRD Changes](#crd-changes)

**Introduction**

This document outlines a proposal to enhance the way Kyverno reports the operational status of your policies. Currently, it can be tricky to know if a policy is working correctly, leading to confusion and unexpected results. We're proposing a new system to provide clear, actionable information about a policy's readiness, empowering you to troubleshoot issues quickly.

**What's the Problem?**

When a Kyverno policy isn't working as expected, figuring out why can be difficult. Is the policy not loaded into memory? Are there problems with its configuration? Does the Kyverno service have the permissions it needs? Without clear status information, these issues can be hard to pinpoint and fix.

**Proposed Solution**

Our proposed solution provides detailed feedback about the different factors that affect policy readiness. This would be reflected in the policy's status, allowing users to quickly identify problems such as:

- **Policy Configuration:** Is the policy configured correctly for webhooks?
- **Policy Loading:** Has the policy been loaded into Kyverno's internal memory?
- **Permissions:** Does Kyverno have the necessary permissions to enforce the policy?
- **Policy Definition:** Is the policy definition valid according to Kyverno's rules?

This information would be available directly in the Policy's status field, which can be inspected with `kubectl describe policy <policy-name>`.

**How It Works**

We will monitor the policy status by checking the following factors:

1. **Webhook Configuration:** Kyverno uses webhooks to intercept resource requests. We'll track if these are set up correctly.
2. **Policy Cache:** Kyverno stores policies in an internal cache for performance. We'll make sure the policy is present in the cache.
3. **Permissions (RBAC):** Kyverno requires specific permissions to function correctly. We'll check if those are granted for the policy.
4. **Schema Validation:** The policy's definition must be valid. We'll check for any structural or logical errors in its configuration.

**Status Conditions: Single vs. Multiple**

We propose two options for reporting the status of these factors. Let's break down each approach:

<a name="option1"></a>
**Option 1: Single Overall Status**

This option provides a single "Ready" status that reflects the overall health of a policy. If everything is working, the policy is considered "Ready." If any of the factors above (webhook configuration, cache, permissions, or schema validation) has a problem, the status is set to `False`, with a reason and message that explain the cause.

- **Pros:** Simpler for users to understand at a glance. Easy to use in automation scripts or tools that only need a basic yes/no.
- **Cons:** Less granular troubleshooting information. Users will need to examine the message to figure out where to investigate further.
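
To illustrate how Option 1 might collapse the individual factors into one condition, here is a small sketch. It assumes the per-factor results are already available as a slice of a locally defined `Condition` type; `Summarize` and the `...CheckFailed` reason naming are hypothetical, and any factor that is not `True` (including `Unknown`) is treated as not ready because Option 1 only allows `True` or `False`.

```go
package main

import "fmt"

// Condition is a local stand-in for a Kubernetes-style status condition
// (an assumption for this sketch, not Kyverno's actual type).
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// Summarize folds per-factor results into a single "Ready" condition.
// The first factor that is not "True" supplies the reason and message.
func Summarize(factors []Condition) Condition {
	for _, f := range factors {
		if f.Status != "True" {
			return Condition{
				Type:    "Ready",
				Status:  "False",
				Reason:  f.Type + "CheckFailed", // hypothetical reason naming
				Message: f.Message,
			}
		}
	}
	return Condition{Type: "Ready", Status: "True", Reason: "Succeeded"}
}

func main() {
	factors := []Condition{
		{Type: "WebhookConfigured", Status: "True"},
		{Type: "RBACPermissions", Status: "False", Message: "missing permission (example)"},
	}
	ready := Summarize(factors)
	fmt.Printf("%s=%s reason=%s message=%q\n", ready.Type, ready.Status, ready.Reason, ready.Message)
}
```

The sketch also shows the main cost of this option: only the first failing factor is reported, so a second problem stays hidden until the first one is fixed.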

<a name="option2"></a>
**Option 2: Multiple Granular Statuses**

This option provides a dedicated status for each factor, allowing users to pinpoint the exact cause of any problems:

- `WebhookConfigured`: True if the webhooks are configured as expected.
- `CachePresence`: True if the policy is present in Kyverno's cache.
- `RBACPermissions`: True if Kyverno has all required permissions to apply this policy.
- `SchemaValid`: True if the policy's definition is valid.

Each condition can be `True`, `False`, or `Unknown`. A message associated with each condition will give more details if the condition is not `True`.

- **Pros:** Highly detailed, which allows for targeted troubleshooting. A user or automation tool can check only `CachePresence` to determine whether the policy was loaded, or only `RBACPermissions` to determine whether Kyverno has the required permissions.
- **Cons:** More complex to interpret at a glance. Users will need to look at multiple statuses instead of one.
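
A consumer interested in a single factor could look it up by type, as in the sketch below. The `Condition` type is again a local stand-in, and `Find` and `IsTrue` are hypothetical helpers written for this example; a missing condition is treated the same as one that is not `True`.

```go
package main

import "fmt"

// Condition is a local stand-in for a Kubernetes-style status condition
// (an assumption for this sketch, not Kyverno's actual type).
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// Find returns the condition with the given type, if present.
func Find(conds []Condition, condType string) (Condition, bool) {
	for _, c := range conds {
		if c.Type == condType {
			return c, true
		}
	}
	return Condition{}, false
}

// IsTrue reports whether the named condition exists and has status "True".
func IsTrue(conds []Condition, condType string) bool {
	c, ok := Find(conds, condType)
	return ok && c.Status == "True"
}

func main() {
	conds := []Condition{
		{Type: "WebhookConfigured", Status: "True"},
		{Type: "CachePresence", Status: "True"},
		{Type: "RBACPermissions", Status: "False", Message: "missing permission (example)"},
		{Type: "SchemaValid", Status: "Unknown"},
	}
	fmt.Println("policy loaded:", IsTrue(conds, "CachePresence"))
	if c, ok := Find(conds, "RBACPermissions"); ok && c.Status != "True" {
		fmt.Println("permissions problem:", c.Message)
	}
}
```

In a controller built on the Kubernetes Go libraries, `FindStatusCondition` and `IsStatusConditionTrue` from `k8s.io/apimachinery/pkg/api/meta` serve the same purpose for `metav1.Condition` values.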

**We recommend Option 2 because it is more flexible and lets users focus on only the conditions that are relevant to their needs.**

**Benefits**

- **Easy to Understand:** Clear status indicators show if policies are ready.
- **Faster Troubleshooting:** Specific error messages guide users to the root cause of problems.
- **Improved Reliability:** Reduces policy deployment failures by giving early warnings about issues.
- **Better Control:** Allows you to monitor the Kyverno service as a whole and identify operational errors.

**What's Changed?**

This proposal introduces new fields in the Policy status to track readiness and provides more detailed status messages. As a result, automated tools that monitor policy status would need to be updated to read either the new granular conditions (Option 2) or the single `Ready` condition (Option 1).
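
As a rough idea of what such a tool update might involve, the sketch below decodes a policy's JSON representation and lists every condition that is not `True`. It assumes the conditions are serialized under `status.conditions` with `type`, `status`, `reason`, and `message` keys, matching the YAML in the CRD Changes section; the `policy` struct, the `NotReady` function, and the sample document are hypothetical. The same code would handle either option, since a single `Ready` condition is just a list of length one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// policy captures only the part of the object this sketch needs.
// The field layout is an assumption based on the proposed status shape.
type policy struct {
	Status struct {
		Conditions []struct {
			Type    string `json:"type"`
			Status  string `json:"status"`
			Reason  string `json:"reason"`
			Message string `json:"message"`
		} `json:"conditions"`
	} `json:"status"`
}

// NotReady returns the types of all conditions whose status is not "True".
func NotReady(doc []byte) ([]string, error) {
	var p policy
	if err := json.Unmarshal(doc, &p); err != nil {
		return nil, err
	}
	var failing []string
	for _, c := range p.Status.Conditions {
		if c.Status != "True" {
			failing = append(failing, c.Type)
		}
	}
	return failing, nil
}

func main() {
	// Sample input; in practice this could come from the Kubernetes API.
	doc := []byte(`{"status":{"conditions":[
		{"type":"WebhookConfigured","status":"True"},
		{"type":"RBACPermissions","status":"False","reason":"ExampleReason","message":"example message"}
	]}}`)
	failing, err := NotReady(doc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("conditions needing attention:", failing)
}
```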

**Next Steps**

Before implementing this proposal, we need to discuss the following:

- **Which status condition model to choose:** Should we use the single overall status or the more detailed multiple status approach?
- **Testing:** How will we test this functionality thoroughly?
- **Performance:** How will we make sure performance is not impacted?

**CRD Changes**

The Kyverno `Policy` CRD's status section will be updated to include the new status conditions using either a single status object or multiple status objects as described above.

**Option 1 - Single Status Condition:**

```yaml
status:
  conditions:
    - type: Ready
      status: True|False
      reason: <Reason>
      message: <Detailed message>
```

**Option 2 - Multiple Status Conditions:**

```yaml
status:
  conditions:
    - type: WebhookConfigured
      status: True|False|Unknown
      reason: <Reason>
      message: <Detailed message>
    - type: CachePresence
      status: True|False|Unknown
      reason: <Reason>
      message: <Detailed message>
    - type: RBACPermissions
      status: True|False|Unknown
      reason: <Reason>
      message: <Detailed message>
    - type: SchemaValid
      status: True|False|Unknown
      reason: <Reason>
      message: <Detailed message>
```
