Commit 65c2cba

CNF-13731: Cert Manager HTTP01 Proxy
1 parent 995b620 commit 65c2cba

2 files changed

+107
-0
lines changed

@@ -0,0 +1,107 @@
1+
---
2+
title: http01-challenge-cert-manager-proxy
3+
authors:
4+
- "@sebrandon1"
5+
reviewers:
6+
- "@TrilokGeer"
7+
- "@swagosh"
8+
approvers:
9+
- "@tkashem"
10+
- "@deads2k"
11+
- "@derekwaynecarr"
12+
api-approvers:
13+
- "@JoelSpeed"
14+
creation-date: 2025-03-28
15+
last-updated: 2025-03-28
16+
status: implementable
17+
tracking-link:
18+
- https://issues.redhat.com/browse/CNF-13731
19+
---
20+
21+
# HTTP01 Challenge Proxy for Cert Manager
22+
23+
![HTTP01 Challenge Proxy Diagram](http01_challenge.png)
24+
25+
## Summary
26+
27+
This enhancement applies to baremetal platforms only. It provides a way for cert-manager to manage certificates for API endpoints (such as `api.cluster.example.com`), similar to the way it handles certificates for other OpenShift Ingress endpoints.
28+
29+
## Motivation
30+
31+
Cert manager can be used to issue certificates for the OpenShift Container Platform (OCP) endpoints (e.g., console, downloads, oauth) using an external ACME Certificate Authority (CA). These endpoints are exposed via the OpenShift Ingress (`*.apps.cluster.example.com`), and this is a supported and functional configuration today.
32+
33+
However, cluster administrators often want to issue custom certificates for the API endpoint (`api.cluster.example.com`). Unlike the other endpoints, the API endpoint is not exposed via the OpenShift Ingress. Depending on the OCP topology (e.g., SNO, MNO, Compact), it is exposed directly on the node or via a keepalived VIP. Because the OpenShift Ingress does not manage this endpoint, obtaining certificates for it from an external ACME CA is difficult.
34+
35+
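The gap arises from how the ACME HTTP01 challenge works: the CA validates control of a hostname by fetching a token over plain HTTP on port 80 at `/.well-known/acme-challenge/<token>`, so something on the host that the API hostname resolves to must answer that request. The following scenarios illustrate the challenges: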
36+
37+
1. **SNO (Single Node OpenShift)**: The same node hosts both the ingress and API components. Both FQDNs (`api` and wildcard) resolve to the same IP, making the challenge feasible.
38+
2. **Compact Clusters**: The node hosting the API VIP may also host an OpenShift Router. If no router is present on the node hosting the VIP, the challenge will fail.
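
Both scenarios reduce to one condition: the HTTP01 request for the API hostname lands on whichever node holds the API address, so it succeeds only if a router answers port 80 on that same node. The sketch below models that condition in Python. The node names and the `challenge_reachable` helper are hypothetical and exist only for illustration; they are not part of the proposal.

```python
def challenge_reachable(api_node: str, router_nodes: set[str]) -> bool:
    """Return True when a router runs on the node that holds the API address.

    Simplified model: the ACME CA connects to port 80 on the API address, so a
    router must be listening on that same node for the challenge to be served.
    """
    return api_node in router_nodes


# SNO: a single node hosts both the API and the router.
print(challenge_reachable("node-0", {"node-0"}))  # prints True

# Compact cluster: the VIP currently sits on a node without a router.
print(challenge_reachable("node-2", {"node-0", "node-1"}))  # prints False
```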
39+
40+
To address this gap, a small proxy was developed. This proxy runs on the cluster as a DaemonSet and ensures that connections reaching the API on port 80 are redirected to the OpenShift Ingress Routers. The proxy implementation uses `nftables` to redirect traffic from `API:80` to `PROXY:8888`.
41+
42+
- **Proxy Code**: [GitHub Repository](https://github.com/mvazquezc/cert-mgr-http01-proxy/tree/main)
43+
- **Deployment Manifest**: [Manifest Link](https://github.com/mvazquezc/cert-mgr-http01-proxy/blob/main/manifests/deploy-in-ocp.yaml)
44+
45+
This enhancement aims to provide a robust solution for managing certificates for the API endpoint in baremetal environments.
46+
47+
### User Stories
48+
49+
1. **As a cluster administrator**, I want to issue custom certificates for the API endpoint (`api.cluster.example.com`) using an external ACME CA, so that I can ensure secure communication for my cluster's API.
50+
2. **As a cluster administrator on a baremetal platform**, I want a reliable solution to handle HTTP01 challenges for the API endpoint, even when the endpoint is not managed by OpenShift Ingress, so that I can avoid manual workarounds.
51+
3. **As a developer**, I want a simple deployment mechanism for the HTTP01 challenge proxy, so that I can easily integrate it into my existing cluster setup.
52+
53+
### Proposal
54+
55+
The HTTP01 Challenge Proxy will be implemented as a DaemonSet running on the cluster. It will:
56+
57+
- Redirect HTTP traffic from the API endpoint (`api.cluster.example.com`) on port 80 to the OpenShift Ingress Routers.
58+
- Use `nftables` for traffic redirection from `API:80` to `PROXY:8888`.
59+
- Be deployed using a manifest that includes all necessary configurations.
60+
61+
The proxy will ensure compatibility with various OCP topologies, including SNO, MNO, and Compact clusters, addressing the challenges of HTTP01 validation for the API endpoint.
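
As a rough illustration of the forwarding step, the Python sketch below shows how a proxy listening on port 8888 might build the upstream request for an Ingress Router. It assumes the router selects the challenge route by the `Host` header, so that header is passed through unchanged. The router address `192.0.2.20:80` and the helper names are hypothetical and are not taken from the linked proxy repository.

```python
from urllib.parse import urlsplit, urlunsplit

# Headers that apply to a single connection and should not be forwarded.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-connection",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def upstream_url(request_target: str, router_addr: str) -> str:
    """Build the URL on the Ingress Router for an incoming request target."""
    parts = urlsplit(request_target)
    return urlunsplit(("http", router_addr, parts.path, parts.query, ""))


def upstream_headers(incoming: dict[str, str]) -> dict[str, str]:
    """Copy end-to-end headers, keeping Host so the router can match a route."""
    return {k: v for k, v in incoming.items() if k.lower() not in HOP_BY_HOP}


url = upstream_url("/.well-known/acme-challenge/abc123", "192.0.2.20:80")
headers = upstream_headers(
    {"Host": "api.cluster.example.com", "Connection": "keep-alive"}
)
print(url)
print(headers)
```

Keeping the original `Host` value is the design choice that matters here: the request arrives at the API address, but the router needs to see the API hostname to find the route that serves the token.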
62+
63+
### API Extensions
64+
65+
A new CR type may be created and applied to clusters. This new type will be stored in the [openshift/api](https://github.com/openshift/api) repo.
66+
67+
### Implementation Details/Notes/Constraints
68+
69+
- The proxy will be deployed as a DaemonSet to ensure it runs on all nodes in the cluster.
70+
- The nftables rules will be added to the nodes. The proxy will listen on port 8888 and redirect traffic to the OpenShift Ingress Routers.
71+
- The implementation relies on `nftables` for traffic redirection, which must be supported and enabled on the cluster nodes.
72+
- The demo deployment manifest for the proxy is available [here](https://github.com/mvazquezc/cert-mgr-http01-proxy/blob/main/manifests/deploy-in-ocp.yaml).
73+
- An example implementation can be found in this [repository](https://github.com/mvazquezc/cert-mgr-http01-proxy/tree/main).
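
To make the redirection concrete, the following Python sketch renders an `nftables` ruleset that redirects traffic arriving for the API address on port 80 to local port 8888. It is an assumption about what such a rule could look like, not a copy of the rules in the linked repository: the table name `http01_proxy`, the chain name, the priority value, and the example address `192.0.2.10` are all illustrative.

```python
import ipaddress


def render_redirect_ruleset(
    api_ip: str, proxy_port: int = 8888, table: str = "http01_proxy"
) -> str:
    """Render an nftables ruleset redirecting API:80 to the local proxy port."""
    addr = ipaddress.ip_address(api_ip)  # raises ValueError on malformed input
    if not 1 <= proxy_port <= 65535:
        raise ValueError(f"invalid port: {proxy_port}")
    family = "ip" if addr.version == 4 else "ip6"
    return (
        f"table {family} {table} {{\n"
        f"  chain prerouting {{\n"
        f"    type nat hook prerouting priority -100; policy accept;\n"
        f"    {family} daddr {addr} tcp dport 80 redirect to :{proxy_port}\n"
        f"  }}\n"
        f"}}\n"
    )


# Hypothetical API address; a real deployment would use the cluster's API VIP.
print(render_redirect_ruleset("192.0.2.10"))
```

A ruleset in this form could be written to a file and loaded with `nft -f`. Matching on the destination address keeps the rule from capturing port 80 traffic addressed to anything other than the API endpoint.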
74+
75+
### Design Details
76+
77+
- **Proxy Deployment**: The proxy will be deployed using a Kubernetes DaemonSet. The DaemonSet will apply the `nftables` rule via a pod that runs to completion.
78+
- **Traffic Redirection**: This will use `nftables` rules to redirect incoming traffic on `API:80` to `PROXY:8888`.
79+
- **Security**: The proxy will only handle HTTP traffic for the HTTP01 challenge and will not interfere with other traffic or services.
80+
- **Monitoring**: Logs and metrics will be exposed to help administrators monitor the proxy's behavior and troubleshoot issues.
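
One way to keep the proxy limited to HTTP01 traffic is to accept only request paths that look like ACME challenge requests. The Python sketch below checks for the `/.well-known/acme-challenge/` prefix followed by a token made of base64url characters, which is the character set ACME uses for tokens. The function name is hypothetical, and whether the proxy rejects or ignores other paths is left open by this proposal.

```python
import re

ACME_PREFIX = "/.well-known/acme-challenge/"
# ACME tokens use only the base64url alphabet (letters, digits, '-' and '_').
_TOKEN_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_http01_challenge_path(path: str) -> bool:
    """Return True only for paths shaped like an HTTP01 challenge request."""
    if not path.startswith(ACME_PREFIX):
        return False
    token = path[len(ACME_PREFIX):]
    return _TOKEN_RE.fullmatch(token) is not None


print(is_http01_challenge_path("/.well-known/acme-challenge/evaGxfADs6pSRb2LAv9IZ"))
print(is_http01_challenge_path("/healthz"))
```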
81+
82+
### Drawbacks
83+
84+
1. **Dependency on nftables**: The solution relies on `nftables`, which may not be available or enabled in all environments.
85+
2. **Additional Resource Usage**: Running the proxy as a DaemonSet introduces additional resource usage on the cluster nodes while the proxy pod is applying its `nftables` rules.
86+
3. **Complexity**: The solution adds another component to the cluster, which may increase operational complexity.
87+
88+
### Alternatives
89+
90+
None
91+
92+
### Risks and Mitigations
93+
94+
1. **Proxy Failure**: If the proxy fails, HTTP01 challenges for the API endpoint will not succeed. Mitigation: Use health checks and monitoring to ensure the proxy is running correctly.
95+
2. **Traffic Interference**: The proxy could inadvertently interfere with other traffic. Mitigation: Carefully scope the proxy's functionality to only handle HTTP01 challenge traffic.
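
The health-check mitigation could start as simply as a TCP probe against the proxy port. The Python sketch below is one minimal form of such a check, assuming the proxy listens on port 8888 as described above. The function name is hypothetical, and a production check would more likely be a Kubernetes liveness or readiness probe defined on the pod.

```python
import socket


def proxy_is_listening(host: str, port: int = 8888, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the proxy port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print(proxy_is_listening("127.0.0.1"))
```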
96+
97+
### Implementation History
98+
99+
- **2025-03-28**: Enhancement proposal created.
100+
101+
### References
102+
103+
- [Cert Manager Expansion JIRA Epic](https://issues.redhat.com/browse/CNF-13731)
104+
- [ACME HTTP01 Challenge](https://letsencrypt.org/docs/challenge-types/#http-01-challenge)
105+
- [Proxy Code Repository](https://github.com/mvazquezc/cert-mgr-http01-proxy/tree/main)
106+
- [Deployment Manifest](https://github.com/mvazquezc/cert-mgr-http01-proxy/blob/main/manifests/deploy-in-ocp.yaml)
107+
