diff --git a/.chloggen/migration-operator-helm-certs.yaml b/.chloggen/migration-operator-helm-certs.yaml
index e9937ca69..d49004b11 100644
--- a/.chloggen/migration-operator-helm-certs.yaml
+++ b/.chloggen/migration-operator-helm-certs.yaml
@@ -10,6 +10,6 @@ issues: [1648]
 # These lines will be padded with 2 spaces and then inserted directly into the document.
 # Use pipe (|) for multiline entries.
 subtext: |
-  - For users enabling both the operator and certmanager (.Values.operator.enabled=true, .Values.certmanager.enabled=true), please review the [upgrade guidelines](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#0119-to-0120).
-  - Previously, certificates were generated by certmanager by default; now they are generated by Helm unless specified otherwise.
-  - This change simplifies setup for new users while still supporting those who prefer certmanager.
+  - Previously, certificates were generated by cert-manager by default; now they are generated by Helm templates unless configured otherwise.
+  - This change simplifies the setup for new users while still supporting those who prefer using cert-manager or other solutions. For more details, see the [related documentation](https://github.com/signalfx/splunk-otel-collector-chart/tree/main/docs/auto-instrumentation-install.md#tls-certificate-requirement-for-kubernetes-operator-webhooks).
+  - If you use `.Values.operator.enabled=true` and `.Values.certmanager.enabled=true`, please review the [upgrade guidelines](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#0119-to-0120).
diff --git a/docs/auto-instrumentation-install.md b/docs/auto-instrumentation-install.md
index b4e6973ed..108ebda1d 100644
--- a/docs/auto-instrumentation-install.md
+++ b/docs/auto-instrumentation-install.md
@@ -458,84 +458,13 @@ helm template splunk-otel-collector-chart/splunk-otel-collector --include-crds \
 | kubectl delete --dry-run=client -f -
 ```
 
-### Documentation Resources
-
-- https://developers.redhat.com/devnation/tech-talks/using-opentelemetry-on-kubernetes
-- https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md
-- https://github.com/open-telemetry/opentelemetry-operator/blob/main/docs/api.md#instrumentation
-- https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md#opentelemetry-auto-instrumentation-injection
-- https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md#use-customized-or-vendor-instrumentation
-
-### Troubleshooting the Operator and Cert Manager
-
-#### Check the logs for failures
-
-**Operator Logs:**
-
-```bash
-kubectl logs -l app.kubernetes.io/name=operator
-```
-
-**Cert-Manager Logs:**
-
-```bash
-kubectl logs -l app=certmanager
-kubectl logs -l app=cainjector
-kubectl logs -l app=webhook
-```
-
-#### Operator Issues
-
-##### Networking and Firewall Requirements
-
-Ensure the Mutating Webhook used by the operator for pod auto-instrumentation is not hindered by network policies or firewall rules. Key points to ensure:
-
-- **Webhook Accessibility**: The webhook must freely communicate with the cluster IP and the Kubernetes API server. Ensure network policies or firewall rules permit operator-related services to interact with these endpoints.
-- **Required Ports**: Policies should explicitly allow traffic to the necessary ports for seamless operation.
-
-Use the following command to identify the IP addresses and ports that need to be accessible:
-
-```bash
-kubectl get svc -n {operator_namespace}
-# Example output indicating necessary IP and port configurations:
-# NAME                                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                            AGE
-# kubernetes                                      ClusterIP   10.0.0.1       <none>        443/TCP                            10d
-# splunk-splunk-otel-collector-agent              ClusterIP   10.0.176.113   <none>        8006/TCP,14250/TCP,14268/TCP,...   3d17h
-# splunk-splunk-otel-collector-operator           ClusterIP   10.0.254.125   <none>        8443/TCP,8080/TCP                  3d17h
-# splunk-splunk-otel-collector-operator-webhook   ClusterIP   10.0.222.223   <none>        443/TCP                            3d17h
-```
-
-- **Configuration Action**: Adjust your network policies and firewall settings based on the service endpoints and ports listed by the command. This ensures the webhook and operator services can properly communicate within the cluster.
-
-#### Cert-Manager Issues
-
-If the operator seems to be hanging, it could be due to the cert-manager not auto-creating the required certificate. To troubleshoot:
-
-- Check the health and logs of the cert-manager pods for potential issues.
-- Consider restarting the cert-manager pods.
-- Ensure that your cluster has only one instance of cert-manager, which should include `certmanager`, `certmanager-cainjector`, and `certmanager-webhook`.
-
-For additional guidance, refer to the official cert-manager documentation:
-- [Troubleshooting Guide](https://cert-manager.io/docs/troubleshooting/)
-- [Uninstallation Guide](https://cert-manager.io/v1.2-docs/installation/uninstall/kubernetes/)
-
-##### Validate Certificates
-
-Ensure that the certificate, which the cert-manager creates and the operator utilizes, is available.
-
-```bash
-kubectl get certificates
-# NAME                                          READY   SECRET                                                           AGE
-# splunk-otel-collector-operator-serving-cert   True    splunk-otel-collector-operator-controller-manager-service-cert   5m
-```
-
-#### TLS Certificate Requirement for Kubernetes Operator Webhooks
+### TLS Certificate Requirement for Kubernetes Operator Webhooks
 
 In Kubernetes, the API server communicates with operator webhook components over HTTPS, which requires a valid TLS certificate that the API server trusts. The operator supports several methods for configuring the required certificate, each with different levels of complexity and security.
 
 ---
 
-##### 1. **Using a Self-Signed Certificate Generated by the Chart**
+#### 1. **Using a Self-Signed Certificate Generated by the Chart**
 
 This is the default and simplest method for generating a TLS certificate. It automatically creates a self-signed certificate for the webhook. It is suitable for internal environments or testing purposes but may not be trusted by clients outside your cluster.
 
@@ -550,11 +479,11 @@ This is the easiest setup for users and does not require additional configuratio
 
 ---
 
-##### 2. **Using a cert-manager Certificate**
+#### 2. **Using a cert-manager Certificate**
 
 Using `cert-manager` offers more control over certificate management and is more suitable for production environments. However, due to Helm’s install/upgrade order of operations, cert-manager CRDs and certificates cannot be installed within the same Helm operation. To work around this limitation, you can choose one of the following options:
 
-###### Option 1: **Pre-deploy cert-manager**
+##### Option 1: **Pre-deploy cert-manager**
 
 If `cert-manager` is already deployed in your cluster, you can configure the operator to use it without enabling certificate generation by Helm.
 
@@ -568,7 +497,7 @@ operator:
 enabled: false
 ```
 
-###### Option 2: **Deploy cert-manager and the operator together**
+##### Option 2: **Deploy cert-manager and the operator together**
 
 If you need to install `cert-manager` along with the operator, use a Helm post-install or post-upgrade hook to ensure that the certificate is created after cert-manager CRDs are installed.
 
@@ -593,7 +522,7 @@ This method is useful when installing `cert-manager` as a subchart or as part of
 
 ---
 
-##### 3. **Using a Custom Externally Generated Certificate**
+#### 3. **Using a Custom Externally Generated Certificate**
 
 For full control, you can use an externally generated certificate. This is suitable if you already have a certificate issued by a trusted CA or have specific security requirements.
 
@@ -619,3 +548,29 @@ This method allows you to use a certificate that is trusted by external systems,
 ---
 
 For more advanced use cases, refer to the [official Helm chart documentation](https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-operator/values.yaml) for detailed configuration options and scenarios.
+
+### Troubleshooting the Operator and Cert Manager
+
+#### Check the logs for failures
+
+**Operator Logs:**
+
+```bash
+kubectl logs -l app.kubernetes.io/name=operator
+```
+
+**Cert-Manager Logs:**
+
+```bash
+kubectl logs -l app=certmanager
+kubectl logs -l app=cainjector
+kubectl logs -l app=webhook
+```
+
+### Documentation Resources
+
+- https://developers.redhat.com/devnation/tech-talks/using-opentelemetry-on-kubernetes
+- https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md
+- https://github.com/open-telemetry/opentelemetry-operator/blob/main/docs/api.md#instrumentation
+- https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md#opentelemetry-auto-instrumentation-injection
+- https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md#use-customized-or-vendor-instrumentation