|
| 1 | +--- |
| 2 | +title: workload_identity_support_for_azure_monitor_logs |
| 3 | +authors: |
| 4 | + - "@calee" |
| 5 | +reviewers: |
| 6 | + - "@jcantrill" |
| 7 | + - "@alanconway" |
| 8 | +approvers: |
| 9 | + - "@jcantrill" |
| 10 | + - "@alanconway" |
| 11 | +api-approvers: |
| 12 | + - "@jcantrill" |
| 13 | + - "@alanconway" |
| 14 | +creation-date: 2025-04-30 |
| 15 | +last-updated: 2025-04-30 |
| 16 | +status: implementable |
| 17 | +tracking-link: |
| 18 | + - https://issues.redhat.com/browse/LOG-4782 |
| 19 | +--- |
| 20 | + |
| 21 | +# Workload Identity Support for Azure Monitor Logs |
| 22 | + |
| 23 | +## Release Sign-off Checklist |
| 24 | + |
| 25 | +- [ ] Enhancement is `implementable` |
| 26 | +- [ ] Design details are appropriately documented from clear requirements |
| 27 | +- [ ] Test plan is defined |
| 28 | +- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/) |
| 29 | + |
| 30 | +## Summary |
| 31 | + |
| 32 | +[Azure Monitor Logs](https://learn.microsoft.com/en-us/azure/azure-monitor/logs/data-platform-logs) is a comprehensive service provided by Microsoft Azure that enables the collection, analysis, and actioning of telemetry data across various Azure and on-premises resources. |
| 33 | + |
| 34 | +This proposal enhances the Azure Monitor Logs integration by implementing secure, short-lived authentication with [Microsoft Entra Workload Identity (WID)](https://learn.microsoft.com/en-us/entra/workload-id/workload-identities-overview) through federated tokens. The update builds on a pending upstream Vector PR, [azure_logs_ingestion feature](https://github.com/vectordotdev/vector/pull/22912), which uses the new Log Ingestion API. |
| 35 | + |
| 36 | +## Motivation |
| 37 | + |
| 38 | +The current Azure Monitor Logs integration relies on long-lived credentials ([shared_key](https://learn.microsoft.com/en-us/previous-versions/azure/azure-monitor/logs/data-collector-api?tabs=powershell#authorization)) and a deprecated [API](https://learn.microsoft.com/en-us/previous-versions/azure/azure-monitor/logs/data-collector-api), which poses potential security risks. By adopting Microsoft Entra Workload Identity (WID), this enhancement provides secure, short-lived credential access to Azure Monitor Logs and removes reliance on a deprecated, soon-to-be-retired API. |
| 39 | + |
| 40 | +### User Stories |
| 41 | + |
| 42 | +- As an administrator, I want to be able to forward logs from my OpenShift cluster to Azure Monitor Logs using federated tokens, removing the need for long-lived, static credentials. |
| 43 | + |
| 44 | +### Goals |
| 45 | + |
| 46 | +- Enable Vector's Azure Monitor Logs sink to authenticate with short-lived federated token credentials. |
| 47 | + |
| 48 | +### Non-Goals |
| 49 | + |
| 50 | +- Supporting authentication other than short-lived federated token credentials. |
| 51 | + |
| 52 | +## Proposal |
| 53 | + |
| 54 | +To realize the goals of this enhancement: |
| 55 | + |
| 56 | +- Switch over to the new Azure Log Ingestion Vector sink when implemented. |
| 57 | + - See #1 in [implementation details](#implementation-detailsnotesconstraints) section. |
| 58 | +- Update Vector's Rust [Azure Identity](https://github.com/Azure/azure-sdk-for-rust/tree/main/sdk/identity/azure_identity) client library to `v0.23.0`. |
| 59 | + - See #2 in [implementation details](#implementation-detailsnotesconstraints) section. |
| 60 | +- Extend Vector's Azure Log Ingestion sink to accept additional configuration for workload identity authentication. |
| 61 | +- Extend the ClusterLogForwarder’s Azure Monitor integration to support the required fields of the Log Ingestion API, including additional authentication fields for workload identity. |
| 62 | + |
| 63 | +### Workflow Description |
| 64 | + |
| 65 | +The Vector collector will: |
| 66 | + |
| 67 | +1. Determine the authentication type using a configurable field (`credential_kind`). |
| 68 | +2. Read the OpenShift service account token from the local volume. |
| 69 | +3. Exchange the OpenShift token with the Microsoft identity platform for a short-lived access token. |
| 70 | +4. Use the access token in the log forwarding request to Azure Monitor Logs. |
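The exchange in steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration in Go, not the collector's implementation (Vector is written in Rust and would delegate this to the `azure_identity` crate). It assumes the OAuth 2.0 client credentials flow with a federated credential, as described in the Microsoft identity platform documentation linked from the Vector API section. The helper names `tokenForm` and `tokenEndpoint`, the token path, and the `https://monitor.azure.com/.default` scope are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"strings"
)

// tokenForm builds the form body of the client-credentials request that
// exchanges a federated (service account) token for an access token.
// Hypothetical helper, for illustration only.
func tokenForm(clientID, federatedToken string) url.Values {
	return url.Values{
		"grant_type":            {"client_credentials"},
		"client_id":             {clientID},
		"client_assertion_type": {"urn:ietf:params:oauth:client-assertion-type:jwt-bearer"},
		"client_assertion":      {strings.TrimSpace(federatedToken)},
		// Assumed scope for Azure Monitor ingestion; verify against the Azure docs.
		"scope": {"https://monitor.azure.com/.default"},
	}
}

// tokenEndpoint returns the Microsoft identity platform v2.0 token endpoint
// for the given tenant.
func tokenEndpoint(tenantID string) string {
	return "https://login.microsoftonline.com/" + url.PathEscape(tenantID) + "/oauth2/v2.0/token"
}

func main() {
	// Placeholder path; the collector would use federated_token_file_path.
	token, err := os.ReadFile("/path/to/my/token")
	if err != nil {
		fmt.Println("token not readable, using a placeholder:", err)
		token = []byte("placeholder-jwt")
	}
	form := tokenForm("11111111-2222-3333-4444-555555555555", string(token))
	fmt.Println("POST", tokenEndpoint("11111111-2222-3333-4444-555555555555"))
	fmt.Println(form.Encode())
}
```

The sketch only prints the request it would send. A real client would POST the form, read `access_token` from the JSON response, and send it as a bearer token in step 4.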
| 71 | + |
| 72 | +The ClusterLogForwarder will: |
| 73 | + |
| 74 | +1. Determine which authentication method to use based on the `type` field of `azureMonitorAuthentication`. |
| 75 | +2. Conditionally project the service account token if the type is `workloadIdentity`. |
| 76 | +3. Create the collector configuration with required fields for the Log Ingestion API along with the path to the projected service account token. |
| 77 | + |
| 78 | +### Proposed API |
| 79 | + |
| 80 | +#### Additional configuration fields for the `azureMonitor` output type to the `ClusterLogForwarder` API |
| 81 | + |
| 82 | +Output |
| 83 | + |
| 84 | +```Go |
| 85 | +type AzureMonitor struct { |
| 86 | + ...// Keep rest of options |
| 87 | + |
| 88 | + // The Data Collection Endpoint or Data Collection Rule logs ingestion endpoint URL. |
| 89 | + // |
| 90 | + // https://learn.microsoft.com/en-us/azure/azure-monitor/logs/logs-ingestion-api-overview#endpoint |
| 91 | + // |
| 92 | + // +kubebuilder:validation:Optional |
| 93 | + // +kubebuilder:validation:XValidation:rule="isURL(self)", message="invalid URL" |
| 94 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Log Ingestion Endpoint",xDescriptors={"urn:alm:descriptor:com.tectonic.ui:text"} |
| 95 | + LogIngestionEndpoint string `json:"logIngestionEndpoint,omitempty"` |
| 96 | + |
| 97 | + // The DCR Immutable ID |
| 98 | + // |
| 99 | + // A unique identifier for the data collection rule. This property and its value are automatically created when the DCR is created. |
| 100 | + // |
| 101 | + // +kubebuilder:validation:Optional |
| 102 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="DCR Immutable ID",xDescriptors={"urn:alm:descriptor:com.tectonic.ui:text"} |
| 103 | + DcrImmutableId string `json:"dcrImmutableId,omitempty"` |
| 104 | + |
| 105 | + // The stream in the DCR that should handle the custom data |
| 106 | + // |
| 107 | + // https://learn.microsoft.com/en-us/azure/azure-monitor/data-collection/data-collection-rule-structure#input-streams |
| 108 | + // |
| 109 | + // +kubebuilder:validation:Optional |
| 110 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Stream Name",xDescriptors={"urn:alm:descriptor:com.tectonic.ui:text"} |
| 111 | + StreamName string `json:"streamName,omitempty"` |
| 112 | +} |
| 113 | +``` |
| 114 | + |
| 115 | +Authentication |
| 116 | + |
| 117 | +```Go |
| 118 | +type AzureMonitorAuthentication struct { |
| 119 | + // Type is the type of Azure authentication to configure. |
| 120 | + // |
| 121 | + // Valid types are: |
| 122 | + // 1. sharedKey |
| 123 | + // 2. workloadIdentity |
| 124 | + // |
| 125 | + // +kubebuilder:validation:Required |
| 126 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Authentication Type" |
| 127 | + Type AzureAuthType `json:"type"` |
| 128 | + |
| 129 | + // Token specifies a bearer token to be used for authenticating requests. |
| 130 | + // |
| 131 | + // +nullable |
| 132 | + // +kubebuilder:validation:Optional |
| 133 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Bearer Token" |
| 134 | + Token *BearerToken `json:"token,omitempty"` |
| 135 | + |
| 136 | + // ClientId points to the secret containing the client ID used for authentication. |
| 137 | + // |
| 138 | + // https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-client-creds-grant-flow#third-case-access-token-request-with-a-federated-credential |
| 139 | + // |
| 140 | + // +kubebuilder:validation:Optional |
| 141 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Client ID",xDescriptors={"urn:alm:descriptor:com.tectonic.ui:text"} |
| 142 | + ClientId *SecretReference `json:"clientId,omitempty"` |
| 143 | + |
| 144 | + // TenantId points to the secret containing the tenant ID used for authentication. |
| 145 | + // |
| 146 | + // https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-client-creds-grant-flow#third-case-access-token-request-with-a-federated-credential |
| 147 | + // |
| 148 | + // +kubebuilder:validation:Optional |
| 149 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Tenant ID",xDescriptors={"urn:alm:descriptor:com.tectonic.ui:text"} |
| 150 | + TenantId *SecretReference `json:"tenantId,omitempty"` |
| 151 | + |
| 152 | + // SharedKey points to the secret containing the shared key used for authenticating requests. |
| 153 | + // |
| 154 | + // +nullable |
| 155 | + // +kubebuilder:validation:Optional |
| 156 | + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Shared Key" |
| 157 | + SharedKey *SecretReference `json:"sharedKey,omitempty"` |
| 158 | +} |
| 159 | +``` |
| 160 | + |
| 161 | +- `TenantId` and `ClientId` can be found in the generated [CCO utility secret](#cco-utility-secret) when OpenShift is set up for [workload identity for Azure](https://github.com/openshift/cloud-credential-operator/blob/9c3346aea5a7f9a38713c09d11605b8ee825446c/docs/azure_workload_identity.md). |
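Because every authentication field except `type` is optional, the operator would likely need a cross-field check. The sketch below shows one possible rule set using simplified stand-in types. The rules themselves (which fields each type requires) are assumptions and are not stated in this proposal. `secretRef`, `azureAuth`, and `validateAuth` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// secretRef is a simplified stand-in for SecretReference.
type secretRef struct{ SecretName, Key string }

// azureAuth is a simplified stand-in for AzureMonitorAuthentication.
// A nil pointer means the field was not set.
type azureAuth struct {
	Type      string
	ClientID  *secretRef
	TenantID  *secretRef
	SharedKey *secretRef
}

// validateAuth checks that the fields needed by the selected type are set.
// The required-field rules here are assumptions for illustration.
func validateAuth(a azureAuth) error {
	switch a.Type {
	case "sharedKey":
		if a.SharedKey == nil {
			return errors.New("sharedKey is required when type is sharedKey")
		}
	case "workloadIdentity":
		if a.ClientID == nil || a.TenantID == nil {
			return errors.New("clientId and tenantId are required when type is workloadIdentity")
		}
	default:
		return fmt.Errorf("unsupported authentication type %q", a.Type)
	}
	return nil
}

func main() {
	ref := &secretRef{SecretName: "azure-test-secret", Key: "azure_client_id"}
	// tenantId is missing here, so an error is reported.
	fmt.Println(validateAuth(azureAuth{Type: "workloadIdentity", ClientID: ref}))
	fmt.Println(validateAuth(azureAuth{Type: "sharedKey", SharedKey: ref}))
}
```

In the real API, the same constraint could instead be expressed as a CEL `XValidation` rule on the struct, which would reject invalid resources at admission time.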
| 162 | + |
| 163 | +#### Additional configuration fields for the Azure Logs Ingestion sink to `Vector` API |
| 164 | + |
| 165 | +```Rust |
| 166 | +pub struct AzureLogsIngestionConfig { |
| 167 | + /// The [Federated Token File Path][federated_token_file_path] pointing to a federated token for authentication. |
| 168 | + /// |
| 169 | + /// [federated_token_file_path]: https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-client-creds-grant-flow#third-case-access-token-request-with-a-federated-credential |
| 170 | + #[configurable(metadata(docs::examples = "/path/to/my/token"))] |
| 171 | + pub federated_token_file_path: Option<String>, |
| 172 | + |
| 173 | + /// The [Tenant ID][tenant_id] for authentication. |
| 174 | + /// |
| 175 | + /// [tenant_id]: https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-client-creds-grant-flow#third-case-access-token-request-with-a-federated-credential |
| 176 | + #[configurable(metadata(docs::examples = "11111111-2222-3333-4444-555555555555"))] |
| 177 | + pub tenant_id: Option<String>, |
| 178 | + |
| 179 | + /// The [Client ID][client_id] for authentication. |
| 180 | + /// |
| 181 | + /// [client_id]: https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-client-creds-grant-flow#third-case-access-token-request-with-a-federated-credential |
| 182 | + #[configurable(metadata(docs::examples = "11111111-2222-3333-4444-555555555555"))] |
| 183 | + pub client_id: Option<String>, |
| 184 | + |
| 185 | + /// The [Credential Kind][credential_kind] for authentication. |
| 186 | + /// |
| 187 | + /// [credential_kind]: https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-client-creds-grant-flow#third-case-access-token-request-with-a-federated-credential |
| 188 | + #[configurable(metadata(docs::examples = "workloadIdentity"))] |
| 189 | + pub credential_kind: String, |
| 190 | +} |
| 191 | +``` |
| 192 | + |
| 193 | +### Implementation Details/Notes/Constraints |
| 194 | + |
| 195 | +1. Relies on [this upstream Vector PR](https://github.com/vectordotdev/vector/pull/22912) to implement the Azure Log Ingestion sink utilizing the Log Ingestion API. |
| 196 | +2. As of 04/29/2025, the `master` branch of [upstream Vector](https://github.com/vectordotdev/vector) (`>=v0.46.1`) uses an older `azure_identity` release that relies solely on environment variables for workload identity credentials, which is not sufficient when forwarding to multiple different Azure sinks. The PR above relies on `azure_identity@0.23.0`. |
| 197 | + - `azure_identity@0.23.0` allows for setting `client_id`, `tenant_id`, etc. for authentication. |
| 198 | + - [Workload Identity Credentials azure_identity@0.23.0 SDK Ref](https://github.com/Azure/azure-sdk-for-rust/blob/azure_identity%400.23.0/sdk/identity/azure_identity/src/credentials/workload_identity_credentials.rs) |
| 199 | +3. The `customer_id` and `log_type` fields can now be optional. |
| 200 | +4. The `shared_key` field can now be optional, because the authentication type is selectable. |
| 201 | +5. Additional fields will be required in sink configuration in Vector's API. See [proposed API](#additional-configuration-fields-for-the-azure-logs-ingestion-sink-to-vector-api) above. |
| 202 | + |
| 203 | +#### CCO Utility Secret |
| 204 | + |
| 205 | +After a managed identity is created with the CCO utility, the following secret is generated and can be used for authentication: |
| 206 | + |
| 207 | +```yaml |
| 208 | +apiVersion: v1 |
| 209 | +kind: Secret |
| 210 | +metadata: |
| 211 | + name: azure-test-secret |
| 212 | + namespace: openshift-logging |
| 213 | +type: Opaque |
| 214 | +stringData: |
| 215 | + azure_client_id: 11111111-2222-3333-4444-555555555555 |
| 216 | + azure_tenant_id: 11111111-2222-3333-4444-555555555555 |
| 217 | + azure_region: westus |
| 218 | + azure_subscription_id: 11111111-2222-3333-4444-555555555555 |
| 219 | + azure_federated_token_file: /path/to/serviceaccount/token |
| 220 | + |
| 221 | +``` |
| 222 | + |
| 223 | +### Open Questions |
| 224 | + |
| 225 | +1. Do we also want to implement long-lived credential support using the Log Ingestion API? |
| 226 | +2. Do we want to start deprecating the fields for the HTTP data collector API? |
| 227 | + |
| 228 | +### Test Plan |
| 229 | + |
| 230 | +- Manual E2E tests: Requires access to Azure accounts and an OpenShift cluster configured to use Azure's workload identity. |
| 231 | + |
| 232 | +## Alternatives (Not Implemented) |
| 233 | + |
| 234 | +### Vector's Azure Identity Rust SDK |
| 235 | + |
| 236 | +- Add a patch to `azure_identity` crate to allow setting `client_id`, `tenant_id`, etc. instead of relying on environment variables for workload identity credentials until Vector updates the crate version. See [implementation details](#implementation-detailsnotesconstraints). |
| 237 | + |
| 238 | +### Risks and Mitigations |
| 239 | + |
| 240 | +### Drawbacks |
| 241 | + |
| 242 | +## Design Details |
| 243 | + |
| 244 | +### Graduation Criteria |
| 245 | + |
| 246 | +#### Dev Preview -> Tech Preview |
| 247 | + |
| 248 | +#### Tech Preview -> GA |
| 249 | + |
| 250 | +#### Removing a deprecated feature |
| 251 | + |
| 252 | +### Upgrade / Downgrade Strategy |
| 253 | + |
| 254 | +### Version Skew Strategy |
| 255 | + |
| 256 | +### Operational Aspects of API Extensions |
| 257 | + |
| 258 | +#### Failure Modes |
| 259 | + |
| 260 | +#### Support Procedures |
| 261 | + |
| 262 | +## Implementation History |
| 263 | + |
| 264 | +### API Extensions |