Commit 5363a71

Authored by zmoog, felixbarny, and ChrsMark
[Kubernetes] Reroute container logs based on pod annotations (#7118)
* Add pipeline failure handler

* Set service.name and service.version

`service.name` should use the value from the label `app.kubernetes.io/name` first, and then fall back to `kubernetes.container.name` if the label is not present. I need to double-check whether I can use the container name as is or if I need to parse it in some form.

`service.version` uses the value from the label `app.kubernetes.io/version`, if present.

* Add the routing rules

Instead of using the resource processor, we will use the new routing rules[^1] available in 8.10.

The routing rules allow Fleet to build a better pipeline where the custom pipeline is executed *before* routing the document to a different dataset or namespace.

Here's an example of the final pipeline created by Fleet after the integration installation:

```json
[
  {
    "set": {
      "field": "service.name",
      "copy_from": "kubernetes.labels.app_kubernetes_io/name",
      "ignore_empty_value": true
    }
  },
  {
    "set": {
      "field": "service.name",
      "copy_from": "kubernetes.container.name",
      "override": false,
      "ignore_empty_value": true
    }
  },
  {
    "set": {
      "field": "service.version",
      "copy_from": "kubernetes.labels.app_kubernetes_io/version",
      "ignore_empty_value": true
    }
  },
  {
    "pipeline": {
      "name": "logs-kubernetes.container_logs@custom",
      "ignore_missing_pipeline": true
    }
  },
  {
    "reroute": {
      "tag": "kubernetes.container_logs",
      "dataset": [
        "{{kubernetes.labels.elastic_co/dataset}}",
        "{{data_stream.dataset}}",
        "kubernetes.container_logs"
      ],
      "namespace": [
        "{{kubernetes.labels.elastic_co/namespace}}",
        "{{data_stream.namespace}}",
        "default"
      ],
      "if": "ctx?.kubernetes?.labels != null"
    }
  }
]
```

We upgrade the package-spec to 2.9.0 to enable the routing rules.

[^1]: elastic/package-spec#514

refs: #7118

* Expand the rerouting docs

The docs now focus on describing what the routing offers and how users can customize it by setting pod annotations. We offer an example at definition time (using a deployment) and at runtime (using `kubectl`).

* Mention container-logs routing in the README

The main README file is what most users will see before and after installing the integration. Adding a short mention of the container-logs routing capability, with a link to the complete docs, could improve the discoverability of this feature without too much noise.

* Docs: add a namespace customization example

Show how to customize the namespace by setting a label on the pod.

* Update docs

Rephrase the Nginx example to avoid ambiguity; the Nginx integration is not required for routing.

Update the pod labels table to avoid ambiguity about the target namespace; it's the data stream namespace, not the k8s namespace.

* Switch from labels to annotations

We learned that Kubernetes annotations are the correct representation for data such as routing rules. The annotations docs[^1] mention the following use case for annotations:

> "Directives from the end-user to the implementations to modify
> behavior or engage non-standard features."

So, we switch from labels to annotations.

Unfortunately, the Kubernetes provider[^2] does not add annotations to the event out-of-the-box, and we can't enable this on Fleet-managed agents. So, we decided to make the relevant annotations available in the event by adding fields[^3] using Filebeat processors.

We decided to keep the `app.kubernetes.io/name` and `app.kubernetes.io/version` metadata as labels since the Recommended Labels[^4] document mentions them.

[^1]: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/#attaching-metadata-to-objects
[^2]: https://www.elastic.co/guide/en/fleet/current/kubernetes-provider.html
[^3]: https://www.elastic.co/guide/en/beats/filebeat/current/add-fields.html
[^4]: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/

* Explain WHY we added processors for annotations

Add a simple section that explains why we had to add a few Filebeat processors to *export* routing-focused annotations from the Kubernetes provider to the event.

---------

Co-authored-by: Felix Barnsteiner <[email protected]>
Co-authored-by: Chris Mark <[email protected]>
1 parent c3036f2 commit 5363a71

14 files changed

+360
-2
lines changed


packages/kubernetes/_dev/build/docs/README.md

Lines changed: 8 additions & 0 deletions
@@ -84,6 +84,14 @@ the masters won't be visible. In these cases it won't be possible to use `schedu
 The container-logs dataset requires access to the log files in each Kubernetes node where the container logs are stored.
 This defaults to `/var/log/containers/*${kubernetes.container.id}.log`.
 
+#### Routing
+
+The container-logs data stream allows routing logs to a different *dataset* or *namespace* using pod annotations.
+
+For example, suppose you are running Nginx on your Kubernetes cluster, and you want to drive the Nginx container logs into a dedicated dataset or namespace. By annotating the pod with `elastic.co/namespace: nginx`, the integration will send all the container logs to the `nginx` namespace.
+
+To learn more about routing container-logs, see https://docs.elastic.co/integrations/kubernetes/container-logs.
+
 ### audit-logs
 
 The audit-logs dataset requires access to the log files on each Kubernetes node where the audit logs are stored.

packages/kubernetes/_dev/build/docs/container-logs.md

Lines changed: 78 additions & 0 deletions
@@ -6,3 +6,81 @@ It requires access to the log files in each Kubernetes node where the container
 This defaults to `/var/log/containers/*${kubernetes.container.id}.log`.
 
 By default only {{ url "filebeat-input-filestream-parsers" "container parser" }} is enabled. Additional log parsers can be added as an advanced options configuration.
+
+
+## Rerouting based on pod annotations
+
+You can customize the routing of container log events and send them to different datasets and namespaces using pod annotations.
+
+Routing customization can happen at:
+
+- pod definition time, e.g., using a deployment.
+- pod runtime, annotating pods using `kubectl`.
+
+
+### Set routing at pod definition time
+
+Here is an example of an Nginx deployment where we set both the `elastic.co/dataset` and `elastic.co/namespace` annotations to route the container logs to a specific dataset and namespace, respectively.
+
+```yaml
+# nginx-deployment.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: nginx-deployment
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: nginx
+  template:
+    metadata:
+      annotations:
+        elastic.co/dataset: kubernetes.container_logs.nginx
+        elastic.co/namespace: nginx
+      labels:
+        app: nginx
+        app.kubernetes.io/name: myservice
+        app.kubernetes.io/version: v0.1.2
+        app.kubernetes.io/instance: myservice-abcxzy
+    spec:
+      containers:
+        - name: nginx-container
+          image: nginx:latest
+          ports:
+            - containerPort: 80
+```
+
+
+### Set routing at runtime
+
+Suppose you want to change the container logs routing on a running container. In that case, you can annotate the pod using `kubectl`, and the integration will apply the change immediately, sending all subsequent documents to the new destination.
+
+Here is an example where we route the container logs for a pod running the Elastic Agent to the `kubernetes.container_logs.agents` dataset:
+
+```shell
+kubectl annotate pods elastic-agent-managed-daemonset-6p22g elastic.co/dataset=kubernetes.container_logs.agents
+```
+
+Here's a similar example that changes the namespace:
+
+```shell
+kubectl annotate pods elastic-agent-managed-daemonset-6p22g elastic.co/namespace=nginx
+```
+
+You can restore the standard routing by removing the annotations:
+
+```shell
+kubectl annotate pods elastic-agent-managed-daemonset-6p22g elastic.co/dataset-
+kubectl annotate pods elastic-agent-managed-daemonset-6p22g elastic.co/namespace-
+```
+
+### Annotations Reference
+
+Here are the annotations available to customize routing:
+
+
+| Annotation             | Description                                              |
+| ---------------------- | -------------------------------------------------------- |
+| `elastic.co/dataset`   | Defines the target data stream's dataset for this pod.   |
+| `elastic.co/namespace` | Defines the target data stream's namespace for this pod. |
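
The `kubectl annotate` commands documented in this file follow a small, regular pattern: `key=value` sets an annotation and a trailing `key-` removes it. As a minimal sketch, the hypothetical helper below (`kubectl_annotate` is not part of the integration) builds those argument lists in Python. The `overwrite` parameter is an assumption about how you might use it: it maps to kubectl's `--overwrite` flag, which is needed when the annotation already has a different value.

```python
from typing import List, Optional


def kubectl_annotate(pod: str, key: str, value: Optional[str] = None,
                     overwrite: bool = False) -> List[str]:
    """Build the argv for `kubectl annotate pods`.

    Passing value=None produces the removal form (`key-`).
    This helper is illustrative only and does not run kubectl.
    """
    args = ["kubectl", "annotate", "pods", pod]
    if value is None:
        # A trailing dash asks kubectl to remove the annotation.
        args.append(f"{key}-")
    else:
        args.append(f"{key}={value}")
        if overwrite:
            # Required by kubectl when the annotation already exists.
            args.append("--overwrite")
    return args


if __name__ == "__main__":
    pod = "elastic-agent-managed-daemonset-6p22g"
    print(" ".join(kubectl_annotate(pod, "elastic.co/namespace", "nginx")))
    print(" ".join(kubectl_annotate(pod, "elastic.co/namespace")))
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting issues with annotation values.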

packages/kubernetes/changelog.yml

Lines changed: 5 additions & 0 deletions
@@ -1,4 +1,9 @@
 # newer versions go on top
+- version: "1.45.0"
+  changes:
+    - description: Reroute container logs based on pod annotations.
+      type: enhancement
+      link: https://github.com/elastic/integrations/pull/7118
 - version: "1.44.0"
   changes:
     - description: Introducing kubernetes.deployment.status.* metrics
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+fields:
+  data_stream:
+    dataset: kubernetes.container_logs
+    namespace: default
+  kubernetes:
+    annotations:
+      elastic_co/dataset: kubernetes.container_logs.nginx
+      elastic_co/namespace: nginx
+    labels:
+      app_kubernetes_io/version: "v0.1.0"
+      app_kubernetes_io/name: "myservice"
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+2023/07/25 15:24:11 [notice] 1#1: start worker process 33
+2023/07/25 15:24:11 [notice] 1#1: start worker process 34
+2023/07/25 15:24:11 [notice] 1#1: start worker process 35
+2023/07/25 15:24:11 [notice] 1#1: using the "epoll" event method
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+{
+    "expected": [
+        {
+            "data_stream": {
+                "dataset": "kubernetes.container_logs.nginx",
+                "namespace": "nginx",
+                "type": "logs"
+            },
+            "kubernetes": {
+                "annotations": {
+                    "elastic_co/dataset": "kubernetes.container_logs.nginx",
+                    "elastic_co/namespace": "nginx"
+                },
+                "labels": {
+                    "app_kubernetes_io/name": "myservice",
+                    "app_kubernetes_io/version": "v0.1.0"
+                }
+            },
+            "message": "2023/07/25 15:24:11 [notice] 1#1: start worker process 33",
+            "service": {
+                "name": "myservice",
+                "version": "v0.1.0"
+            }
+        },
+        {
+            "data_stream": {
+                "dataset": "kubernetes.container_logs.nginx",
+                "namespace": "nginx",
+                "type": "logs"
+            },
+            "kubernetes": {
+                "annotations": {
+                    "elastic_co/dataset": "kubernetes.container_logs.nginx",
+                    "elastic_co/namespace": "nginx"
+                },
+                "labels": {
+                    "app_kubernetes_io/name": "myservice",
+                    "app_kubernetes_io/version": "v0.1.0"
+                }
+            },
+            "message": "2023/07/25 15:24:11 [notice] 1#1: start worker process 34",
+            "service": {
+                "name": "myservice",
+                "version": "v0.1.0"
+            }
+        },
+        {
+            "data_stream": {
+                "dataset": "kubernetes.container_logs.nginx",
+                "namespace": "nginx",
+                "type": "logs"
+            },
+            "kubernetes": {
+                "annotations": {
+                    "elastic_co/dataset": "kubernetes.container_logs.nginx",
+                    "elastic_co/namespace": "nginx"
+                },
+                "labels": {
+                    "app_kubernetes_io/name": "myservice",
+                    "app_kubernetes_io/version": "v0.1.0"
+                }
+            },
+            "message": "2023/07/25 15:24:11 [notice] 1#1: start worker process 35",
+            "service": {
+                "name": "myservice",
+                "version": "v0.1.0"
+            }
+        },
+        {
+            "data_stream": {
+                "dataset": "kubernetes.container_logs.nginx",
+                "namespace": "nginx",
+                "type": "logs"
+            },
+            "kubernetes": {
+                "annotations": {
+                    "elastic_co/dataset": "kubernetes.container_logs.nginx",
+                    "elastic_co/namespace": "nginx"
+                },
+                "labels": {
+                    "app_kubernetes_io/name": "myservice",
+                    "app_kubernetes_io/version": "v0.1.0"
+                }
+            },
+            "message": "2023/07/25 15:24:11 [notice] 1#1: using the \"epoll\" event method",
+            "service": {
+                "name": "myservice",
+                "version": "v0.1.0"
+            }
+        }
+    ]
+}

packages/kubernetes/data_stream/container_logs/agent/stream/stream.yml.hbs

Lines changed: 36 additions & 1 deletion
@@ -15,8 +15,43 @@ parsers:
     format: {{ containerParserFormat }}
 {{ additionalParsersConfig }}
 
-{{#if processors}}
 processors:
+{{!
+Why do we need to add the following processors?
+-----------------------------------------------
+
+The kubernetes provider supports[^1] pods annotations, making it possible to add
+them to the event using the `include_annotations` configuration option.
+
+However, adding annotations to the event is disabled by default, and it is
+not possible to enable it on Fleet-managed agents.
+
+The following processors are a workaround to add the annotations to the event
+without using the `include_annotations` configuration option.
+
+
+[^1]: https://github.com/elastic/elastic-agent/blob/37ec2bb7ee1d2cc6c0fccf2f0cd0a44eb3d61efd/internal/pkg/composable/providers/kubernetes/pod.go#L311-L315
+}}
+- add_fields:
+    target: kubernetes
+    fields:
+      annotations.elastic_co/dataset: ${kubernetes.annotations.elastic.co/dataset|""}
+      annotations.elastic_co/namespace: ${kubernetes.annotations.elastic.co/namespace|""}
+- drop_fields:
+    fields:
+      - kubernetes.annotations.elastic_co/dataset
+    when:
+      equals:
+        kubernetes.annotations.elastic_co/dataset: ""
+    ignore_missing: true
+- drop_fields:
+    fields:
+      - kubernetes.annotations.elastic_co/namespace
+    when:
+      equals:
+        kubernetes.annotations.elastic_co/namespace: ""
+    ignore_missing: true
+{{#if processors}}
 {{processors}}
 {{/if}}
 
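
The workaround in this template has two steps: `add_fields` always writes both routing annotations, using an empty string when the pod does not define one, and the `drop_fields` processors then remove any annotation that is still empty. The following is a minimal Python sketch of that add-then-drop logic, not the Beats implementation. The function name `export_routing_annotations` and the plain-dict event model are assumptions made for illustration.

```python
ROUTING_ANNOTATIONS = ("elastic.co/dataset", "elastic.co/namespace")


def export_routing_annotations(event: dict, pod_annotations: dict) -> dict:
    """Copy routing annotations into the event, then drop the empty ones."""
    annotations = event.setdefault("kubernetes", {}).setdefault("annotations", {})

    # Step 1 (add_fields): always write both fields, defaulting to "".
    # Dots in the annotation name become underscores in the event field.
    for name in ROUTING_ANNOTATIONS:
        annotations[name.replace(".", "_")] = pod_annotations.get(name, "")

    # Step 2 (drop_fields): remove fields whose value is still "".
    for name in ROUTING_ANNOTATIONS:
        key = name.replace(".", "_")
        if annotations.get(key) == "":
            del annotations[key]

    return event


if __name__ == "__main__":
    print(export_routing_annotations({}, {"elastic.co/namespace": "nginx"}))
```

In this sketch only the annotations that are actually set on the pod survive, so the routing rules never see an empty dataset or namespace value.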

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+---
+description: Pipeline for Kubernetes container logs
+processors:
+  - set:
+      field: service.name
+      copy_from: kubernetes.labels.app_kubernetes_io/name
+      ignore_empty_value: true
+  - set:
+      field: service.name
+      copy_from: kubernetes.container.name
+      override: false
+      ignore_empty_value: true
+  - set:
+      field: service.version
+      copy_from: kubernetes.labels.app_kubernetes_io/version
+      ignore_empty_value: true
+on_failure:
+  - set:
+      field: event.kind
+      value: pipeline_error
+  - append:
+      field: error.message
+      value: '{{{ _ingest.on_failure_message }}}'
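
The three `set` processors implement a fallback: `service.name` comes from the `app.kubernetes.io/name` label when present, otherwise from the container name, and `service.version` is only set when the version label exists. Below is a minimal Python sketch of that behavior. It is a simplified model, not Elasticsearch code: documents are assumed to be flat dicts keyed by dotted field paths, and `set_processor` and `apply_service_fields` are hypothetical names.

```python
def set_processor(doc: dict, field: str, copy_from: str,
                  override: bool = True, ignore_empty_value: bool = False) -> None:
    """Simplified model of an ingest `set` processor that uses `copy_from`."""
    value = doc.get(copy_from)
    if ignore_empty_value and value in (None, ""):
        return  # source is missing or empty: leave the document untouched
    if not override and doc.get(field) is not None:
        return  # target already has a value and override is disabled
    doc[field] = value


def apply_service_fields(doc: dict) -> dict:
    """Apply the three `set` steps in the same order as the pipeline."""
    set_processor(doc, "service.name",
                  "kubernetes.labels.app_kubernetes_io/name",
                  ignore_empty_value=True)
    set_processor(doc, "service.name", "kubernetes.container.name",
                  override=False, ignore_empty_value=True)
    set_processor(doc, "service.version",
                  "kubernetes.labels.app_kubernetes_io/version",
                  ignore_empty_value=True)
    return doc


if __name__ == "__main__":
    print(apply_service_fields({"kubernetes.container.name": "nginx-container"}))
```

The order matters: because the second step uses `override: false`, the container name is only a fallback and never replaces a value taken from the label.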

packages/kubernetes/data_stream/container_logs/fields/ecs.yml

Lines changed: 4 additions & 0 deletions
@@ -22,3 +22,7 @@
   name: orchestrator.cluster.name
 - external: ecs
   name: orchestrator.cluster.url
+- external: ecs
+  name: service.name
+- external: ecs
+  name: service.version

packages/kubernetes/data_stream/container_logs/manifest.yml

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 title: "Kubernetes container logs"
 type: logs
+dataset: kubernetes.container_logs
 streams:
   - input: filestream
     title: Collect Kubernetes container logs
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# Route container logs events to the correct dataset and namespace
+# based on pod annotations.
+- source_dataset: kubernetes.container_logs
+  rules:
+    - target_dataset:
+        - "{{kubernetes.annotations.elastic_co/dataset}}"
+        - "{{data_stream.dataset}}"
+      namespace:
+        - "{{kubernetes.annotations.elastic_co/namespace}}"
+        - "{{data_stream.namespace}}"
+      if: "ctx.kubernetes?.annotations != null"
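
Each rule lists candidate values in priority order: the annotation first, then the document's current `data_stream` value. The Python sketch below illustrates that first-match resolution under stated assumptions: documents are flat dicts keyed by dotted field paths, candidates that are missing or not strings are skipped, empty annotations are assumed to have been dropped earlier by the agent processors, and the sanitization of invalid characters is not modeled. The names `resolve` and `route` are hypothetical.

```python
import re
from typing import List, Optional

# Matches a whole-string field reference such as "{{data_stream.dataset}}".
_FIELD_REF = re.compile(r"^\{\{(.+)\}\}$")


def resolve(candidates: List[str], doc: dict) -> Optional[str]:
    """Return the first candidate that resolves to a string value."""
    for candidate in candidates:
        match = _FIELD_REF.match(candidate)
        value = doc.get(match.group(1)) if match else candidate
        if isinstance(value, str):
            return value
    return None


def route(doc: dict) -> str:
    """Compute the target data stream name for a container log document."""
    dataset = doc["data_stream.dataset"]
    namespace = doc["data_stream.namespace"]
    # Mirrors the rule condition: only reroute when annotations are present.
    if any(key.startswith("kubernetes.annotations.") for key in doc):
        dataset = resolve(["{{kubernetes.annotations.elastic_co/dataset}}",
                           "{{data_stream.dataset}}"], doc) or dataset
        namespace = resolve(["{{kubernetes.annotations.elastic_co/namespace}}",
                             "{{data_stream.namespace}}"], doc) or namespace
    return f"logs-{dataset}-{namespace}"


if __name__ == "__main__":
    print(route({
        "data_stream.dataset": "kubernetes.container_logs",
        "data_stream.namespace": "default",
        "kubernetes.annotations.elastic_co/namespace": "nginx",
    }))
```

With only the namespace annotation set, the dataset falls back to the document's own `data_stream.dataset`, so the two annotations can be used independently.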

packages/kubernetes/docs/README.md

Lines changed: 8 additions & 0 deletions
@@ -84,6 +84,14 @@ the masters won't be visible. In these cases it won't be possible to use `schedu
 The container-logs dataset requires access to the log files in each Kubernetes node where the container logs are stored.
 This defaults to `/var/log/containers/*${kubernetes.container.id}.log`.
 
+#### Routing
+
+The container-logs data stream allows routing logs to a different *dataset* or *namespace* using pod annotations.
+
+For example, suppose you are running Nginx on your Kubernetes cluster, and you want to drive the Nginx container logs into a dedicated dataset or namespace. By annotating the pod with `elastic.co/namespace: nginx`, the integration will send all the container logs to the `nginx` namespace.
+
+To learn more about routing container-logs, see https://docs.elastic.co/integrations/kubernetes/container-logs.
+
 ### audit-logs
 
 The audit-logs dataset requires access to the log files on each Kubernetes node where the audit logs are stored.
