Commit 8fe73a1

zmoog and jsoriano authored
Render Mustache expressions in routing rules datasets values (#1393)
* Render Mustache expressions in datasets value

  This is a quick PoC to test using the github.com/hoisie/mustache library to render Mustache expressions in datasets values.

  Datasets values containing Mustache expressions like {{kubernetes.labels.elastic_co/dataset}} are rendered using document values.

  Currently, the main limitation is that the validator uses processed documents (docs returned by the ingest pipeline), whereas for this purpose it should use the original documents (docs read from the test case file).

* Add tests for dataset expressions rendering

* Switch to the github.com/cbroglie/mustache library

  I just noticed the original github.com/hoisie/mustache seems unmaintained (last commit happened on Aug 5, 2016). So I switched to github.com/cbroglie/mustache, a fork of the same repo that looks active. They added essential improvements, like returning an error from the `Render()` API.

* Fix routing_rules.yml format

  I didn't run `elastic-package check` in the kubernetes test package. Without a proper format, the test detects a dirty git working directory at the end of the test run.

* Remove the kubernetes test package

  We already have one at test/packages/with-kind/kubernetes, so this one is redundant. Dropping it!

* Add a test rule with expression

  To test that routing rules work with datasets and namespaces, we add another rule that uses `{{cloud.account.id}}` for both if the `cloud.region` is `eu-west-1`.

* Update expression tests

  Changing the expression test to make it more realistic. The main challenge here is to find a source event field value that can be used as a dataset or namespace: it can't contain `-`.

* Update internal/fields/validate.go

  Co-authored-by: Jaime Soriano Pastor <[email protected]>

* Make static checker happy

* Make more realistic tests adding a `labels` field

  Even if Firehose currently does not provide an `aws.labels` field, using it in the routing rules tests is probably less confusing than using the field `{{cloud.account.id}}` as target dataset or namespace.

---------

Co-authored-by: Jaime Soriano Pastor <[email protected]>
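The rendering idea described in the commit message can be illustrated with a minimal, standard-library-only sketch. It is not the github.com/cbroglie/mustache implementation: `lookup`, `renderExpected`, and `tagPattern` are hypothetical names, only plain `{{dotted.path}}` variable tags are handled, no HTML escaping is applied, and missing fields are assumed to render as an empty string.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagPattern matches a plain {{dotted.path}} variable tag (hypothetical helper,
// far simpler than a real Mustache parser).
var tagPattern = regexp.MustCompile(`\{\{\s*([^{}\s]+)\s*\}\}`)

// lookup walks a dotted path through nested maps and reports whether it exists.
func lookup(doc map[string]any, path string) (any, bool) {
	var current any = doc
	for _, key := range strings.Split(path, ".") {
		m, ok := current.(map[string]any)
		if !ok {
			return nil, false
		}
		current, ok = m[key]
		if !ok {
			return nil, false
		}
	}
	return current, true
}

// renderExpected replaces every {{path}} tag with the matching document value.
// Assumption: a missing path renders as an empty string.
func renderExpected(template string, doc map[string]any) string {
	return tagPattern.ReplaceAllStringFunc(template, func(tag string) string {
		path := tagPattern.FindStringSubmatch(tag)[1]
		value, ok := lookup(doc, path)
		if !ok {
			return ""
		}
		return fmt.Sprint(value)
	})
}

func main() {
	doc := map[string]any{
		"aws": map[string]any{
			"labels": map[string]any{"elastic_co/dataset": "mydataset"},
		},
	}
	// An expression is resolved against the document.
	fmt.Println(renderExpected("{{aws.labels.elastic_co/dataset}}", doc)) // prints "mydataset"
	// A static dataset name passes through unchanged.
	fmt.Println(renderExpected("aws.cloudtrail", doc)) // prints "aws.cloudtrail"
}
```

Splitting the path on `.` only is what lets a key such as `elastic_co/dataset` resolve as a single map key in this sketch.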
1 parent 71c04d7 commit 8fe73a1

7 files changed: +113 -3 lines changed

go.mod

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ require (
 	github.com/ProtonMail/gopenpgp/v2 v2.7.2
 	github.com/aymerick/raymond v2.0.2+incompatible
 	github.com/boumenot/gocover-cobertura v1.2.0
+	github.com/cbroglie/mustache v1.4.0
 	github.com/cespare/xxhash/v2 v2.2.0
 	github.com/dustin/go-humanize v1.0.1
 	github.com/elastic/elastic-integration-corpus-generator-tool v0.5.0

go.sum

Lines changed: 2 additions & 0 deletions
@@ -51,6 +51,8 @@ github.com/aymerick/raymond v2.0.2+incompatible/go.mod h1:osfaiScAUVup+UC9Nfq76e
 github.com/boumenot/gocover-cobertura v1.2.0 h1:g+VROIASoEHBrEilIyaCmgo7HGm+AV5yKEPLk0qIY+s=
 github.com/boumenot/gocover-cobertura v1.2.0/go.mod h1:fz7ly8dslE42VRR5ZWLt2OHGDHjkTiA2oNvKgJEjLT0=
 github.com/bwesterb/go-ristretto v1.2.3/go.mod h1:fUIoIZaG73pV5biE2Blr2xEzDoMj7NFEuV9ekS419A0=
+github.com/cbroglie/mustache v1.4.0 h1:Azg0dVhxTml5me+7PsZ7WPrQq1Gkf3WApcHMjMprYoU=
+github.com/cbroglie/mustache v1.4.0/go.mod h1:SS1FTIghy0sjse4DUVGV1k/40B1qE1XkD9DtDsHo9iM=
 github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU=
 github.com/cespare/xxhash/v2 v2.2.0 h1:DC2CZ1Ep5Y4k3ZQ899DldepgrayRUGE6BBZ/cd9Cj44=
 github.com/cespare/xxhash/v2 v2.2.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=

internal/fields/validate.go

Lines changed: 24 additions & 2 deletions
@@ -18,6 +18,7 @@ import (
 	"strings"
 
 	"github.com/Masterminds/semver/v3"
+	"github.com/cbroglie/mustache"
 	"gopkg.in/yaml.v3"
 
 	"github.com/elastic/elastic-package/internal/common"
@@ -316,12 +317,33 @@ func (v *Validator) validateDocumentValues(body common.MapStr) multierror.Error
 	if !v.specVersion.LessThan(semver2_0_0) && v.expectedDatasets != nil {
 		for _, datasetField := range datasetFieldNames {
 			value, err := body.GetValue(datasetField)
-			if err == common.ErrKeyNotFound {
+			if errors.Is(err, common.ErrKeyNotFound) {
 				continue
 			}
 
+			// Why do we render the expected datasets here?
+			// Because the expected datasets can contain
+			// mustache templates, and not just static
+			// strings.
+			//
+			// For example, the expected datasets for the
+			// Kubernetes container logs dataset can be:
+			//
+			//   - "{{kubernetes.labels.elastic_co/dataset}}"
+			//
+			var renderedExpectedDatasets []string
+			for _, dataset := range v.expectedDatasets {
+				renderedDataset, err := mustache.Render(dataset, body)
+				if err != nil {
+					err := fmt.Errorf("can't render expected dataset %q: %w", dataset, err)
+					errs = append(errs, err)
+					return errs
+				}
+				renderedExpectedDatasets = append(renderedExpectedDatasets, renderedDataset)
+			}
+
 			str, ok := valueToString(value, v.disabledNormalization)
-			exists := stringInArray(str, v.expectedDatasets)
+			exists := stringInArray(str, renderedExpectedDatasets)
 			if !ok || !exists {
 				err := fmt.Errorf("field %q should have value in %q, it has \"%v\"",
 					datasetField, v.expectedDatasets, value)
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+{
+    "events": [
+        {
+            "cloud": {
+                "region": "us-east-2",
+                "account": {
+                    "id": "428152502467"
+                },
+                "provider": "aws"
+            },
+            "aws": {
+                "firehose": {
+                    "arn": "arn:aws:firehose:us-east-2:428152502467:deliverystream/test-vpcflow-logs",
+                    "request_id": "1cfbed13-d631-4b8b-b20a-b7c5bf8fcd00"
+                },
+                "kinesis": {
+                    "name": "test-vpcflow-logs",
+                    "type": "deliverystream"
+                },
+                "labels": {
+                    "elastic_co/dataset": "mydataset",
+                    "elastic_co/namespace": "mynamespace"
+                }
+            },
+            "data_stream": {
+                "namespace": "default",
+                "type": "logs",
+                "dataset": "generic"
+            },
+            "@timestamp": "2023-08-23T16:47:26Z",
+            "message": "{\"message\":\"2 428152502467 eni-0b584e1c714317ac6 176.111.174.91 10.0.0.102 41536 1135 6 1 40 1692809104 1692809162 REJECT OK\"}\n"
+        }
+    ]
+}
@@ -0,0 +1,37 @@
+{
+    "expected": [
+        {
+            "@timestamp": "2023-08-23T16:47:26Z",
+            "aws": {
+                "firehose": {
+                    "arn": "arn:aws:firehose:us-east-2:428152502467:deliverystream/test-vpcflow-logs",
+                    "request_id": "1cfbed13-d631-4b8b-b20a-b7c5bf8fcd00"
+                },
+                "kinesis": {
+                    "name": "test-vpcflow-logs",
+                    "type": "deliverystream"
+                },
+                "labels": {
+                    "elastic_co/dataset": "mydataset",
+                    "elastic_co/namespace": "mynamespace"
+                }
+            },
+            "cloud": {
+                "account": {
+                    "id": "428152502467"
+                },
+                "provider": "aws",
+                "region": "us-east-2"
+            },
+            "data_stream": {
+                "dataset": "mydataset",
+                "namespace": "mynamespace",
+                "type": "logs"
+            },
+            "ecs": {
+                "version": "8.0.0"
+            },
+            "message": "{\"message\":\"2 428152502467 eni-0b584e1c714317ac6 176.111.174.91 10.0.0.102 41536 1135 6 1 40 1692809104 1692809162 REJECT OK\"}\n"
+        }
+    ]
+}

test/packages/parallel/awsfirehose/data_stream/log/fields/fields.yml

Lines changed: 9 additions & 0 deletions
@@ -38,3 +38,12 @@
           type: keyword
           description: |-
             Kinesis type.
+    - name: labels
+      type: object
+      fields:
+        - name: elastic_co/dataset
+          type: keyword
+          description: Fictional field to test datasets routing rules.
+        - name: elastic_co/namespace
+          type: keyword
+          description: Fictional field to test namespaces routing rules.
Lines changed: 6 additions & 1 deletion
@@ -1,6 +1,11 @@
 - source_dataset: awsfirehose.log
   rules:
+    - target_dataset:
+        - "{{aws.labels.elastic_co/dataset}}"
+      namespace:
+        - "{{aws.labels.elastic_co/namespace}}"
+      if: "ctx.aws?.labels != null"
     - target_dataset: aws.cloudtrail
-      if: ctx['aws.cloudwatch.log_stream'].contains('CloudTrail')
+      if: ctx['aws.cloudwatch.log_stream'].contains('CloudTrail') == true
       namespace:
         - default
