feat(codec,sources): influxdb line protcol decoder #19637

MichaHoffmann · 2024-01-17T07:29:44Z

pront

Hello @MichaHoffmann, thank you for this PR! I know this is a late response but this fell through the cracks. Left a few comments.

lib/codecs/src/decoding/format/influxdb.rs

pront · 2024-02-12T14:06:05Z

lib/codecs/src/decoding/format/influxdb.rs

+    }
+}
+
+#[cfg(test)]


Can we add a test per metric type?

Based on this comment, it should be okay to use the gauge data type for Influx's 'measurement,' right? Since we don't have encoding for InfluxDB output.

Should we treat other metric types as an error?

@pront Sorry to get back to you. Do we need that? The format for Influx line protocol is like this and doesn't specify any metric types: cpu,host=A,region=west usage_system=64i,usage_user=10i 1590488773254420000. We're just parsing it into the gauge metric type. Since the input is always like this, does the metric type matter? I fixed the formatting issue. In the upcoming PR, I can add support for InfluxDB encoding so we can specify the metric type tag and support all metric types. Is it okay to add encoding for InfluxDB line protocol?

pront · 2024-02-12T14:07:41Z

Also, please add a changelog fragment here: https://github.com/vectordotdev/vector/tree/master/changelog.d

RedForza · 2024-03-20T08:32:15Z

Hello! Just stumbled onto this PR. I am very interested into Influx line protocol support for vector and looking at the issues list others are too.

@MichaHoffmann Do you have a timeline when the outstanding comments could be resolved? Appreciate the work done so far.

MichaHoffmann · 2024-07-02T12:29:28Z

Hello! Just stumbled onto this PR. I am very interested into Influx line protocol support for vector and looking at the issues list others are too.

@MichaHoffmann Do you have a timeline when the outstanding comments could be resolved? Appreciate the work done so far.

Ah I didnt continue to work on it because I had the impression that we cannot use kafka as datasource since it forces the type to be "log". Ever since I have changed employers and since this was work related, I didnt think about it much anymore. I dont want to squat this issue either, if someone wants to pick it up, thats would be perfect.

sebinsunny · 2024-07-06T12:12:56Z

Also, please add a changelog fragment here: https://github.com/vectordotdev/vector/tree/master/changelog.d

@pront added the changelog, please review this PR, thanks

jszwedko

Thanks for this contribution!

jszwedko · 2024-07-19T13:30:47Z

@sebinsunny thanks for iterating on this. A small request to try to avoid force pushing. It makes viewing the change history more difficult.

jszwedko · 2024-07-19T13:30:59Z

(the commits will all be squashed during merge)

sebinsunny · 2024-07-20T05:29:29Z

A small request to try to avoid force pushing. It makes viewing the change history more difficult.

@jszwedko Sorry about this. The pipeline was failing due to missing docs, but it has been fixed now. Hope pipeline will be green now.

pront · 2024-07-22T13:35:16Z

A small request to try to avoid force pushing. It makes viewing the change history more difficult.

@jszwedko Sorry about this. The pipeline was failing due to missing docs, but it has been fixed now. Hope pipeline will be green now.

👍 re-running CI checks

estherk15

Small nit: Decodes the raw bytes as an [Influxdb Line Protocol][influxdb] message.

estherk15 · 2024-07-22T13:43:39Z

website/cue/reference/components/sources/base/amqp.cue

@@ -108,6 +108,11 @@ base: components: sources: amqp: configuration: {
 															[gelf]: https://docs.graylog.org/docs/gelf
 															[implementation]: https://github.com/Graylog2/go-gelf/blob/v2/gelf/reader.go
 															"""
+						influxdb: """
+															Decodes the raw bytes as a [Influxdb Line Protocol][influxdb] message.


Suggested change

Decodes the raw bytes as a [Influxdb Line Protocol][influxdb] message.

Decodes the raw bytes as an [Influxdb Line Protocol][influxdb] message.

fixed the typo

sebinsunny · 2024-07-23T22:12:16Z

forced merge to resolve conflict

Signed-off-by: Michael Hoffmann <[email protected]> Co-authored-by: Sebin Sunny <[email protected]> * Changelog Signed-off-by: Sebin Sunny <[email protected]>

Signed-off-by: Sebin Sunny <[email protected]>

sebinsunny · 2024-07-23T23:02:31Z

Fixed the conflicts correctly. Earlier, the old reference for DeserializerConfig::Gelf(_) was added in DeserializerConfig::Bytes, which caused the test failure. I have now corrected the conflict.

github-actions · 2024-07-24T05:21:01Z

Regression Detector Results

Run ID: f0ecfe7a-d839-4b8d-ade3-2b580a904964 Metrics dashboard

Baseline: 787e1ec
Comparison: edbe7ca

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

Significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

perf	experiment	goal	Δ mean %	Δ mean % CI	links
❌	splunk_hec_route_s3	ingress throughput	-5.57	[-5.88, -5.26]

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	links
❌	file_to_blackhole	egress throughput	-7.73	[-14.39, -1.08]

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI
➖	otlp_http_to_blackhole	ingress throughput	+2.90	[+2.72, +3.07]
➖	socket_to_socket_blackhole	ingress throughput	+1.78	[+1.71, +1.85]
➖	datadog_agent_remap_datadog_logs_acks	ingress throughput	+1.02	[+0.80, +1.23]
➖	otlp_grpc_to_blackhole	ingress throughput	+0.93	[+0.81, +1.05]
➖	http_to_s3	ingress throughput	+0.26	[-0.00, +0.53]
➖	datadog_agent_remap_datadog_logs	ingress throughput	+0.25	[+0.05, +0.45]
➖	syslog_splunk_hec_logs	ingress throughput	+0.06	[-0.03, +0.15]
➖	splunk_hec_to_splunk_hec_logs_acks	ingress throughput	+0.00	[-0.09, +0.10]
➖	splunk_hec_to_splunk_hec_logs_noack	ingress throughput	-0.02	[-0.12, +0.08]
➖	http_to_http_noack	ingress throughput	-0.02	[-0.14, +0.10]
➖	splunk_hec_indexer_ack_blackhole	ingress throughput	-0.03	[-0.12, +0.06]
➖	http_to_http_json	ingress throughput	-0.12	[-0.19, -0.06]
➖	datadog_agent_remap_blackhole	ingress throughput	-0.39	[-0.51, -0.28]
➖	syslog_loki	ingress throughput	-0.74	[-0.82, -0.67]
➖	http_elasticsearch	ingress throughput	-0.96	[-1.19, -0.73]
➖	http_to_http_acks	ingress throughput	-1.22	[-2.53, +0.10]
➖	syslog_regex_logs2metric_ddmetrics	ingress throughput	-1.33	[-1.51, -1.16]
➖	syslog_log2metric_tag_cardinality_limit_blackhole	ingress throughput	-1.51	[-1.61, -1.42]
➖	fluent_elasticsearch	ingress throughput	-1.63	[-2.14, -1.13]
➖	http_text_to_http_json	ingress throughput	-2.02	[-2.17, -1.88]
➖	syslog_log2metric_humio_metrics	ingress throughput	-2.05	[-2.18, -1.91]
➖	datadog_agent_remap_blackhole_acks	ingress throughput	-2.94	[-3.05, -2.84]
➖	syslog_humio_logs	ingress throughput	-3.06	[-3.18, -2.94]
➖	syslog_log2metric_splunk_hec_metrics	ingress throughput	-3.06	[-3.17, -2.95]
❌	splunk_hec_route_s3	ingress throughput	-5.57	[-5.88, -5.26]
❌	file_to_blackhole	egress throughput	-7.73	[-14.39, -1.08]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".

github-actions · 2024-07-24T15:18:08Z

Regression Detector Results

Run ID: 3d125c8b-0cd2-4ec8-b31f-7357020832ba Metrics dashboard

Baseline: 56bd0dd
Comparison: 3fbb66c

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	links
❌	file_to_blackhole	egress throughput	-14.31	[-20.24, -8.38]

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI
➖	splunk_hec_route_s3	ingress throughput	+3.06	[+2.75, +3.37]
➖	syslog_splunk_hec_logs	ingress throughput	+3.05	[+2.92, +3.19]
➖	syslog_log2metric_splunk_hec_metrics	ingress throughput	+1.57	[+1.46, +1.67]
➖	syslog_log2metric_tag_cardinality_limit_blackhole	ingress throughput	+1.25	[+1.15, +1.34]
➖	datadog_agent_remap_datadog_logs	ingress throughput	+1.15	[+0.92, +1.38]
➖	http_elasticsearch	ingress throughput	+0.92	[+0.73, +1.12]
➖	http_text_to_http_json	ingress throughput	+0.79	[+0.67, +0.91]
➖	http_to_s3	ingress throughput	+0.79	[+0.53, +1.05]
➖	fluent_elasticsearch	ingress throughput	+0.78	[+0.28, +1.29]
➖	splunk_hec_indexer_ack_blackhole	ingress throughput	+0.06	[-0.02, +0.14]
➖	splunk_hec_to_splunk_hec_logs_noack	ingress throughput	+0.01	[-0.09, +0.11]
➖	splunk_hec_to_splunk_hec_logs_acks	ingress throughput	+0.01	[-0.12, +0.13]
➖	http_to_http_noack	ingress throughput	-0.08	[-0.19, +0.04]
➖	syslog_log2metric_humio_metrics	ingress throughput	-0.08	[-0.26, +0.10]
➖	http_to_http_json	ingress throughput	-0.14	[-0.22, -0.07]
➖	otlp_grpc_to_blackhole	ingress throughput	-0.69	[-0.81, -0.56]
➖	http_to_http_acks	ingress throughput	-0.75	[-2.07, +0.56]
➖	otlp_http_to_blackhole	ingress throughput	-0.81	[-0.97, -0.65]
➖	datadog_agent_remap_blackhole_acks	ingress throughput	-0.90	[-1.00, -0.79]
➖	datadog_agent_remap_blackhole	ingress throughput	-0.97	[-1.07, -0.87]
➖	syslog_loki	ingress throughput	-1.35	[-1.43, -1.28]
➖	datadog_agent_remap_datadog_logs_acks	ingress throughput	-1.38	[-1.55, -1.20]
➖	syslog_humio_logs	ingress throughput	-2.23	[-2.35, -2.11]
➖	socket_to_socket_blackhole	ingress throughput	-2.73	[-2.80, -2.66]
➖	syslog_regex_logs2metric_ddmetrics	ingress throughput	-3.20	[-3.37, -3.03]
❌	file_to_blackhole	egress throughput	-14.31	[-20.24, -8.38]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".

* feat(codec,sources): influxdb line protocol decoder Signed-off-by: Michael Hoffmann <[email protected]> Co-authored-by: Sebin Sunny <[email protected]> * Changelog Signed-off-by: Sebin Sunny <[email protected]> * chore(docs): update doc about InfluxDB Decoder Signed-off-by: Sebin Sunny <[email protected]> * chore(docs): fix typo about InfluxDB Decoder Signed-off-by: Sebin Sunny <[email protected]> --------- Signed-off-by: Sebin Sunny <[email protected]> Co-authored-by: Sebin Sunny <[email protected]>

MichaHoffmann force-pushed the mhoffm-influx-line-protocol-decoder branch from a85c402 to 7d57649 Compare January 17, 2024 20:29

MichaHoffmann changed the title ~~WIP: influxdb line protcol decoder~~ feat(codec,sources): influxdb line protcol decoder Jan 17, 2024

MichaHoffmann marked this pull request as ready for review January 18, 2024 08:06

jszwedko requested a review from pront February 6, 2024 18:09

pront reviewed Feb 12, 2024

View reviewed changes

RedForza mentioned this pull request Jul 2, 2024

influxdb codec #11337

Closed

sebinsunny force-pushed the mhoffm-influx-line-protocol-decoder branch 6 times, most recently from 8d03b7b to 89b6361 Compare July 6, 2024 12:10

pront self-requested a review July 8, 2024 14:24

sebinsunny force-pushed the mhoffm-influx-line-protocol-decoder branch from 89b6361 to 7431061 Compare July 16, 2024 23:34

pront approved these changes Jul 17, 2024

View reviewed changes

jszwedko approved these changes Jul 17, 2024

View reviewed changes

jszwedko enabled auto-merge July 17, 2024 16:12

auto-merge was automatically disabled July 18, 2024 00:05
Head branch was pushed to by a user without write access

sebinsunny force-pushed the mhoffm-influx-line-protocol-decoder branch from 7431061 to b929192 Compare July 18, 2024 00:05

jszwedko enabled auto-merge July 18, 2024 14:10

auto-merge was automatically disabled July 19, 2024 02:32
Head branch was pushed to by a user without write access

sebinsunny force-pushed the mhoffm-influx-line-protocol-decoder branch from b929192 to f607357 Compare July 19, 2024 02:32

jszwedko enabled auto-merge July 19, 2024 13:30

auto-merge was automatically disabled July 20, 2024 05:25
Head branch was pushed to by a user without write access

sebinsunny requested review from a team as code owners July 20, 2024 05:25

github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Jul 20, 2024

estherk15 approved these changes Jul 22, 2024

View reviewed changes

sebinsunny force-pushed the mhoffm-influx-line-protocol-decoder branch from b3ecbde to 29a8e7c Compare July 23, 2024 22:10

jszwedko enabled auto-merge July 23, 2024 22:19

MichaHoffmann and others added 3 commits July 24, 2024 08:48

feat(codec,sources): influxdb line protocol decoder

d17c30f

Signed-off-by: Michael Hoffmann <[email protected]> Co-authored-by: Sebin Sunny <[email protected]> * Changelog Signed-off-by: Sebin Sunny <[email protected]>

chore(docs): update doc about InfluxDB Decoder

2f31c70

Signed-off-by: Sebin Sunny <[email protected]>

chore(docs): fix typo about InfluxDB Decoder

6302a7b

Signed-off-by: Sebin Sunny <[email protected]>

auto-merge was automatically disabled July 23, 2024 22:54
Head branch was pushed to by a user without write access

sebinsunny force-pushed the mhoffm-influx-line-protocol-decoder branch from 29a8e7c to 6302a7b Compare July 23, 2024 22:54

jszwedko enabled auto-merge July 24, 2024 00:29

jszwedko added this pull request to the merge queue Jul 24, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 24, 2024

jszwedko added this pull request to the merge queue Jul 24, 2024

Merged via the queue into vectordotdev:master with commit 3fbb66c Jul 24, 2024
49 checks passed

jorgehermo9 mentioned this pull request Aug 5, 2024

Add parse_influxdb function vectordotdev/vrl#976

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(codec,sources): influxdb line protcol decoder #19637

feat(codec,sources): influxdb line protcol decoder #19637

MichaHoffmann commented Jan 17, 2024 •

edited

Loading

pront left a comment

pront Feb 12, 2024

sebinsunny Jul 6, 2024

pront Jul 12, 2024

sebinsunny Jul 17, 2024

pront commented Feb 12, 2024

RedForza commented Mar 20, 2024 •

edited

Loading

MichaHoffmann commented Jul 2, 2024

sebinsunny commented Jul 6, 2024

jszwedko left a comment

jszwedko commented Jul 19, 2024

jszwedko commented Jul 19, 2024

sebinsunny commented Jul 20, 2024 •

edited

Loading

pront commented Jul 22, 2024

estherk15 left a comment

estherk15 Jul 22, 2024

sebinsunny Jul 23, 2024

sebinsunny commented Jul 23, 2024 •

edited

Loading

sebinsunny commented Jul 23, 2024

github-actions bot commented Jul 24, 2024

Experiments ignored for regressions

Fine details of change detection per experiment

Explanation

github-actions bot commented Jul 24, 2024

Experiments ignored for regressions

Fine details of change detection per experiment

Explanation

	Decodes the raw bytes as a [Influxdb Line Protocol][influxdb] message.
	Decodes the raw bytes as an [Influxdb Line Protocol][influxdb] message.

feat(codec,sources): influxdb line protcol decoder #19637

feat(codec,sources): influxdb line protcol decoder #19637

Conversation

MichaHoffmann commented Jan 17, 2024 • edited Loading

pront left a comment

Choose a reason for hiding this comment

pront Feb 12, 2024

Choose a reason for hiding this comment

sebinsunny Jul 6, 2024

Choose a reason for hiding this comment

pront Jul 12, 2024

Choose a reason for hiding this comment

sebinsunny Jul 17, 2024

Choose a reason for hiding this comment

pront commented Feb 12, 2024

RedForza commented Mar 20, 2024 • edited Loading

MichaHoffmann commented Jul 2, 2024

sebinsunny commented Jul 6, 2024

jszwedko left a comment

Choose a reason for hiding this comment

jszwedko commented Jul 19, 2024

jszwedko commented Jul 19, 2024

sebinsunny commented Jul 20, 2024 • edited Loading

pront commented Jul 22, 2024

estherk15 left a comment

Choose a reason for hiding this comment

estherk15 Jul 22, 2024

Choose a reason for hiding this comment

sebinsunny Jul 23, 2024

Choose a reason for hiding this comment

sebinsunny commented Jul 23, 2024 • edited Loading

sebinsunny commented Jul 23, 2024

github-actions bot commented Jul 24, 2024

Regression Detector Results

Significant changes in experiment optimization goals

Experiments ignored for regressions

Fine details of change detection per experiment

Explanation

github-actions bot commented Jul 24, 2024

Regression Detector Results

No significant changes in experiment optimization goals

Experiments ignored for regressions

Fine details of change detection per experiment

Explanation

MichaHoffmann commented Jan 17, 2024 •

edited

Loading

RedForza commented Mar 20, 2024 •

edited

Loading

sebinsunny commented Jul 20, 2024 •

edited

Loading

sebinsunny commented Jul 23, 2024 •

edited

Loading