
Vector V0.35.0 does not seem to be recognizing the all_metrics field in the log_to_metric transform and gives an error upon its inclusion. #19633

Closed
a-patos opened this issue Jan 16, 2024 · 9 comments
Labels
type: bug A code related bug.

Comments

@a-patos

a-patos commented Jan 16, 2024

A note for the community

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request

If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

Vector V0.35.0 does not seem to be recognizing the all_metrics field in the log_to_metric transform and gives an error upon its inclusion.

Configuration

transforms:
  log_2_met:
    inputs:
      - kafka
    type: log_to_metric
    metrics: ignored
    all_metrics: true

Version

vector 0.35.0 (aarch64-unknown-linux-gnu e57c0c0 2024-01-08 14:42:10.103908779)

Debug Output

2024-01-16T23:00:31.023842Z DEBUG vector::app: Internal log rate limit configured. internal_log_rate_secs=10
2024-01-16T23:00:31.023886Z  INFO vector::app: Log level is enabled. level="trace"
2024-01-16T23:00:31.023942Z DEBUG vector::app: messaged="Building runtime." worker_threads=2
2024-01-16T23:00:31.024034Z TRACE mio::poll: registering event source with poller: token=Token(1), interests=READABLE
2024-01-16T23:00:31.024341Z  INFO vector::app: Loading configs. paths=["/etc/vector/vector.yaml"]
2024-01-16T23:00:31.025194Z ERROR vector::cli: Configuration error. error=invalid type: string "ignored", expected a sequence

in `transforms.log_2_met`

Example Data

No response

Additional Context

Hello, I'm trying to implement a Kafka bus that centralizes all my metric collection from several Prometheus instances, and I need to route my metrics to different backends.
Going from a prometheus_remote_write source to a kafka sink works with no problem,

but the reverse doesn't seem to be possible without this all_metrics option, if I understood correctly.

References

#19625

@jszwedko
Member

@a-patos I think you want to leave out the metrics: "ignored" bit.

Also, if the goal is just to send metrics to Kafka, that can be achieved by configuring encoding.codec to either native_json or native. These will send the metrics as either JSON or protobuf (respectively) and can be decoded on the other side via decoding.codec. https://vector.dev/highlights/2022-03-31-native-event-codecs/ describes this in more detail.
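A minimal sketch of what the sender side could look like with the native JSON codec, reusing the source and topic names from this thread (everything else is a placeholder):

sinks:
  metric:
    type: "kafka"
    inputs: ["system"]
    topic: "metrics"
    bootstrap_servers: "kafka:9096"
    encoding:
      # native_json preserves the full metric event structure so the
      # consuming Vector instance can rebuild metric events from it
      codec: "native_json"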

@a-patos
Author

a-patos commented Jan 17, 2024

My goal is to retrieve the metrics stored in Kafka topics and send them to different Prometheus instances.

Sending metrics is working with this configuration:
sources:
  system:
    type: "prometheus_remote_write"
    address: "0.0.0.0:8443"

sinks:
  metric:
    type: "kafka"
    inputs: ["system"]
    topic: "metrics"
    bootstrap_servers: "kafka:9096"
    encoding:
      codec: "json"
    compression: "snappy"
    tls:
      enabled: true
    sasl:
      enabled: true
      mechanism: "SCRAM-SHA-512"
      username: "reader"

but the opposite doesn't work :(

@jszwedko
Member

I'm a little confused since:

  • The prometheus_remote_write source should be emitting metric events and not log events, which would be incompatible with the kafka sink as you have it configured
  • when: '!((.ST.exists() && .Composant.exists()))' is not valid Vector configuration for the filter transform

Is this Vector configuration? Or configuration for another tool?

@a-patos
Author

a-patos commented Jan 17, 2024

@jszwedko:
We need to forget about the filter for now. I corrected my previous post (I copied the wrong file).
For me, the Prometheus remote write source is sending data to Kafka correctly, without any problem.
Example of an event in Kafka:

{
  "name": "node_boot_time_seconds",
  "tags": {
    "agent_hostname": "i-XXXX",
    "cartographie": "QRU",
    "composant": "Vector_Read",
    "host": "i-XXX",
    "instance": "i-XXXX:8090",
    "job": "integrations/node_exporter",
    "st": "TOOLINS"
  },
  "timestamp": "2024-01-17T23:31:12.185Z",
  "kind": "absolute",
  "gauge": {
    "value": 1705533853.0
  }
}

I attached a small schema for your reference:
[image: Vector_metrics]

In my schema, it is the "Vector Reader" part that is causing an error :(

With this Vector Reader config:

sources:
  kafka:
    type: "kafka"
    group_id: "test"
    topics:
      - "metrics"
    bootstrap_servers: "kafka:9096"

sinks:
  to_Prometheus:
    type: prometheus_remote_write
    inputs:
      - kafka
    endpoint: https://XXXX:8443/api/v1/push

I got:
2024-01-17T23:39:10.787812Z ERROR vector::cli: Configuration error. error=Data type mismatch between kafka (Log) and to_Prometheus (Metric)

With this Vector Reader config:

sources:
  kafka:
    type: "kafka"
    group_id: "test"
    topics:
      - "metrics"
    bootstrap_servers: "kafka:9096"

transforms:
  log_2_met:
    inputs:
      - kafka
    type: log_to_metric
    all_metrics: true

sinks:
  to_Prometheus:
    type: prometheus_remote_write
    inputs:
      - log_2_met
    endpoint: https://XXXX:8443/api/v1/push

I got:
2024-01-17T23:50:32.639902Z ERROR vector::cli: Configuration error. error=missing field metrics

With this Vector Reader config:

sources:
  kafka:
    type: "kafka"
    group_id: "test"
    topics:
      - "metrics"
    bootstrap_servers: "kafka:9096"

transforms:
  log_2_met:
    inputs:
      - kafka
    type: log_to_metric
    metrics:
      field: "name"
    all_metrics: true

sinks:
  to_Prometheus:
    type: prometheus_remote_write
    inputs:
      - log_2_met
    endpoint: https://XXXX:8443/api/v1/push

I got:
2024-01-17T23:58:16.647608Z ERROR vector::cli: Configuration error. error=invalid type: map, expected a sequence

@jszwedko
Member

Thanks @a-patos. I'd recommend taking a look at https://vector.dev/highlights/2022-03-31-native-event-codecs/#sending-events-between-vector-instances . It discusses basically this exact use case. The recommendation is to use the native codec for the sender and receiver.

@jszwedko
Member

Which would mean removing the log_to_metric transform and any metric_to_log transform.
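A minimal sketch of the receiving side under that recommendation, reusing the names from the configs above; decoding.codec must match the sender's encoding.codec:

sources:
  kafka:
    type: "kafka"
    group_id: "test"
    topics:
      - "metrics"
    bootstrap_servers: "kafka:9096"
    decoding:
      # must match the encoding.codec on the sending kafka sink
      codec: "native_json"

sinks:
  to_Prometheus:
    type: prometheus_remote_write
    inputs:
      # events are decoded back into metrics, so no log_to_metric is needed
      - kafka
    endpoint: https://XXXX:8443/api/v1/push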

@a-patos
Author

a-patos commented Jan 18, 2024

Oh great! It works perfectly, a big thank you for your help! 🥇

Can I ask one more thing: is it possible for me to implement a number of controls on the labels before sending the metrics?
For example:
if the ST label is missing, we drop it,
if the Job label is equal to "mycustomjob", we drop it,
if the name is not in the list ["up", "cpu", "ram"], we drop it,
if the label "goto_siem" is equal to true, we send the metrics to both the SIEM topic and the Metrics topic.

@jszwedko
Member

if the ST label is missing, we drop it,
if the Job label is equal to "mycustomjob", we drop it,
if the name is not in the list ["up", "cpu", "ram"], we drop it,

A filter transform should work for these.

if the label "goto_siem" is equal to true, we send the metrics to both the SIEM topic and the Metrics topic.

A route transform would work for this.

I'll close this out since I think the issue is resolved, but feel free to open GitHub Discussions if you have additional usage questions!
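A rough sketch of both transforms, assuming the tag names from the event sample above (st, job, goto_siem) and that tag values are strings; treat it as illustrative rather than a drop-in config:

transforms:
  keep_wanted_metrics:
    type: filter
    inputs:
      - kafka
    # VRL condition: keep metrics that have an st tag, are not from
    # "mycustomjob", and whose name is in the allow list
    condition: |
      exists(.tags.st) &&
      .tags.job != "mycustomjob" &&
      includes(["up", "cpu", "ram"], .name)

  by_destination:
    type: route
    inputs:
      - keep_wanted_metrics
    route:
      # metric tag values are strings, so compare against "true"
      siem: .tags.goto_siem == "true"

The Metrics-topic sink could then take both by_destination.siem and by_destination._unmatched as inputs, while the SIEM-topic sink takes only by_destination.siem, so goto_siem metrics land in both topics.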

@genadipost
Contributor

@a-patos Could you share a working configuration of both Vector Agents?
I am trying to do something similar:

Prometheus <-- Vector Agent -> GCP PubSub <-- Vector Aggregator (Prometheus exporter sink) <-- Prometheus

  1. Vector agent scrapes Prometheus server
  2. Vector agent publishes the metrics to GCP PubSub
  3. Vector Aggregator reads metrics from pubsub
  4. Vector Aggregator sinks data via Prometheus exporter
  5. Prometheus server scrapes Vector Aggregator
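A rough sketch of that pipeline using the same native codec approach discussed above; the project, topic, subscription, endpoints, and addresses are all placeholders:

# Agent
sources:
  prom:
    type: prometheus_scrape
    endpoints:
      - "http://prometheus:9090/federate"  # placeholder

sinks:
  pubsub_out:
    type: gcp_pubsub
    inputs: ["prom"]
    project: "my-project"  # placeholder
    topic: "metrics"       # placeholder
    encoding:
      codec: "native_json"

# Aggregator
sources:
  pubsub_in:
    type: gcp_pubsub
    project: "my-project"        # placeholder
    subscription: "metrics-sub"  # placeholder
    decoding:
      # must match the agent's encoding.codec
      codec: "native_json"

sinks:
  prom_exporter:
    type: prometheus_exporter
    inputs: ["pubsub_in"]
    address: "0.0.0.0:9598"  # scraped by the downstream Prometheus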
