Monitoring integration #272

Open
wants to merge 15 commits into
base: develop

@rpanic rpanic commented Feb 4, 2025

This PR implements a monitoring solution for protokit.

The architecture consists of the following components:

  • OpenTelemetry exporters for framework metrics and traces. This implementation can be found in the @proto-kit/api package and is configurable.
  • The metrics from this exporter are exposed via an HTTP endpoint for consumption by a Prometheus service.
  • The traces are pushed to an OTLP collector, which then automatically forwards them to a Tempo instance.
  • Additionally, Promtail is configured to read logs from all containers with a certain label configuration (see below). Promtail then pushes those logs to a Loki container.
  • A Grafana instance is configured with all the above data sources for display.

All docker-related files with default configuration of all services can be found in packages/deployment/docker/monitoring.

Metrics: Prometheus

The SequencerModule OpenTelemetryServer is the base module for all things regarding metrics and traces. It receives a configuration for both functionalities.
For metrics, the configuration should specify the endpoint that this module opens up for Prometheus to scrape.
Additionally, this endpoint has to be reachable from the Prometheus container and be configured as a scrape_config in prometheus.yml.

Note: Currently, Prometheus can only pick up metrics when the sequencer is run inside docker.
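
To illustrate what Prometheus expects when it scrapes the metrics endpoint, here is a minimal sketch that renders samples in the Prometheus text exposition format. It is not the code of OpenTelemetryServer: the `MetricSample` shape, the `renderMetrics` function, and the metric name in the usage example are all hypothetical, and HELP texts are assumed to contain no backslashes or newlines.

```typescript
// Hypothetical sample shape for illustration; not taken from @proto-kit/api.
interface MetricSample {
  name: string;
  help: string;
  type: "counter" | "gauge";
  value: number;
  labels?: { [key: string]: string };
}

// Escape label values as the text format requires: backslash, double quote, newline.
function escapeLabelValue(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n");
}

// Render samples as the plain-text body that a scrape request would receive.
function renderMetrics(samples: MetricSample[]): string {
  const lines: string[] = [];
  for (const sample of samples) {
    lines.push(`# HELP ${sample.name} ${sample.help}`);
    lines.push(`# TYPE ${sample.name} ${sample.type}`);
    const labels = sample.labels || {};
    const pairs = Object.keys(labels).map(
      (key) => `${key}="${escapeLabelValue(labels[key])}"`
    );
    const labelText = pairs.length > 0 ? `{${pairs.join(",")}}` : "";
    lines.push(`${sample.name}${labelText} ${sample.value}`);
  }
  // The exposition format expects a trailing newline after the last sample.
  return lines.join("\n") + "\n";
}

// Usage with a made-up metric name:
const body = renderMetrics([
  {
    name: "blocks_produced_total",
    help: "Blocks produced",
    type: "counter",
    value: 3,
    labels: { module: "sequencer" },
  },
]);
console.log(body);
```

In the real setup the OpenTelemetry exporter produces this body; the sketch only shows the kind of payload that the scrape_config entry in prometheus.yml has to be able to reach.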

Traces: Otel-collector & Tempo

Traces work mostly out of the box: start otel-collector and tempo, then configure the OpenTelemetryServer to use the otel-collector grpc endpoint. By default, this is localhost:4318 (due to the port remapping in docker-compose.yml). When running protokit inside a container, the hostname is otel-collector instead of localhost.

The otel-collector is just a proxy that relays the data and adds stability to the data pipeline (retries, etc.).
Tempo is the actual trace database that Grafana talks to.
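
The hostname rule described above can be sketched as a small helper. This is only an illustration: `resolveCollectorUrl` is a hypothetical function, and the `http://` scheme is an assumption; the host names and the port are the ones given in this description.

```typescript
// Port as stated in this PR description (remapped in docker-compose.yml).
const COLLECTOR_PORT = 4318;

// Hypothetical helper: pick the collector host depending on where protokit runs.
function resolveCollectorUrl(runsInsideDocker: boolean): string {
  // Inside the compose network the service name resolves; on the host, localhost does.
  const host = runsInsideDocker ? "otel-collector" : "localhost";
  // The scheme is an assumption made for this sketch.
  return `http://${host}:${COLLECTOR_PORT}`;
}

console.log(resolveCollectorUrl(false));
console.log(resolveCollectorUrl(true));
```

The resulting value is what one would hand to the traces part of the OpenTelemetryServer configuration; the exact configuration key is not shown here because it is not part of this description.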

Logs: Promtail & Loki

Configure your container with the following docker-compose config to attach labels to the container:

labels:
  logging: "promtail"
  logging_jobname: "grafana"

Note: Currently, Promtail can only pick up logs when the sequencer is run inside docker.
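
As a rough mental model of how the labels are used, the sketch below treats `logging: "promtail"` as the selection criterion and `logging_jobname` as the job name attached to the log stream. Both interpretations are assumptions; the authoritative rules are in the Promtail configuration under packages/deployment/docker/monitoring. The `promtailView` function and its types are hypothetical.

```typescript
type ContainerLabels = { [key: string]: string };

interface PromtailView {
  scraped: boolean;        // would Promtail read this container's logs?
  job: string | undefined; // job name attached to the log stream, if scraped
}

// Hypothetical model of the label check; the real filter is defined in the Promtail config.
function promtailView(labels: ContainerLabels): PromtailView {
  // Assumption: containers are selected by the label logging=promtail.
  const scraped = labels["logging"] === "promtail";
  // Assumption: logging_jobname becomes the job label of the stream.
  return { scraped, job: scraped ? labels["logging_jobname"] : undefined };
}

console.log(promtailView({ logging: "promtail", logging_jobname: "grafana" }));
console.log(promtailView({}));
```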

Things left to do:

  • Modularize the instrumentations for the metrics export
  • Persistence of the monitoring service's data

How to start & test in framework

  • docker compose --profile monitoring up --force-recreate
  • Run the Jest test in packages/stack/test/start.test.ts

@rpanic rpanic changed the base branch from feature/st-prover-3 to develop February 13, 2025 17:38