Merge branch 'opentelemtry' into 'main'
Prepare OTEL support for imalive

See merge request oss/imalive!10
idrissneumann committed Feb 28, 2024
2 parents d65e176 + b345524 commit e68cf29
Showing 21 changed files with 287 additions and 49 deletions.
25 changes: 25 additions & 0 deletions .docker/otel-collector-config.yaml
@@ -0,0 +1,25 @@
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  debug:
  prometheus:
    endpoint: "0.0.0.0:8889"
    const_labels:
      otel: otel
  otlp:
    endpoint: "jaeger:4317"
    tls:
      insecure: true

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      exporters: [otlp]
12 changes: 12 additions & 0 deletions .docker/prometheus.yml
@@ -0,0 +1,12 @@
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: 'imalive'
    static_configs:
      - targets: ['imalive-api:8080']
    metrics_path: '/v1/prom'
    scheme: http
  - job_name: 'opentelemetry'
    static_configs:
      - targets: ['otel-collector:8889']
1 change: 1 addition & 0 deletions .env.example
@@ -14,3 +14,4 @@ SLACK_TRIGGER=
DISCORD_TRIGGER=
WARNING_THRESHOLD=80
ERROR_THRESHOLD=90
OTEL_COLLECTOR_ENDPOINT="otel-collector:4317"
4 changes: 2 additions & 2 deletions .gitlab-ci.yml
@@ -1,8 +1,8 @@
stages:
  - publish
  - deliver
  - deploy
  - test
  - arm

mirror:
  stage: publish
@@ -31,7 +31,7 @@ api_x86:
    - imalive

api_arm:
  stage: deliver
  stage: arm
  script:
    - setsid ./ci/docker-deliver.sh "arm" "imalive-api"
  only:
5 changes: 5 additions & 0 deletions CONTRIBUTING.md
@@ -11,3 +11,8 @@ You can also run unit tests by running this command:
```shell
docker-compose -f docker-compose-local.yml up --build --abort-on-container-exit imalive-tests
```

Then you can try:
* the Jaeger UI for the traces: http://localhost:16686
* the opentelemetry metrics exporter endpoint: http://localhost:8889/metrics
* prometheus: http://localhost:9090
3 changes: 2 additions & 1 deletion Dockerfile
@@ -13,7 +13,8 @@ WORKDIR /app

COPY requirements.txt /app/requirements.txt

RUN apk add --no-cache --virtual .build-deps gcc musl-dev linux-headers && \
RUN apk add --no-cache libstdc++ && \
apk add --no-cache --virtual .build-deps gcc g++ musl-dev linux-headers && \
pip install --upgrade pip && \
pip install -r requirements.txt && \
apk del .build-deps
72 changes: 69 additions & 3 deletions README.md
@@ -16,8 +16,6 @@ Just a dummy healthcheck api for your nodes (support x86 and armhf for raspberry

It provides an HTTP/RESTful endpoint that you can use as a healthcheck rule for your load balancer, and it also publishes a heartbit to stdout (useful if you collect it in a log/alerting management system such as elasticstack).

BTW we're also providing packages and images for elasticstack (Kibana, Elasticsearch, Filebeat) [here](https://gitlab.comwork.io/oss/elasticstack).

![kibana](./img/kibana.png)

## Table of content
@@ -121,7 +119,7 @@ $ curl localhost:8080/v1/metrics

### Metrics for prometheus

If you want to use `imalive` as a metrics exporter, this is the way:
If you want to use `imalive` as a Prometheus metrics exporter, this is the way:

```shell
$ curl localhost:8080/v1/prom
@@ -146,6 +144,20 @@ disk_total 56.096561431884766
# HELP imalive_imalive_http_reques
```

Here's an example of Prometheus config for scraping the data:

```yaml
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: 'imalive'
    static_configs:
      - targets: ['imalive-api:8080']
    metrics_path: '/v1/prom'
    scheme: http
```
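
To sanity-check what Prometheus will scrape from `/v1/prom`, the text output can be parsed by hand. The following is only a minimal sketch: `parse_prom_text` is a hypothetical helper (not part of imalive), it assumes label-free `name value` samples, and the `# HELP` text in the sample is made up for illustration.

```python
def parse_prom_text(text):
    """Parse label-free `name value` samples from Prometheus text output.

    Assumption: comment lines (# HELP / # TYPE) and labeled samples
    (anything containing '{') are skipped rather than parsed.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            # Not a numeric sample; ignore it in this sketch.
            continue
    return samples


# Sample shaped like the curl output above; the HELP text is hypothetical.
sample_output = """\
# HELP disk_total total disk size (hypothetical help text)
# TYPE disk_total gauge
disk_total 56.096561431884766
"""

print(parse_prom_text(sample_output))
```

For anything beyond a quick check, a real Prometheus client or `promtool` is likely a better fit than a hand-rolled parser.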
## Heartbit
You can change the wait time between two heartbits with the `WAIT_TIME` environment variable (in seconds).
@@ -162,6 +174,60 @@ You can change `anode` by your node name with the `IMALIVE_NODE_NAME` environmen

You can also log JSON-only output by setting the `LOG_FORMAT` environment variable to "json".

## OpenTelemetry

You can also configure an OTEL gRPC endpoint using the `OTEL_COLLECTOR_ENDPOINT` environment variable.
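
The value used in `.env.example` is a plain `host:port` pair (`otel-collector:4317`). How imalive itself consumes the variable is not shown in this diff, so the sketch below is only an illustration: `parse_otlp_endpoint` is a hypothetical helper, and tolerating quotes or an `http://` prefix is an assumption rather than documented behaviour.

```python
import os


def parse_otlp_endpoint(value, default_port=4317):
    """Split an endpoint such as 'otel-collector:4317' into (host, port).

    Assumptions: surrounding quotes, an http(s):// prefix and a trailing
    slash are tolerated; a missing port falls back to default_port.
    """
    value = value.strip().strip('"').rstrip("/")
    for prefix in ("http://", "https://"):
        if value.startswith(prefix):
            value = value[len(prefix):]
    host, sep, port = value.rpartition(":")
    if not sep:
        return value, default_port
    return host, int(port)


# Falls back to the value from .env.example when the variable is unset.
endpoint = os.getenv("OTEL_COLLECTOR_ENDPOINT", "otel-collector:4317")
print(parse_otlp_endpoint(endpoint))
```

Port 4317 matches the OTLP gRPC receiver port published by the collector in `docker-compose-local.yml`.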

Imalive sends metrics and traces over OTLP/gRPC. You can then see your traces in Jaeger like this:

![jaegger](./img/jaegger.png)

And your metrics on Prometheus like this:

![prometheus](./img/prometheus.png)

Here's an example of Prometheus configuration for scraping the OpenTelemetry collector metrics:

```yaml
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: 'opentelemetry'
    static_configs:
      - targets: ['otel-collector:8889']
```

And here's the OpenTelemetry collector configuration for receiving the traces and metrics from imalive:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  debug:
  prometheus:
    endpoint: "0.0.0.0:8889"
    const_labels:
      otel: otel
  otlp:
    endpoint: "jaeger:4317"
    tls:
      insecure: true
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      exporters: [otlp]
```

## Development / contributions

Go see this [documentation](./CONTRIBUTING.md)
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
3.5.2
3.6.0
45 changes: 44 additions & 1 deletion docker-compose-local.yml
@@ -1,9 +1,10 @@
version: "3.3"
version: "3.9"

services:
  imalive-api:
    restart: always
    image: imalive-api:latest
    container_name: imalive-api
    build:
      context: .
      dockerfile: ./Dockerfile
@@ -12,8 +13,50 @@ services:
      - .env
    ports:
      - "8080:8080"
    networks:
      - imalive-net
  otel-collector:
    restart: always
    image: otel/opentelemetry-collector:latest
    container_name: otel-collector
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - .docker/otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "1888:1888" # pprof extension
      - "8888:8888" # Prometheus metrics exposed by the collector
      - "8889:8889" # Prometheus exporter metrics
      - "13133:13133" # health_check extension
      - "4317:4317" # OTLP gRPC receiver
      - "4318:4318" # OTLP HTTP receiver
      - "55679:55679" # zpages extension
    depends_on:
      - jaeger
    networks:
      - imalive-net
  jaeger:
    restart: always
    image: jaegertracing/all-in-one:latest
    container_name: jaeger
    ports:
      - "16686:16686"
    networks:
      - imalive-net
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    ports:
      - 9090:9090
    volumes:
      - .docker/prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - imalive-net
    restart: always
  imalive-tests:
    build:
      context: .
      dockerfile: ./Dockerfile
      target: unit_tests

networks:
  imalive-net:
Binary file added img/jaegger.png
Binary file added img/prometheus.png
6 changes: 5 additions & 1 deletion requirements.txt
@@ -3,4 +3,8 @@ requests
uvicorn[standard]
fastapi-utils
psutil
prometheus-fastapi-instrumentator
prometheus-fastapi-instrumentator
opentelemetry-api
opentelemetry-sdk
opentelemetry-instrumentation-fastapi
opentelemetry-exporter-otlp
9 changes: 7 additions & 2 deletions src/main.py
@@ -1,13 +1,13 @@
import asyncio

from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

from restful_ressources import import_ressources

from utils.common import is_not_empty
from utils.manifests import get_manifest_as_dict
from utils.heartbit import heartbit
from utils.otel import init_otel_tracer, init_otel_metrics

version = "unkown"
manifest = get_manifest_as_dict()
@@ -24,10 +24,15 @@

instrumentator = Instrumentator()

init_otel_tracer()
init_otel_metrics()

heartbit()

instrumentator.instrument(app, metric_namespace='imalive', metric_subsystem='imalive')
instrumentator.expose(app, endpoint='/v1/prom')
instrumentator.expose(app, endpoint='/prom')

FastAPIInstrumentor.instrument_app(app)

import_ressources(app)
10 changes: 7 additions & 3 deletions src/routes/api_health.py
@@ -1,12 +1,16 @@
from utils.health import health
from fastapi import APIRouter

from utils.otel import get_otel_tracer
from utils.health import health

router = APIRouter()

@router.get("")
def get_health():
    return health()
    with get_otel_tracer().start_as_current_span("imalive-health-get-route"):
        return health()

@router.post("")
def post_health():
    return health()
    with get_otel_tracer().start_as_current_span("imalive-health-post-route"):
        return health()
13 changes: 8 additions & 5 deletions src/routes/api_manifest.py
@@ -1,14 +1,17 @@
from fastapi import APIRouter
from fastapi.responses import JSONResponse

from utils.otel import get_otel_tracer
from utils.manifests import get_manifest_as_dict

router = APIRouter()

@router.get("")
def get_manifest():
    manifest = get_manifest_as_dict()
    with get_otel_tracer().start_as_current_span("imalive-manifest-route"):
        manifest = get_manifest_as_dict()

    if manifest['status'] == 'error':
        return JSONResponse(content=manifest, status_code=500)
    else:
        return manifest
        if manifest['status'] == 'error':
            return JSONResponse(content=manifest, status_code=500)
        else:
            return manifest
7 changes: 5 additions & 2 deletions src/routes/api_metrics.py
@@ -1,8 +1,11 @@
from utils.metrics import all_metrics
from fastapi import APIRouter

from utils.otel import get_otel_tracer
from utils.metrics import all_metrics

router = APIRouter()

@router.get("")
def get_metrics():
    return all_metrics()
    with get_otel_tracer().start_as_current_span("imalive-metrics-route"):
        return all_metrics()
7 changes: 5 additions & 2 deletions src/routes/api_root.py
@@ -1,8 +1,11 @@
from utils.health import health
from fastapi import APIRouter

from utils.otel import get_otel_tracer
from utils.health import health

router = APIRouter()

@router.get("/")
def get_root():
    return health()
    with get_otel_tracer().start_as_current_span("imalive-root-route"):
        return health()
37 changes: 37 additions & 0 deletions src/utils/gauge.py
@@ -0,0 +1,37 @@
import re

from prometheus_client import Gauge
from opentelemetry.metrics import Observation

from utils.otel import get_otel_meter

_numeric_value_pattern = r"-?\d+\.\d+"
_current_gauge_values = {}

def create_gauge(name, description):
    _current_gauge_values[name] = {
        'val': 0.0,
        'desc': description
    }

    def observable_gauge_func(_):
        yield Observation(_current_gauge_values[name]['val'])

    get_otel_meter().create_observable_gauge(
        name = name,
        description = description,
        callbacks=[observable_gauge_func]
    )

    return Gauge(
        name,
        description
    )

def set_gauge(gauge, value):
    match = re.search(_numeric_value_pattern, "{}".format(value))

    if match:
        val = float(match.group())
        gauge.set(val)
        _current_gauge_values[gauge._name]['val'] = val
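
One detail of `set_gauge` worth noting: `_numeric_value_pattern` requires a decimal point, so a value that formats as an integer (for example `80`) produces no match and the gauge is left unchanged. Whether integer values ever reach `set_gauge` is not visible in this diff. The sketch below compares the pattern from `gauge.py` with a relaxed variant; `relaxed_pattern` and `extract_number` are hypothetical names introduced only for this illustration.

```python
import re

# Pattern copied from gauge.py: the fractional part is mandatory.
strict_pattern = r"-?\d+\.\d+"
# Hypothetical alternative: the fractional part is optional.
relaxed_pattern = r"-?\d+(?:\.\d+)?"


def extract_number(pattern, value):
    """Return the first numeric match in the formatted value as a float, or None."""
    match = re.search(pattern, "{}".format(value))
    return float(match.group()) if match else None


# Compare both patterns on a float, an int, and two string inputs.
for value in (56.09, 80, "12.5%", "-3"):
    print(repr(value),
          extract_number(strict_pattern, value),
          extract_number(relaxed_pattern, value))
```

If integers should update the gauges too, a relaxed pattern along these lines is one possible option; keeping the strict pattern is fine when every metric is known to be a float.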