cardano-tracer: update docs
mgmeier committed Sep 26, 2024
1 parent b9120e3 commit f346ad6
Showing 3 changed files with 91 additions and 40 deletions.
9 changes: 8 additions & 1 deletion cardano-tracer/CHANGELOG.md
@@ -1,14 +1,17 @@
# ChangeLog

## 0.3 (September 20, 2024)
## 0.3 (September 26, 2024)

* Abandon `snap` webserver in favour of `wai`/`warp` for Prometheus and EKG Monitoring.
* Add dynamic routing to EKG stores of all connected nodes.
* Derive URL compliant routes from connected node names (instead of plain node names).
* Remove the requirement of two distinct ports for the EKG backend (changing `hasEKG` config type).
* Improved OpenMetrics compliance of Prometheus exposition; also addresses [issue#5140][i5140].
* Prometheus help annotations can be provided via the new optional config value `metricsHelp`.
* For optional RTView component only: Disable SSL/https connections. Force `snap-server`
dependency to build with `-flag -openssl`.
* Add JSON responses when listing connected nodes for both Prometheus and EKG Monitoring.
* Fix: actually send `forHuman` rendering output to journald when specified.
* Add consistency check for redundant port values in the config.

## 0.2.4 (August 13, 2024)
@@ -48,3 +51,7 @@
## 0.1.0

Initial version.



[i5140]: https://github.com/IntersectMBO/cardano-node/issues/5140
82 changes: 58 additions & 24 deletions cardano-tracer/docs/cardano-tracer.md
@@ -4,21 +4,24 @@

# Contents

1. [Introduction](#Introduction)
1. [Motivation](#Motivation)
3. [Overview](#Overview)
2. [Build and run](#Build-and-run)
3. [Configuration](#Configuration)
1. [Distributed Scenario](#Distributed-scenario)
2. [Local Scenario](#Local-scenario)
3. [Network Magic](#Network-magic)
4. [Requests](#Requests)
5. [Logging](#Logging)
6. [Logs Rotation](#Logs-rotation)
7. [Prometheus](#Prometheus)
8. [EKG Monitoring](#EKG-monitoring)
9. [Verbosity](#Verbosity)
10. [RTView](#RTView)
- [Cardano Tracer](#cardano-tracer)
- [Contents](#contents)
- [Introduction](#introduction)
- [Motivation](#motivation)
- [Overview](#overview)
- [Build and run](#build-and-run)
- [Configuration](#configuration)
- [Distributed Scenario](#distributed-scenario)
- [Important](#important)
- [Local Scenario](#local-scenario)
- [Network Magic](#network-magic)
- [Requests](#requests)
- [Logging](#logging)
- [Logs Rotation](#logs-rotation)
- [Prometheus](#prometheus)
- [EKG Monitoring](#ekg-monitoring)
- [Verbosity](#verbosity)
- [RTView](#rtview)

# Introduction

@@ -390,20 +393,51 @@ $ curl --silent -H "Accept: application/json" '127.0.0.1:3200' | jq '.'
}
```

The Prometheus output is a map from Prometheus metric to value:
Prometheus uses the text-based exposition format, complete with `# TYPE` and `# HELP` annotations. The latter must be provided via the `metricsHelp` config value (see below).

The output should be [OpenMetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#text-format) compliant. Example snippet:

```
$ curl '127.0.0.1:3200/12700130004'
blockNum_int 35
rts_gc_init_cpu_ms 5
rts_gc_par_tot_bytes_copied 0
served_block_counter 31
submissions_accepted_counter 2771
density_real 5.7692307692307696e-2
blocksForged_int 6
# TYPE Mem_resident_int gauge
# HELP Mem_resident_int Kernel-reported RSS (resident set size)
Mem_resident_int 103792640
# TYPE rts_gc_max_bytes_used gauge
rts_gc_max_bytes_used 5811512
# TYPE rts_gc_gc_cpu_ms counter
rts_gc_gc_cpu_ms 50
# TYPE RTS_gcMajorNum_int gauge
# HELP RTS_gcMajorNum_int Major GCs
RTS_gcMajorNum_int 4
# TYPE rts_gc_par_avg_bytes_copied gauge
rts_gc_par_avg_bytes_copied 0
# TYPE rts_gc_num_bytes_usage_samples counter
rts_gc_num_bytes_usage_samples 4
# TYPE remainingKESPeriods_int gauge
remainingKESPeriods_int 62
# TYPE rts_gc_bytes_copied counter
rts_gc_bytes_copied 17114384
# TYPE nodeCannotForge_int gauge
# HELP nodeCannotForge_int How many times was this node unable to forge [a block]?
# EOF
```
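
As a rough illustration of how the annotations can be consumed, the following sketch lists each metric together with its declared type. It assumes only standard `awk`. The `sample_exposition` helper is hypothetical and simply replays a few lines from the snippet above, so the example runs without a live tracer. With a running `cardano-tracer`, the output of `curl --silent '127.0.0.1:3200/12700130004'` could be piped in instead.

```shell
# Hypothetical helper: replays a few lines of the example exposition above.
sample_exposition() {
  cat <<'SAMPLE'
# TYPE Mem_resident_int gauge
# HELP Mem_resident_int Kernel-reported RSS (resident set size)
Mem_resident_int 103792640
# TYPE rts_gc_gc_cpu_ms counter
rts_gc_gc_cpu_ms 50
# EOF
SAMPLE
}

# Print "<metric name> <type>" for every "# TYPE" annotation line.
metric_types=$(sample_exposition | awk '$1 == "#" && $2 == "TYPE" { print $3, $4 }')
printf '%s\n' "$metric_types"
```

For the sample lines this prints `Mem_resident_int gauge` followed by `rts_gc_gc_cpu_ms counter`.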

Metric help annotations can be passed to the service in the config file, either inline as a key-value map from metric name to help text, or as the path to a separate JSON file containing such a map.
The system's internal metric names must be used as keys (cf. [metrics documentation](https://github.com/input-output-hk/cardano-node-wiki/blob/main/docs/new-tracing/tracers_doc_generated.md#metrics)).
```
"metricsHelp": "path/to/key-value-map.json"
```
or
```
"metricsHelp": {
"Mem.resident": "Kernel-reported RSS (resident set size)",
"RTS.gcMajorNum": "Major GCs",
"nodeCannotForge": "How many times was this node unable to forge [a block]?"
}
```
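
To decide which entries are still worth adding to `metricsHelp`, it may help to see which exposed metrics carry a `# TYPE` line but no `# HELP` line. The sketch below does this with standard `awk`. The `sample_exposition` helper is hypothetical and replays lines from the example exposition, and with a live tracer the `curl` output for a node route could be piped in instead. Note that the names printed are exposition names (such as `Mem_resident_int`), whereas `metricsHelp` keys are the internal metric names (such as `Mem.resident`), so the result is a hint rather than a ready-made key list.

```shell
# Hypothetical helper: replays a few lines of the example exposition.
sample_exposition() {
  cat <<'SAMPLE'
# TYPE Mem_resident_int gauge
# HELP Mem_resident_int Kernel-reported RSS (resident set size)
Mem_resident_int 103792640
# TYPE rts_gc_max_bytes_used gauge
rts_gc_max_bytes_used 5811512
# TYPE rts_gc_gc_cpu_ms counter
rts_gc_gc_cpu_ms 50
# EOF
SAMPLE
}

# Remember metrics in the order their "# TYPE" lines appear, note which ones
# have a "# HELP" line, then print those without help text.
missing_help=$(sample_exposition | awk '
  $1 == "#" && $2 == "TYPE" { order[++n] = $3 }
  $1 == "#" && $2 == "HELP" { helped[$3] = 1 }
  END { for (i = 1; i <= n; i++) if (!(order[i] in helped)) print order[i] }
')
printf '%s\n' "$missing_help"
```

For the sample lines this prints `rts_gc_max_bytes_used` and `rts_gc_gc_cpu_ms`, the two metrics without a help annotation.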



## EKG Monitoring

At the top-level route `/`, EKG gives a list of connected nodes.
40 changes: 25 additions & 15 deletions cardano-tracer/src/Cardano/Tracer/Handlers/Metrics/Prometheus.hs
@@ -35,24 +35,34 @@ import qualified System.Metrics as EKG
import System.Metrics (Sample, Value (..), sampleAll)
import System.Time.Extra (sleep)

-- | Runs simple HTTP server that listens host and port and returns
-- the list of currently connected nodes in such a format:
-- | Runs a simple HTTP server that listens on @endpoint@.
--
-- * relay-1
-- * relay-2
-- * core-1
-- At the root, it lists the connected nodes, either as HTML or JSON, depending
-- on the request's 'Accept: ' header.
--
-- where 'relay-1', 'relay-2' and 'core-1' are nodes' names.
-- Routing is dynamic, depending on the connected nodes. A valid URL is derived
-- from the nodeName configured for the connecting node. E.g. a node name
-- of `127.0.0.1:30004` will result in the route `/12700130004` which
-- renders that node's Prometheus / OpenMetrics text exposition:
--
-- Each of list items is a href. By clicking on it, the user will be
-- redirected to the page with the list of metrics received from that node,
-- in such a format:
--
-- rts_gc_par_tot_bytes_copied 0
-- rts_gc_num_gcs 17
-- rts_gc_max_bytes_slop 15888
-- rts_gc_bytes_copied 165952
-- ekg_server_timestamp_ms 1639569439623
-- # TYPE Mem_resident_int gauge
-- # HELP Mem_resident_int Kernel-reported RSS (resident set size)
-- Mem_resident_int 103792640
-- # TYPE rts_gc_max_bytes_used gauge
-- rts_gc_max_bytes_used 5811512
-- # TYPE rts_gc_gc_cpu_ms counter
-- rts_gc_gc_cpu_ms 50
-- # TYPE RTS_gcMajorNum_int gauge
-- # HELP RTS_gcMajorNum_int Major GCs
-- RTS_gcMajorNum_int 4
-- # TYPE rts_gc_num_bytes_usage_samples counter
-- rts_gc_num_bytes_usage_samples 4
-- # TYPE remainingKESPeriods_int gauge
-- remainingKESPeriods_int 62
-- # TYPE rts_gc_bytes_copied counter
-- rts_gc_bytes_copied 17114384
-- # TYPE nodeCannotForge_int gauge
-- # HELP nodeCannotForge_int How many times was this node unable to forge [a block]?
--
runPrometheusServer
:: TracerEnv
