docs(adr-009): cover metrics and more info on tracing
Wondertan committed Aug 10, 2022
1 parent 8c335dd commit de5f85d
Showing 1 changed file with 190 additions and 15 deletions: docs/adr/adr-009-telemetry.md
* 2022-07-15: Stylistic improvements from @rootulp and @bidon15
* 2022-07-29: Formatting fixes
* 2022-07-29: Clarify and add more info regarding Uptrace
* 2022-08-09: Cover metrics and add more info about tracing

## Authors

This ADR is intended to outline the decisions on how to proceed with:
* Integration plan according to the priorities and the requirements
* What observability tools/dependencies to integrate
* Integration design into Celestia-Node for each observability option
* A reference document explaining "whats" and "hows" during integration in some part of the codebase
* A primer for any developer in celestia-node to quickly onboard into Telemetry

## Decisions

### Plan

#### First Priority

The first priority lies in the "ShrEx" stack analysis results for the Celestia project. The outcome will tell us whether
our current [Full Node reconstruction](https://github.com/celestiaorg/celestia-node/issues/602) qualities conform to
the main network requirements, subsequently affecting the development roadmap of celestia-node before the main
network launch. Based on the former, the plan is focused on unblocking the reconstruction
so the decision for the celestia-node team is to cover with traces only the _necessary_
code as the initial response to the ADR, leaving the rest to be integrated in the background by the devs in the team
once they are free, as well as for efficient bootstrapping into the code for new devs.

___Update:___ The `ShrEx` analysis is neither the blocker nor the highest priority at the moment of writing.

#### Second Priority

The next biggest priority - the incentivized testnet - can be largely covered with traces as well. All participants will submit
traces from their nodes to any backend endpoint we provide during the whole network lifespan. Later on, we will be
able to verify the data of each participant by querying historical traces. This is a feature that some backend solutions
provide, which we can also use to extract valuable insight into how the network performs in a macro view.

Even though the incentivized testnet goal can be largely covered by traces in terms of observability, metrics for this
priority are desirable, as metrics provide:

* Easily queryable time-series data
* Extensive tooling to build visualizations for that data

Both can facilitate the implementation of a global network observability dashboard and participant validation for the goal.

#### Third Priority

Enabling total observability of the node through metrics and traces.

### Tooling/Dependencies

#### Telemetry Golang API/Shim

The decision is to use [opentelemetry-go](https://github.com/open-telemetry/opentelemetry-go) for both Metrics and Tracing:

* A minimal and Go-savvy API/shim which gathers years of experience from OpenCensus/OpenMetrics and the [CNCF](https://www.cncf.io/)
* Backends/exporters for all the existing time-series monitoring DBs, e.g. Prometheus and InfluxDB, as well as tracing backends
like Jaeger, Uptrace, etc.
* <https://github.com/uptrace/opentelemetry-go-extra/tree/main/otelzap> integration with the logging engine we use - Zap
* Provides first-class support/implementation for/of the generic [OTLP](https://opentelemetry.io/docs/reference/specification/protocol/) (OpenTelemetry Protocol)
  * A generic format for any telemetry data
  * Allows integrating otel-go once and using it with any known backend, either
    * Supporting OTLP natively
    * Or through the [OTel Collector](https://opentelemetry.io/docs/collector/)
* Allows exporting telemetry to one endpoint only ([opentelemetry-go#3055](https://github.com/open-telemetry/opentelemetry-go/issues/3055))

The discussion over this decision can be found in [celestia-node#663](https://github.com/celestiaorg/celestia-node/issues/663)
and props to @liamsi for initial kickoff and a deep dive into OpenTelemetry.
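
To make the API choice more concrete, here is a minimal, illustrative sketch (not taken from the codebase; the package, span, and attribute names are assumptions) of what instrumenting code with otel-go looks like:

```go
package sample

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

// tracer is a package-level tracer, named after the package it instruments.
var tracer = otel.Tracer("share/availability")

// SampleSquare wraps a hypothetical operation in a span and attaches
// contextual data to it as attributes.
func SampleSquare(ctx context.Context, width int) error {
	ctx, span := tracer.Start(ctx, "sample-square")
	defer span.End()

	span.SetAttributes(attribute.Int("width", width))

	// the actual sampling logic would go here, propagating ctx further
	_ = ctx
	return nil
}
```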

#### Tracing Backends

For tracing, there are 4 modern OSS tools that are recommended. All of them have bidirectional support with OpenTelemetry:

* [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
  * Tracing data proxy from an OTLP client to __any__ backend
  * Supports a [long list of backends](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter)
* [Uptrace](https://get.uptrace.dev/guide/#what-is-uptrace)
  * The most recent (~1 year)
  * Tied to ClickHouse DB
  * The most lightweight
  * Supports OTLP
  * OSS and can be deployed locally
  * Provides a hosted solution
* [Jaeger](https://www.jaegertracing.io/)
  * The most mature
  * Started by Uber, now supported by CNCF
  * Supports multiple storages (ScyllaDB, InfluxDB, Amazon DynamoDB)
  * Supports OTLP
* [Grafana Tempo](https://grafana.com/oss/tempo/)
  * Deep integration with Grafana/Prometheus
  * Relatively new (~2 years)
Each of these backends can be used independently, depending on the use case.

> I am personally planning to set up the lightweight Uptrace for the local light node. Just to play around and observe
> things
>
> __UPDATE__: It turns out it is not that straightforward and adds additional overhead. See #Other-Findings

There is no strict decision on which of these backends to use and where. People taking ownership of any of the listed vectors
are free to use any recommended solution or any unlisted one. The only backend requirement is support of OTLP, natively
or through the OTel Collector. The latter, though, introduces an additional infrastructure piece which adds unnecessary complexity
for node runners, and is thus not recommended.

#### Metrics Backend

We only consider OSS backends.

* [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
  * Push based
  * Metrics data proxy from an OTLP client to __any__ backend
  * Supports a [long list of backends](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter)
* [Netdata](https://github.com/netdata/netdata)
  * Push based
  * Widely supported option in the Linux community
  * Written in C
  * A decade of experience, optimized for bare metal
  * Perfect for local monitoring setups
  * Unfortunately, does not support OTLP
* [Uptrace](https://get.uptrace.dev/guide/#what-is-uptrace)
  * The most recent (~1 year)
  * Tied to ClickHouse DB
  * The most lightweight
  * Supports OTLP
  * OSS and can be deployed locally
  * Provides a hosted solution
* Prometheus + Grafana
  * Pull based
  * No native OTLP support
    * Though there is a [spec](https://github.com/open-telemetry/wg-prometheus/blob/main/specification.md) to fix this
    * Still, it can be used with the OTel Collector

Similarly, there is no strictness around the backend solution, with OTLP support being the only requirement - natively or through an OTLP exporter.

## Design

Here is the result of the above code sending traces visualized on Jaeger UI

#### Backends connection

Example for Jaeger

```go
// Create the Jaeger exporter
exp, err := jaeger.New(jaeger.WithCollectorEndpoint(jaeger.WithEndpoint(url)))
if err != nil {
	return nil, err
}
// then the tracer provider
tp := tracesdk.NewTracerProvider(
	// Always be sure to batch in production.
	// ...
)

// ...
tp.Shutdown(ctx)
```

We decided to use the OTLP backend, and it is almost identical in terms of setup.
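
For reference, a minimal sketch of the analogous OTLP setup (not taken from the codebase; the endpoint and helper name are illustrative, and `WithInsecure` is only for local experiments without TLS):

```go
import (
	"context"

	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	tracesdk "go.opentelemetry.io/otel/sdk/trace"
)

// newOTLPTracerProvider mirrors the Jaeger example above, but exports spans
// over OTLP/HTTP to the given endpoint.
func newOTLPTracerProvider(ctx context.Context, endpoint string) (*tracesdk.TracerProvider, error) {
	exp, err := otlptracehttp.New(ctx,
		otlptracehttp.WithEndpoint(endpoint),
		otlptracehttp.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}
	// The provider itself is constructed exactly as in the Jaeger example.
	return tracesdk.NewTracerProvider(
		tracesdk.WithBatcher(exp),
	), nil
}
```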

### Metrics Design

Metrics allow collecting time-series data from different measurable points in the application. Every measurable can
be covered via the 6 instruments OpenTelemetry provides (a short usage sketch follows the list below):

* ___Counter___ - synchronous instrument that measures additive non-decreasing values.
* ___UpDownCounter___ - synchronous instrument which measures additive values that increase or decrease with time.
* ___Histogram___ - synchronous instrument that produces a histogram from recorded values.
* ___CounterObserver___ - asynchronous instrument that measures additive non-decreasing values.
* ___UpDownCounterObserver___ - asynchronous instrument that measures additive values that can increase or decrease with time.
* ___GaugeObserver___ - asynchronous instrument that measures non-additive values for which a sum does not produce a meaningful or correct result.
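
As a quick illustration, a synchronous ___Counter___ could be used as follows. This is a hypothetical sketch reusing the same otel-go metric API version as the example below (`meter`, `instrument`, `unit`, and `attribute` refer to the same packages and variables as there); the instrument and attribute names are assumptions:

```go
// requestsC counts served requests; created once during setup.
requestsC, err := meter.SyncInt64().Counter(
	"requests",
	instrument.WithUnit(unit.Dimensionless),
	instrument.WithDescription("Number of served requests"),
)
if err != nil {
	panic(err)
}

// Synchronous instruments are recorded inline, right where the event happens.
requestsC.Add(ctx, 1, attribute.String("method", "GetShare"))
```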

#### Integration Example

Consider that we want to report the current network height as a metric.

First of all, a global pkg-level meter has to be defined in the code related to the desired metric. In our case, it is the `header` pkg.

```go
var meter = global.MeterProvider().Meter("header")
```

Next, we should understand which instrument to use. At first glance, a ___Counter___ instrument should fit for chain height,
as it is a non-decreasing value, and then we should decide whether we need a sync or async version of it. For our case,
both would work, and it is more a question of the precision we want: sync metering would report every height change, while
the async version would poke the `header` pkg API periodically to get the metered data. For our example, we will go with the latter.

```go
// MonitorHead enables Otel metrics to monitor head.
func MonitorHead(store Store) {
	headC, _ := meter.AsyncInt64().Counter(
		"head",
		instrument.WithUnit(unit.Dimensionless),
		instrument.WithDescription("Subjective head of the node"),
	)

	err := meter.RegisterCallback(
		[]instrument.Asynchronous{
			headC,
		},
		func(ctx context.Context) {
			head, err := store.Head(ctx)
			if err != nil {
				headC.Observe(ctx, 0, attribute.String("err", err.Error()))
				return
			}

			headC.Observe(
				ctx,
				head.Height,
				attribute.Int("square_size", len(head.DAH.RowsRoots)),
			)
		},
	)
	if err != nil {
		panic(err)
	}
}
```

The example follows a purely API-based approach without the need to integrate the metric deeper into the implementation
internals, which is nice and keeps metering decoupled from business logic. The `MonitorHead` func simply accepts the `Store`
interface and reads the information about the latest subjective header via `Head` on the node.

The API-based approach should be followed for any info-level metric. Even if there is no API to get the required metric,
such an API should be introduced. However, this approach is not always possible: sometimes deeper integration with the code
logic is necessary to analyze performance, or there are security and/or encapsulation considerations.

In the example, we can also see how any additional data can be added to the instruments via attributes or labels. It is
important to add to the metrics only absolutely necessary data (more on that in the Other Findings section below) or data which is
common over multiple time-series. In this case, we attach the `square_size` of the height to know the block size at that
height. This allows us to query reported heights with some square size using the backend UI. Note that only
powers of two (with 256 being the current limit) are possible as unique values for this attribute, so it won't put pressure on the
metrics backend.

For Go code examples of other metric instruments, consult the [Uptrace OTel docs](https://uptrace.dev/opentelemetry/go-metrics.html#getting-started).

#### Backends Connection

Example for OTLP extracted from our code

```go
opts := []otlpmetrichttp.Option{
	otlpmetrichttp.WithCompression(otlpmetrichttp.GzipCompression),
	otlpmetrichttp.WithEndpoint(cmd.Flag(metricsEndpointFlag).Value.String()),
}
if ok, err := cmd.Flags().GetBool(metricsTlS); err != nil {
	panic(err)
} else if !ok {
	opts = append(opts, otlpmetrichttp.WithInsecure())
}

exp, err := otlpmetrichttp.New(cmd.Context(), opts...)
if err != nil {
	return err
}

pusher := controller.New(
	processor.NewFactory(
		selector.NewWithHistogramDistribution(),
		exp,
	),
	controller.WithExporter(exp),
	controller.WithCollectPeriod(2*time.Second),
	controller.WithResource(resource.NewWithAttributes(
		semconv.SchemaURL,
		semconv.ServiceNameKey.String(fmt.Sprintf("Celestia-%s", env.NodeType.String())),
		// Here we can add more attributes with Node information
	)),
)
```
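
What typically follows such a setup - sketched here under the assumption that the same otel metric SDK (the basic `controller`) is used, so the exact calls may differ from the actual codebase - is starting the controller and registering it as the global meter provider, so that package-level meters like the `header` one above begin reporting:

```go
// Register the controller as the global MeterProvider so that package-level
// meters (e.g. global.MeterProvider().Meter("header")) report through it.
global.SetMeterProvider(pusher)

// Start collecting and pushing metrics with the configured period.
if err := pusher.Start(cmd.Context()); err != nil {
	return err
}

// On shutdown, the controller should be stopped to flush remaining metrics:
// defer pusher.Stop(cmd.Context())
```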

## Considerations

* Tracing performance
  * _Every_ traced method calling two more functions and making a network request can affect overall performance
* Metrics backend performance
  * Mainly, we should avoid sending too much data to the metrics backend through labels, e.g. hash, uuid, etc.,
    so as not to overload it. Metrics are only for metrics and not for indexing.
* Security and exported data protection
  * OTLP provides TLS support (a sketch follows this list)
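
For example, enabling TLS on the OTLP metrics exporter could look like the following sketch, reusing the `opts` slice from the metrics backend connection example above. The certificate path is purely illustrative, and the snippet additionally requires the `crypto/tls`, `crypto/x509`, and `os` imports:

```go
// Load the CA certificate the backend's TLS certificate is signed with.
caCert, err := os.ReadFile("/etc/celestia/otel-ca.pem")
if err != nil {
	return err
}
pool := x509.NewCertPool()
pool.AppendCertsFromPEM(caCert)

// Configure the exporter with TLS instead of appending WithInsecure.
opts = append(opts, otlpmetrichttp.WithTLSClientConfig(&tls.Config{
	RootCAs: pool,
}))
```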

## Other Findings

### Labels and High Cardinality

The high cardinality (many different label values) issue should always be kept in mind while introducing new metrics
and labels for them. Each metric should be attached with only absolutely necessary labels, staying away from
sending __unique__ label values each time, e.g. hash, uuid, etc. Doing the opposite can dramatically increase the
amount of data stored. See <https://prometheus.io/docs/practices/naming/#labels>.
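
To illustrate with the `MonitorHead` example above (the `head.Hash()` call is a hypothetical counter-example, not something the codebase does):

```go
// Fine: bounded cardinality - square_size only takes powers of two up to 256.
headC.Observe(ctx, head.Height, attribute.Int("square_size", len(head.DAH.RowsRoots)))

// Avoid: unbounded cardinality - every height produces a brand-new label value,
// multiplying the number of stored time-series.
headC.Observe(ctx, head.Height, attribute.String("hash", head.Hash().String()))
```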

### Tracing and Logging

As you will see in the examples below, tracing looks similar to logging and has almost the same semantics. In fact,
