Add more tracing instrumentation #885

Merged
merged 7 commits into from
Mar 13, 2024

Contributor

@sevein sevein commented Mar 11, 2024

This pull request adds more instrumentation in different areas of the project. It includes multiple commits addressing different concerns. The four main changes are:

  • It configures a tracer provider in both enduro-a3m-worker and enduro-am-worker (enduro was already configured) and enables telemetry by passing environment strings to the corresponding k8s workloads,
  • It enables automatic instrumentation of database/sql via github.com/XSAM/otelsql,
  • It enables automatic instrumentation of the Temporal Go SDK using a client interceptor, and
  • It enables route tagging in Goa (which required bumping its version).
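
As a rough illustration of the first point, here is a minimal sketch of how a worker might read its tracing settings from environment strings before building a tracer provider. The `TelemetryConfig` type, the `LoadTelemetryConfig` function and the variable names are all assumptions made for this example; they are not the project's actual configuration keys.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// TelemetryConfig is a hypothetical shape for tracing settings; the field
// names are assumptions for this sketch, not the project's real struct.
type TelemetryConfig struct {
	Enabled       bool
	Address       string
	SamplingRatio float64
}

// LoadTelemetryConfig reads tracing settings through lookup, which has the
// same signature as os.LookupEnv so it can be replaced in tests.
// The variable names built from prefix are illustrative only.
func LoadTelemetryConfig(prefix string, lookup func(string) (string, bool)) (TelemetryConfig, error) {
	cfg := TelemetryConfig{SamplingRatio: 1.0} // sample everything unless told otherwise

	if v, ok := lookup(prefix + "_TELEMETRY_TRACES_ENABLED"); ok {
		enabled, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("parse enabled flag: %w", err)
		}
		cfg.Enabled = enabled
	}
	if v, ok := lookup(prefix + "_TELEMETRY_TRACES_ADDRESS"); ok {
		cfg.Address = v
	}
	if v, ok := lookup(prefix + "_TELEMETRY_TRACES_SAMPLING_RATIO"); ok {
		ratio, err := strconv.ParseFloat(v, 64)
		if err != nil {
			return cfg, fmt.Errorf("parse sampling ratio: %w", err)
		}
		cfg.SamplingRatio = ratio
	}

	// Enabling tracing without a collector address is treated as a
	// configuration error in this sketch.
	if cfg.Enabled && cfg.Address == "" {
		return cfg, fmt.Errorf("tracing enabled but no collector address set")
	}
	return cfg, nil
}

func main() {
	// "ENDURO_A3M" is a made-up prefix used only for this example.
	cfg, err := LoadTelemetryConfig("ENDURO_A3M", os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "telemetry config:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the parsing logic easy to exercise without touching the process environment.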

With these changes, we'll see a significant increase in the number of traces captured by Tempo. We're not yet initiating a trace during the transfer submission via MinIO; however, an interesting example that can already be observed is the move workflow initiated through the API. If you've already moved a package, use this query to find related traces:

{ .http.route="/package/{id}/move" }

It should look like this:

[Image: screenshot of the move workflow trace in Tempo]

One thing that is apparent right away is the lag between activity executions, which we know is the cost of using Temporal to schedule activities. The trace also shows that local activities are much faster, since they are not coordinated with the Temporal server.
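
One rough way to put a number on that lag is to take the spans of a trace and measure the idle time between the end of one activity and the start of the next. The sketch below assumes a hypothetical `Span` type carrying start and end timestamps in Unix nanoseconds; the span names and timings in `main` are made up for illustration and are not measurements from this project.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Span is a hypothetical, trimmed-down record of one activity execution.
// Timestamps are Unix nanoseconds.
type Span struct {
	Name          string
	StartUnixNano int64
	EndUnixNano   int64
}

// Gaps returns the idle time between each span's end and the start of the
// next span, after ordering spans by start time. Overlapping spans yield a
// zero gap rather than a negative one.
func Gaps(spans []Span) []time.Duration {
	sorted := append([]Span(nil), spans...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].StartUnixNano < sorted[j].StartUnixNano
	})

	gaps := make([]time.Duration, 0, len(sorted))
	for i := 1; i < len(sorted); i++ {
		gap := time.Duration(sorted[i].StartUnixNano - sorted[i-1].EndUnixNano)
		if gap < 0 {
			gap = 0
		}
		gaps = append(gaps, gap)
	}
	return gaps
}

func main() {
	// Made-up timings: two regular activities followed by a local activity.
	spans := []Span{
		{Name: "activity-a", StartUnixNano: 0, EndUnixNano: 10_000_000},
		{Name: "activity-b", StartUnixNano: 60_000_000, EndUnixNano: 70_000_000},
		{Name: "local-activity-c", StartUnixNano: 71_000_000, EndUnixNano: 80_000_000},
	}
	fmt.Println(Gaps(spans))
}
```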

You can also search for a workflow directly, e.g.:

{name="StartWorkflow:processing-workflow"}

This trace is interesting because it shows that the telemetry data originates from multiple services, specifically "enduro" and "enduro-a3m-worker", rather than a single source.
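
Both queries above have the same shape: a TraceQL selector matching one span attribute. If you want to generate them for other routes or workflows, a small helper along these lines could work. The function names are hypothetical, and the sketch assumes plain ASCII values so that Go's `%q` quoting produces a valid TraceQL string literal.

```go
package main

import "fmt"

// RouteQuery builds a TraceQL selector for spans tagged with the given
// http.route attribute value.
func RouteQuery(route string) string {
	return fmt.Sprintf("{ .http.route=%q }", route)
}

// WorkflowQuery builds a TraceQL selector for the span recorded when a
// workflow of the given type is started, following the
// "StartWorkflow:<type>" naming seen in the example query.
func WorkflowQuery(workflowType string) string {
	return fmt.Sprintf("{ name=%q }", "StartWorkflow:"+workflowType)
}

func main() {
	fmt.Println(RouteQuery("/package/{id}/move"))
	fmt.Println(WorkflowQuery("processing-workflow"))
}
```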

@sevein sevein force-pushed the dev/add-more-tracing branch 2 times, most recently from c74ad70 to 69d04de Compare March 11, 2024 11:55
@artefactual-sdps artefactual-sdps deleted a comment from codecov bot Mar 11, 2024
@artefactual-sdps artefactual-sdps deleted a comment from codecov bot Mar 11, 2024
@sevein sevein force-pushed the dev/add-more-tracing branch 2 times, most recently from 36ac9f3 to b19f72c Compare March 11, 2024 12:21
sevein added 5 commits March 11, 2024 12:49
Pass config via envs to enable tracing in all services.
It uses github.com/XSAM/otelsql to wrap our database drivers with
instrumentation enabled. I've configured the tracer provider in both
enduro-a3m-worker and enduro-am-worker.
It includes a new capability to enable route tagging via OpenTelemetry.
This is a concern now addressed by otelhttp (traceIDs).
This commit ensures that spans and metrics are annotated with the route name
which is provided by Goa. For example, in Tempo we can now retrieve traces with
a query like the following:

    {.http.route="/storage/location"}
@sevein sevein force-pushed the dev/add-more-tracing branch from b19f72c to 2a4e87e Compare March 11, 2024 12:59
@sevein sevein force-pushed the dev/add-more-tracing branch from 2a4e87e to b7c471e Compare March 11, 2024 13:09
@artefactual-sdps artefactual-sdps deleted a comment from codecov bot Mar 11, 2024
Collaborator

@djjuhasz djjuhasz left a comment

LGTM! 👍

@sevein sevein merged commit cbd1a6a into main Mar 13, 2024
11 checks passed
@sevein sevein deleted the dev/add-more-tracing branch March 13, 2024 11:52
2 participants