Skip to content

Add SpanProcessor for OpenTelemetry #875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 37 commits into
base: master
Choose a base branch
from

Conversation

solnic
Copy link
Collaborator

@solnic solnic commented Mar 14, 2025

To make this work you need to add opentelemetry deps:

      {:opentelemetry, "~> 1.3"},
      {:opentelemetry_api, "~> 1.2"},
      {:opentelemetry_exporter, "~> 1.6"},
      {:opentelemetry_phoenix, "~> 2.0"},
      {:opentelemetry_ecto, "~> 1.2"},
      {:opentelemetry_bandit, "~> 0.2"}

Then configure our span processor:

config :opentelemetry, span_processor: {Sentry.OpenTelemetry.SpanProcessor, []}

Things should start looking more or less like this:

Screenshot 2025-03-19 at 11 32 54 Screenshot 2025-03-19 at 11 31 53 Screenshot 2025-03-19 at 11 33 03 Screenshot 2025-03-19 at 11 29 03

@solnic solnic linked an issue Mar 14, 2025 that may be closed by this pull request
@solnic solnic force-pushed the 874-add-spanprocessor-for-otel branch from 0ebd132 to dee9540 Compare March 14, 2025 11:17
@solnic
Copy link
Collaborator Author

solnic commented Mar 14, 2025

1) test loads in_app_module_allow_list (Public.Sentry.ConfigTest)
Error:      apps/public/test/public_test.exs:4
     ** (RuntimeError) the Sentry configuration seems to be not available (while trying to fetch :in_app_module_allow_list). This is likely because the :sentry application has not been started yet. Make sure that you start the :sentry application before using any of its functions.

     code: assert Sentry.Config.in_app_module_allow_list() |> Enum.sort() ==
     stacktrace:
       (sentry 10.8.1) lib/sentry/config.ex:698: Sentry.Config.in_app_module_allow_list/0
       test/public_test.exs:5: (test)


Finished in 0.00 seconds (0.00s async, 0.00s sync)
1 test, 1 failure
==> admin
Running ExUnit with seed: 25[95](https://github.com/getsentry/sentry-elixir/actions/runs/13856201920/job/38773483717?pr=875#step:10:96)1, max_cases: 8



  1) test loads in_app_module_allow_list (Admin.Sentry.ConfigTest)
Error:      apps/admin/test/admin_test.exs:4
     ** (RuntimeError) the Sentry configuration seems to be not available (while trying to fetch :in_app_module_allow_list). This is likely because the :sentry application has not been started yet. Make sure that you start the :sentry application before using any of its functions.

     code: assert Sentry.Config.in_app_module_allow_list() |> Enum.sort() ==
     stacktrace:
       (sentry 10.8.1) lib/sentry/config.ex:6[98](https://github.com/getsentry/sentry-elixir/actions/runs/13856201920/job/38773483717?pr=875#step:10:99): Sentry.Config.in_app_module_allow_list/0
       test/admin_test.exs:5: (test)

@whatyouhide ☝🏻 I'm having trouble figuring out these failures. Could you tell me how I could debug what exactly causes the :sentry app startup failure? We start it manually in the umbrella test helpers and for whatever reason it started failing when opentelemetry is included.

@whatyouhide
Copy link
Collaborator

Adding just the OTel deps without any other changes leads to the same issue?

@sl0thentr0py
Copy link
Member

sl0thentr0py commented Mar 17, 2025

please write some minimal description in the PR about

  • what functionality this adds
  • instructions to add this to a Phoenix app with opentelemetry instrumentation and test it out

@solnic
Copy link
Collaborator Author

solnic commented Mar 18, 2025

Adding just the OTel deps without any other changes leads to the same issue?

@whatyouhide @sl0thentr0py I narrowed it down to opentelemetry dep. When it's included, the app doesn't start in umbrella integration test. We need to make otel deps optional anyway, so it's time to address this. How do we want to approach it though? Should I create sentry-opentelemetry package in this repo?

@solnic
Copy link
Collaborator Author

solnic commented Mar 18, 2025

@whatyouhide for the time being I addressed it by using optional deps for otel packages via 6fdf121 but then one of the tests in event_test.exs started to fail so I fixed it via d04c5d3 even though I don't understand what's going on there 🙃

@solnic solnic mentioned this pull request Mar 18, 2025
@whatyouhide
Copy link
Collaborator

@solnic is this ready for review? It's still a draft

@solnic
Copy link
Collaborator Author

solnic commented Mar 18, 2025

@solnic is this ready for review? It's still a draft

Not yet. I got it working but phoenix + bandit spans are not processed in a way that would make sense for Sentry for some reason. I've been investigating how to fix. It seems like phoenix spans are not coming in as children of bandit spans so there's a disconnect here. I'll figure it out 🙂

@sl0thentr0py
Copy link
Member

We need to make otel deps optional anyway, so it's time to address this.

If we're just shipping a SpanProcessor first, having them as peer dependencies is more than fine. We just document well how to get otel up and running in parallel to sentry and hook them up properly.

We will revisit packaging later once we have a proper working prototype.

@solnic solnic force-pushed the 874-add-spanprocessor-for-otel branch 2 times, most recently from b607ac1 to bdcc5de Compare March 19, 2025 12:45
@solnic solnic marked this pull request as ready for review March 19, 2025 12:59
@solnic
Copy link
Collaborator Author

solnic commented Mar 19, 2025

@sl0thentr0py @whatyouhide this is now open for reviews. I got it deployed to production already and it's working well (see screenshots in the description).

@sl0thentr0py
Copy link
Member

okay I'm taking a preliminary look now

one thing we definitely need is that instead of the boolean Config.tracing we need

without these, it is very hard for people to control quota spend so this is a hard requirement

Copy link
Member

@sl0thentr0py sl0thentr0py left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some suggestions, looks very good otherwise!

alias Sentry.Interfaces.Span

@impl true
def on_start(_ctx, otel_span, _config) do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way you've implemented it, we can do nothing in on_start since the on_end logic is sufficient to build the span tree and transaction.

This will also simplify your SpanStorage.update_span where you can remove the unnecessary deletes.
(This is also how we've done it in the new python and JS span processors.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sl0thentr0py I'll dig into this more but when I make on_start a no-op the tests are no longer passing 🙃

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@solnic did you dig into it more?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@whatyouhide yes but not until now I managed to figure out how to do it to make this work, I just pushed 9a5c550 that simplified things

Copy link
Collaborator

@whatyouhide whatyouhide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking like a good start but still a bunch of work left to do. Let me know if any of the comments are not clear!

@solnic solnic force-pushed the 874-add-spanprocessor-for-otel branch from c9152c6 to 1654724 Compare April 3, 2025 10:44
@solnic
Copy link
Collaborator Author

solnic commented Apr 3, 2025

without these, it is very hard for people to control quota spend so this is a hard requirement

@sl0thentr0py I'll add traces_sampler config in a separate PR

@solnic solnic force-pushed the 874-add-spanprocessor-for-otel branch from 01585a8 to 5e8fc99 Compare April 3, 2025 11:07
@solnic
Copy link
Collaborator Author

solnic commented Apr 3, 2025

@whatyouhide @sl0thentr0py ready for another round of reviews 🤞🏻

@whatyouhide
Copy link
Collaborator

I'll review tomorrow.

@@ -0,0 +1,31 @@
if Code.ensure_loaded?(:otel_sampler) do
defmodule Sentry.OpenTelemetry.Sampler do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on why we need a Sentry-specific sampler?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@whatyouhide in order to provide a standard interface for traces_sampler option we need a custom abstraction for this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this with links to docs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@whatyouhide Sentry has a standard way of handling trace sampling via traces_sampler configuration option which is meant to be a function that receives a sampling context. In order to provide support for this setting we just need our own Sampler that would call configured tracer sampler with a context that it would be built from information that it receives in the should_sample callback.

Here are docs about this feature in the Ruby SDK as a reference https://docs.sentry.io/platforms/ruby/configuration/sampling/#sampling-transaction-events

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the Sampler to be my application-specific concern; this module seems unnecessary to me

require OpenTelemetry

@fields Record.extract(:span, from_lib: "opentelemetry/include/otel_span.hrl")
Record.defrecordp(:span, @fields)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this conversation is resolved?

@solnic solnic requested a review from whatyouhide April 8, 2025 11:07
@jwaldrip
Copy link

This feels so close! How far away do we think it is from release?

@solnic
Copy link
Collaborator Author

solnic commented Apr 14, 2025

This feels so close! How far away do we think it is from release?

@jwaldrip yes just waiting for @whatyouhide for the final review :) then I'll follow up with another PR with support for sampler (initial work is already done) and we should be ready to release it 🙂

@jwaldrip
Copy link

@solnic I am pulling the branch directly and testing with our self hosted instance of sentry. Is there anything we need to do in our API configuration to have these show up in sentry under performance?

@solnic
Copy link
Collaborator Author

solnic commented Apr 16, 2025

@solnic I am pulling the branch directly and testing with our self hosted instance of sentry. Is there anything we need to do in our API configuration to have these show up in sentry under performance?

@jwaldrip I'm sorry but I'm not familiar with self hosted instances. I need to confirm if anything extra is needed. For now please try to replicate setup that I added to our test_integrations/phoenix_app test app 🙂 Thanks for helping with testing!

Copy link
Collaborator

@whatyouhide whatyouhide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really ready for review, there are quite a few comments that have been resolved without being addressed or comments that have not been addressed in general.

This is a big, complex PR, so I would in general suggest splitting it up in "less controversial" pieces. For example, adding a :traces_sample configuration is very straightforward, and if it were its own PR, then it could be a piece we merge and not think about. Same goes for adding tests to the Phoenix app, for example.

In general, for this kind of workflow I suggest something like stacked PRs if you're familiar with the concept.

alias Sentry.Interfaces.Span

@impl true
def on_start(_ctx, otel_span, _config) do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@solnic did you dig into it more?

Comment on lines +138 to +143
defp get_op_description(%{
attributes:
%{unquote(to_string(MessagingAttributes.messaging_system())) => :oban} = attributes
}) do
{"queue.process", attributes["oban.job.worker"]}
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the Oban/OTel integration be the one responsible for getting this right here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@whatyouhide queue.process is a Sentry concept, that's why we have to handle it ourselves here. From Sentry docs:

queue.process

Based on the knowledge sources, queue.process is a span operation used in Sentry's tracing instrumentation for queue consumers. This operation is used to track when messages are delivered to or processed by a consumer in a messaging system.

Consumer Instrumentation

To implement queue consumer instrumentation with Sentry, you need to:
Create a span with the operation queue.process
Include specific attributes to enrich your consumer spans with queue metrics

@@ -0,0 +1,31 @@
if Code.ensure_loaded?(:otel_sampler) do
defmodule Sentry.OpenTelemetry.Sampler do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this with links to docs?

end

defp setup_span_storage(opts) do
uid = :rand.uniform(1000000)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be unique?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then :rand.uniform/1 won't guarantee that in any way. Use System.unique_integer([:positive]) instead.

@solnic
Copy link
Collaborator Author

solnic commented Apr 18, 2025

This is not really ready for review, there are quite a few comments that have been resolved without being addressed or comments that have not been addressed in general.

Oops, seems like I've got a commit from April 4th that I forgot to push. I used GH PR ext in VSCode to mark things as done but then my branch wasn't synced. Dooh! Sorry about that. The docs were updated according to your requests via f883094

In general, for this kind of workflow I suggest something like stacked PRs if you're familiar with the concept.

Yes I'm familiar with this approach. Should I start doing it here?

@solnic solnic force-pushed the 874-add-spanprocessor-for-otel branch from 7b00dd9 to 0559b9e Compare April 23, 2025 07:29
@solnic solnic requested a review from whatyouhide April 23, 2025 07:54
@jwaldrip
Copy link

@solnic are these the final parts that need to be reviewed by @whatyouhide? Hoping we can see this released this month. 🥺

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add SpanProcessor for OTel
5 participants