-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Update Net::HTTP
instrumentation to no-op on untraced contexts
#722
fix: Update Net::HTTP
instrumentation to no-op on untraced contexts
#722
Conversation
def tracer_stub.in_span(...) | ||
raise 'tracer.in_span should not have been called' | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with Minitest. If you know a simpler or more readable way of asserting that the #in_span
method is not called on trace
, or even a simpler way of asserting that the instrumentation no-ops (i.e., does not attempt to manipulate spans in any way), I'm open to suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that other teste suites use rspec-mocks
to deal with scenarios like this. If using that library provides you with an easier way to write interaction tests.
The other option is to not worry about verifying interactions with test doubles and rely on the state-based test results, i.e. No spans are created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using rspec-mocks
would definitely make this easier/simpler to do. I didn't know that it could be used with Minitest. Can I add it to opentelemetry-instrumentation-net_http.gemspec as a dev dependency?
The other option is to not worry about verifying interactions with test doubles and rely on the state-based test results, i.e. No spans are created.
That was my initial approach, but then I realized that the call to tracer.in_span
will cause called finish on an ended Span
log noise even without creating a span. That's because Tracer#in_span
calls Tracer#start_span
, which checks span.context.valid?
(which is false
for an untraced context), so it returns Span::INVALID
. So the count of finished spans does not increase. That doesn't prevent it from calling finish
in an untraced
span though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using rspec-mocks would definitely make this easier/simpler to do. I didn't know that it could be used with Minitest. Can I add it to opentelemetry-instrumentation-net_http.gemspec as a dev dependency?
Yes, please add it. I will create a chore issue to update all the test specifications to include it in development dependencies so folks don't run into similar constraints in the future.
That doesn't prevent it from calling finish in an untraced span though.
I see. We use in_span
all over the place. Is this a problem in other instrumentations and we just haven't seen the reports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please add it.
Will do, and thank you.
Is this a problem in other instrumentations.
I don't think so. At least in our app, the issue is triggered by a specific interaction between the Resque
instrumentation configured with the force_flush: :always
option and the Net::HTTP
instrumentations, as described in the description of this PR. When the Resque
instrumentation calls force_flush
, the OTLP exporter says not to trace its HTTP requests to the APM server but the Net::HTTP
is unaware of the untraced: true
attribute in context, so it tries to add info to the span and then call finish
on it. And that causes the log noise.
Here's a sample (partial) stack trace:
/usr/local/bundle/gems/opentelemetry-instrumentation-net_http-0.22.2/lib/opentelemetry/instrumentation/net/http/patches/instrumentation.rb:80:in `annotate_span_with_response!'
/usr/local/bundle/gems/opentelemetry-instrumentation-net_http-0.22.2/lib/opentelemetry/instrumentation/net/http/patches/instrumentation.rb:39:in `block (2 levels) in request'
<internal:kernel>:90:in `tap'
/usr/local/bundle/gems/opentelemetry-instrumentation-net_http-0.22.2/lib/opentelemetry/instrumentation/net/http/patches/instrumentation.rb:38:in `block in request'
/usr/local/bundle/gems/opentelemetry-api-1.2.3/lib/opentelemetry/trace/tracer.rb:37:in `block in in_span'
/usr/local/bundle/gems/opentelemetry-api-1.2.3/lib/opentelemetry/trace.rb:70:in `block in with_span'
/usr/local/bundle/gems/opentelemetry-api-1.2.3/lib/opentelemetry/context.rb:87:in `with_value'
/usr/local/bundle/gems/opentelemetry-api-1.2.3/lib/opentelemetry/trace.rb:70:in `with_span'
/usr/local/bundle/gems/opentelemetry-api-1.2.3/lib/opentelemetry/trace/tracer.rb:37:in `in_span'
/usr/local/bundle/gems/opentelemetry-instrumentation-net_http-0.22.2/lib/opentelemetry/instrumentation/net/http/patches/instrumentation.rb:31:in `request'
/usr/local/bundle/gems/opentelemetry-exporter-otlp-0.26.1/lib/opentelemetry/exporter/otlp/exporter.rb:160:in `block (2 levels) in send_bytes'
/usr/local/bundle/gems/opentelemetry-exporter-otlp-0.26.1/lib/opentelemetry/exporter/otlp/exporter.rb:238:in `measure_request_duration'
/usr/local/bundle/gems/opentelemetry-exporter-otlp-0.26.1/lib/opentelemetry/exporter/otlp/exporter.rb:160:in `block in send_bytes'
/usr/local/bundle/gems/opentelemetry-exporter-otlp-0.26.1/lib/opentelemetry/exporter/otlp/exporter.rb:128:in `block in around_request'
/usr/local/bundle/gems/opentelemetry-common-0.20.0/lib/opentelemetry/common/utilities.rb:92:in `block in untraced'
/usr/local/bundle/gems/opentelemetry-api-1.2.3/lib/opentelemetry/context.rb:87:in `with_value'
/usr/local/bundle/gems/opentelemetry-common-0.20.0/lib/opentelemetry/common/utilities.rb:91:in `untraced'
/usr/local/bundle/gems/opentelemetry-exporter-otlp-0.26.1/lib/opentelemetry/exporter/otlp/exporter.rb:128:in `around_request'
/usr/local/bundle/gems/opentelemetry-exporter-otlp-0.26.1/lib/opentelemetry/exporter/otlp/exporter.rb:152:in `send_bytes'
/usr/local/bundle/gems/opentelemetry-exporter-otlp-0.26.1/lib/opentelemetry/exporter/otlp/exporter.rb:86:in `export'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:187:in `block in export_batch'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:187:in `synchronize'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:187:in `export_batch'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:114:in `force_flush'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/tracer_provider.rb:109:in `block (2 levels) in force_flush'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/tracer_provider.rb:105:in `map'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/tracer_provider.rb:105:in `block in force_flush'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/tracer_provider.rb:101:in `synchronize'
/usr/local/bundle/gems/opentelemetry-sdk-1.3.0/lib/opentelemetry/sdk/trace/tracer_provider.rb:101:in `force_flush'
/usr/local/bundle/gems/opentelemetry-instrumentation-resque-0.5.0/lib/opentelemetry/instrumentation/resque/patches/resque_job.rb:60:in `perform'
/usr/local/bundle/gems/resque-2.6.0/lib/resque/worker.rb:322:in `perform'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR updated.
Update Net::HTTP instrumentation to no-op on HTTP requests within untraced contexts. Prevents the logging of "called finish on an ended Span" messages. Example scenario: The Resque instrumentation can be configured to call `OpenTelemetry::SDK::TraceProvider#force_flush` at the end of each job. That causes the batch span processor to immediately send everything it has to the OTLP exporter. The exporter then marks the HTTP request used to send the data to the APM server as "untraced", by adding an `untraced` key to the context and setting it to `true`. Before this commit, the `Net::HTTP` instrumentation did not check for the `untraced` flag. So it would try set attributes and call `finish` on a span that was not recording, creating one "called finish on an ended Span" log entry for each batch of exported spans. On a large application, that results in millions of those log entries per minute.
0c95add
to
75a3aed
Compare
Update
Net::HTTP instrumentation
to no-op on HTTP requests within untraced contexts. Prevents the logging ofcalled finish on an ended Span
messages.Example scenario:
The Resque instrumentation can be configured to call
OpenTelemetry::SDK::TraceProvider#force_flush
at the end of each job. That causes the batch span processor to immediately send everything it has to the OTLP exporter. The exporter then marks the HTTP request used to send the data to the APM server as "untraced", by adding anuntraced
key to the context and setting it totrue
. Before this PR, theNet::HTTP
instrumentation did not check for theuntraced
flag. So it would try set attributes and callfinish
on a span that was not recording, creating one "called finish on an ended Span" log entry for each batch of exported spans. On a large application, that results in millions of those log entries per minute.