-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hierarchical and Rate Limited Sampler Implementations #717
Hierarchical and Rate Limited Sampler Implementations #717
Conversation
a40750d
to
be451b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start thank you!
Please add a stanza in our GH Actions to include tests for these gems:
Also, please register the gems in the release list:
https://github.com/open-telemetry/opentelemetry-ruby-contrib/blob/main/.toys/.data/releases.yml
Lastly, please include the gems to the installation test suite here:
https://github.com/open-telemetry/opentelemetry-ruby-contrib/blob/main/releases/Gemfile
``` | ||
|
||
Or, if you use [bundler][bundler-home], include `opentelemetry-sampling-hierarchical` in your `Gemfile`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be helpful to add example usage in the README and/or an example
directory with a script that shows how to configure it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, there is no way yet to use it. This will require an update to https://github.com/open-telemetry/opentelemetry-ruby/blob/d365dfeb3982f09cd62a17c6e67eb3792f56e460/sdk/lib/opentelemetry/sdk/trace/tracer_provider.rb#L167. I will open a PR to support the X-Ray sampler there as well and will adjust the README afterwards accordingly if that's fine with you
sampling/hierarchical/lib/opentelemetry-sampling-hierarchical.rb
Outdated
Show resolved
Hide resolved
|
||
module OpenTelemetry | ||
module Sampling | ||
module Hierarchical |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Hierarchical Sampler a term used in X-Ray to describe this pattern? As opposed to a Any
or All
sampler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not but I find Any
sampler to be a bit misleading since it doesn't imply a fixed order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this name is not intuitive to me. Would you be able to help me understand what makes it Hierarchical or Ranked ordering vs "If any of the items in this list return true regardless of order"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list of samplers is iterated left to right. If any returns a positive response, we stop and say that the span should be sampled. Any
doesn't give you this guarantee (in my interpretation).
The order can make a difference. A contrived example would be having two samplers:
- Sampler
A
: one per second, only ifkind == :internal
- Sampler
B
: one per second, only ifkind in [:server, :internal]
Let's say the application produces two traces in the same second, one with kind = :internal
and one with kind = :server
(in this order). If sampler A
was applied first, both traces would be sampled. On the other hand, if sampler B
was first, only the first (:internal
) trace would be sampled since B
already spent all its credits when the second trace arrives.
sampling/hierarchical/lib/opentelemetry/sampling/hierarchical/version.rb
Outdated
Show resolved
Hide resolved
# | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
module OpenTelemetry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the package the package naming should be symmetric with the SDK:
OpenTelemetry::SDK::Trace::Samplers
Or more use the API namespace
OpenTelemetry::Trace::Samplers
@open-telemetry/ruby-contrib-maintainers WDYT?
sampling/hierarchical/test/opentelemetry/sampling/hierarchical/sampler_test.rb
Show resolved
Hide resolved
sampling/rate_limited/lib/opentelemetry/sampling/rate_limited/sampler.rb
Outdated
Show resolved
Hide resolved
sampling/rate_limited/lib/opentelemetry/sampling/rate_limited/version.rb
Outdated
Show resolved
Hide resolved
sampling/rate_limited/test/opentelemetry/sampling/rate_limited/sampler_test.rb
Outdated
Show resolved
Hide resolved
# @param [Hash<String, Object>] attributes | ||
# @return [OpenTelemetry::SDK::Trace::Samplers::Result] The sampling result. | ||
def should_sample?(trace_id:, parent_context:, links:, name:, kind:, attributes:) | ||
if can_spend? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sampling is done per span so if you run out of credits_per_second mid trace you will start dropping spans, and potentially end up with broken traces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this be better solved by instantiating a ParentBased
sampler with the RateLimited
sampler as root
? I'D rather avoid making these distinctions here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by making the parent based sampler as root?
Wouldn't that make it so the rate limited sampling result is always ignored since the Hierarchical Sampler will always check the other samplers to see if it should override the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot follow. Are we talking about the hierarchical or rate-limited sampler?
def can_spend? | ||
return false if @credits_per_second <= 0 | ||
|
||
@lock.synchronize do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sampler is set on the tracer provider so you'll be locking every time you create a span, this may not be appropriate depending on how much span volume your service is generating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you suggest being optimistic and not locking at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have a suggestion as much as we're doing is raising concerns about the performance implications of adding another lock around span creation.
I'm not sure how much latency this introduces for applications that generate thousands of spans in a single transaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since default thread counts are fairly low in most server implementations (in Puma it's 5
, for instance for MRI). We can also provide a configuration option for that
private | ||
|
||
# @return [Boolean] | ||
def can_spend? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care about how this performs in the case where an app is configured with Puma, with Unicorn? Forking vs Pre-forking web servers will result in different spend rates. What's the expected outcome here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point that I haven't considered. I believe that the best option is to not make any assumptions about this at all. Configuring the rate accordingly could restore the parity between forking and pre-forking servers. Should we note this in the README?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more that I think we want clarity on what the expectations are for the scope of the rate limiter.
Does the rate limit apply per process, per thread, or per fiber?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per process makes most sense, I believe. I don't see why different processes should have this knowledge of each other. It also makes the handling easier as we don't need to know the details of different servers
be451b2
to
431f98e
Compare
…ng-rate-limited in releases.yml
431f98e
to
d071542
Compare
👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the |
The hierarchical sampler allows for specifying a list of samplers that will be invoked in order. It will sample a request if any of its children would.
The Rate Limited sampler on the other hand samples requests at a fixed rate.
This is a prerequisite for #714