-
Notifications
You must be signed in to change notification settings - Fork 50
Add request object influenced sampling option #182
Conversation
|
||
There are two ways to control the `Sampler` used: | ||
* Controlling the global default `Sampler` via [TraceConfig](https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/TraceConfig.md). | ||
* Pass a specific `Sampler` as an option to the HTTP plugin. Plugins should support setting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please elaborate on scenarios when per-request sampler is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One scenario may be a sampler that is registered by exporter to filter out calls to that exporter endpoint. Is it intended use for this API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, can this callback be used to extract additional information from http object and set a new attribute on span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplest scenario is to always sample when ?debug=true
or however that is propagated through a system. Other scenarios might be to always turn on tracing for developer requests or specific clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some examples, PTAL.
The only use case is not just disabling tracing but being able to give user availability to use different policies per path. E.g. some endpoints with low traffic may get a higher sampling rate.
This is why I think it is better to influence the sampling policy here rather than building exporter filters. We also care about the sampling option to be set because it will be propagated.
@derekperkins, thanks for the example. I added it to the list as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergeyKanzhelev I proposed this #186 as a way to pass attributes from the Sampler to the Span.
``` | ||
type Transport struct { | ||
// GetStartOptions allows to set start options per request. | ||
GetStartOptions func(*http.Request) trace.StartOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see many uses for a more generic filter/sampler that works on collected span. Those would be easier to write and they will work independently from what auto-collector was enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean a hook to create a custom span in the HTTP integration? Or a filter/sampler once the span is collected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean second. Let's take Zipkin exporter. When you configure it - you need to remember now to configure all http collectors with specialized sampler (side note- are those chained? I might need to apply different policies per path AND filter out Zipkin). So Zipkin exporter needs to have separate packages with the reference on http libraries (think Java) that would return those samplers.
It would be much easier if Zipkin can only implement one filter that works on collected spans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good alternative here if Zipkin before calling into endpoint will set a spacial tag on TLS that will have special treatment by collection modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give few examples why a different sampling policy might be required per exporter? I haven't considered that case in this spec change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing. Each exporter is depending on a different protocol. Some do HTTP, some do gRPC, some Thrift over UDP, etc. I think there is no way to unify this given they are all based on different protocols. Each exporter should provide a way to set a policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergeyKanzhelev, do you have any additional comments on this?
I filed #188 to discuss more how filtering capabilities will work for exporters. I think they can be implemented once but set at multiple places because exporters' transports vary and request objects are transport specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. If we take filtering out of conversation - many scenarios may be easier to implement for customer if filters would not be protocol-specific and will only deal with populated Span
object. Than you can register global callback that will apply different sampling for different http.path
attribute independently of how it was constructed - using manual instrumentation or via automatic collection.
This note though doesn't make proposed callback less interesting. This callback makes it harder for customer to configure, but achieve better perf.
I'm good with this proposal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a separate item for global processors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR; would be nice to focus this just on instrumentation sampling and not on things like exporter rate limiting etc.
side note, so take into consideration.. you don't have to answer each thought below :P
I don't really understand the link in this conversation about zipkin, exporter and instrumentation policy.
In Zipkin we don't recommend doing any sampling collector side except consistent on trace ID for overload scenarios. Note that pruning leaf spans (like resources) is a special case and not really a sampling decision, also there's no collector hook for that (even if there could be). A better way to address things like resource spans is to do that in the instrumentation layer. For example, sleuth has an exclude pattern for things like this that uses our http sampler to achieve that.
Almost always when we talk about sampling in zipkin we are talking about instrumentation. We don't combine rates (via pack-ins) either at the moment though you could imagine something that does this. I don't think the first implementation of sampling in census needs to deal with combinatory effects of rate, rather api design.
Maybe later works can address the combinatory things and how throttling could be effective. Throttling is complex when local intermediate spans (or tons of log-like events) are used as they can saturate transport/ storage budget as well. I think census has some fields that would need to be used like dropped spans etc. basically pruning etc probably is a different pull request/feature than top-level trace tier sampling.
030eb06
to
7425601
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL.
|
||
There are two ways to control the `Sampler` used: | ||
* Controlling the global default `Sampler` via [TraceConfig](https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/TraceConfig.md). | ||
* Pass a specific `Sampler` as an option to the HTTP plugin. Plugins should support setting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some examples, PTAL.
The only use case is not just disabling tracing but being able to give user availability to use different policies per path. E.g. some endpoints with low traffic may get a higher sampling rate.
This is why I think it is better to influence the sampling policy here rather than building exporter filters. We also care about the sampling option to be set because it will be propagated.
@derekperkins, thanks for the example. I added it to the list as well.
``` | ||
type Transport struct { | ||
// GetStartOptions allows to set start options per request. | ||
GetStartOptions func(*http.Request) trace.StartOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean a hook to create a custom span in the HTTP integration? Or a filter/sampler once the span is collected?
@rakyll btw, this is how it works in Application Insights SDK today: http://apmtips.com/blog/2018/09/26/filtering-bad-sampling-good/ There are many perf tradeoff, but it definitely very easy for customer to configure and understand |
@bogdandrutu can you review please? |
Request object influenced sampling also allows us to implement blacklist/whitelist features.
Request object influenced sampling also allows
us to implement blacklist/whitelist features.