-
Notifications
You must be signed in to change notification settings - Fork 182
Standard(s) for in-process propagation #23
Comments
In-process propagation, and support different frameworks instrumented, seems very hard, as I known. In Java, same Context in ThreadLocal, and provide inter-thread propagation spec, maybe a effective way. Keep me in the loop. :) |
Last july @Xylus and @eirslett did a couple day spike on context.
Assets from those meetings are here;
https://drive.google.com/drive/u/1/folders/0B0tSnQT3uGdAfndwS3RKVnMtM0ZKMWU4R1dicTNEXy1NSlBJaTlJQ05abGVneTlucW5HdFk
Eirik ended up spiking an effort towards that later,
DistributedTracing/continuation-local-storage-jvm#1
Ironically, last december, iirc, distributed context propagation was
the first name of the OpenTracing group, so not too surprising it is
here.
Since then, I think uber had some work towards it naming the group
https://github.com/openctx
More recently, recently google are changing their own context to
decouple storage from it. grpc/grpc-java#2461
Also, other tools like Log4J2 have added hooks to interop with other's
context storage engines.
As mentioned in another issue, there's some layering concern, for
example if tracing is the right place to define context (as they are
often multi-purpose and below tracing).
That said, whatever hooks there are tracing is at least a primary
consumer, and interesting to see a tracking issue arise.
|
@adriancole what I find peculiar is that several similar concepts of context storage/provider I've seen (including Jaeger's) pretend that they are just an interface, but in fact they cannot work with anything but a thread-local based implementation. For example, TChannel (which is a form of RPC framework) can be configured with a TracingContext, as a way of abstracting the actual mechanism of in-process propagation, but the mere fact that a context object is not passed to RPC methods but instead extracted via that abstraction essentially means the abstraction must be implemented via thread-locals. NB: the TracingContext in Jaeger and TChannel is to a Span what gRPC Storage is to a Context. I.e. the same separation of how something is propagated vs. what is being propagated. They are just smaller in scope and only propagate a single Span. |
@adriancole what I find peculiar is that several similar concepts of context storage/provider I've seen (including Jaeger's) pretend that they are just an interface, but in fact they cannot work with anything but a thread-local based implementation.
interesting point for JVM tech. yeah different types of thread locals,
wrappers of thread locals, or thread-local references to something
else.. all have thread locals in there somewhere. Maybe there's some
subtle (or native) option I'm also unaware of. @raphw @tmontgomery you
know magic ( or at least where bodies are often buried ) .. do you
know something available on JVM that can be used to store incidental
trace state which isn't directly or indirectly a thread local?
|
As a matter of fact, I have written a small thread-local-ish utility for this exact purpose. The problem with thread locals is that it only allows access from within a Also, I have started migrating away from the above All in all, the problem always boils down to managing a form of global-state what is always tedious. |
@adriancole what is incidental trace state? Why should be stored in something like threadlocal? |
To all, threadlocal can't solve all problems, somthing like async process model will bring us more. For example, Disruptor is a very useful in-memory async lib. If application use this, all context base on threadlocal will fail without doubt. This is very important for my exp. |
@adriancole <https://github.com/adriancole> what is incidental trace
state? Why should be stored in something like threadlocal?
sorry poor choice of words. Most users aren't aware of trace state as it is
implicitly passed around on their behalf (of course some are). The usual
implicit mechanism in Java is a thread local, though it is problematic in
async libraries. For example, usually you have to manually re-attach the
state when some accident of thread scheduling loses it. Ex the below from
Brave + RxJava
public Action0 onSchedule(final Action0 action) {
final ServerSpanThreadBinder binder = brave.serverSpanThreadBinder();
final ServerSpan span = binder.getCurrentServerSpan();
return new Action0() {
@OverRide
public void call() {
binder.setCurrentSpan(span);
action.call();
}
};
}
https://github.com/smoketurner/dropwizard-zipkin/blob/master/zipkin-rx/src/main/java/com/smoketurner/dropwizard/zipkin/rx/BraveRxJavaSchedulersHook.java
One of the reasons I asked about alternatives to thread locals was to
figure out what current practices exist, as it is helpful to scope what
might end up in a spec or library here or elsewhere.
make sense?
|
@adriancole, yes, that is my concern. |
But I think only change threadlocal to something similar is not explicit enough to someone, who is not a tracer developer. Provide some APIs, like carrier, inject, extract for rpc, are better choice for me. They will be easy to understand. |
In terms of the specification, the more interesting question for me is the contract of this API:
It might be easier to decide which language-specific storage construct might be the right one, when we know which features we want it to support. |
I am creating a library called 'context-propagation' meant purely to propagate any (threadlocal) context implementation from a parent-thread to any background thread in a stack-like fashion, using the Autocloseable try-with-resources mechanism to allow verified stack unrolling (and automatic skipping if a pop is missed). (we needed it for security propagation, but are now looking into adding opentracing and would like to re-use the mechanism that is currently in-place that serves us well). |
@cwe1ss I think the only requirement is "I need to get the current span". The fact that Java implementation often uses stack is a side-effect of thread-local implementation, since you're using the same API to retrieve different spans at different times, and stack is the only thing that makes sense for that. |
@yurishkuro there's also the logical requirement to "set the current span". Do you think this is implicit - meaning that every span that is created is automatically "the current span" or do you see this as an explicit feature/step? In other words, do you think that the context should be part of the The former would make for a nice user-friendly API and the latter seems to be closer to what we have now in jaeger-context and my C# contrib library. But both have advantages and disadvantages that need further discussions of course. |
Sorry, you're right, it needs get and set fort current span. |
@sjoerdtalsma thanks for sharing the context propagation lib... nice to see something that's already in use and not pre-coupled to a distributed tracing system per se. @yurishkuro @cwe1ss I am prototyping a few approaches to make this pluggable... I would like OT to support in-process context managers without getting into the business of doing all of the work (since there are so many approaches and each has its advantages/disadvantages). @cwe1ss as far as the stack concept is concerned: in Dapper we basically had each in-process Context object keep a reference to its parent Context; when |
@bensigelman For what it's worth, I heavily depend on the the java.util.ServiceLoader to make my library pluggable. Details on the out-of-order closing: If the closed context isn't 'current', current isn't touched. Later today I'll be committing a 'java-globaltracer' project to github and planning to offer it to opentracing-contrib. There will be three main classes with only a single public class GlobalTracer (currently with 3 methods) following this design: http://bit.ly/2gkrRuo |
@sjoerdtalsma , can you post the site of java-globaltracer project. I have suggest earlier in opentracing/opentracing-java#63. ServiceLoader to get a tracer maybe a good way. Even though, some tracer need arguments to init, like Jaeger Tracer by @yurishkuro told, they are not suitable for the module. But I like this type of java tracer factory/manager very much. |
@wu-sheng @adriancole , I pushed an initial attempt (with a working unit test using the MockTracer) to github yesterday evening: I would love to donate it to the 'contrib' git repository and have it published in Maven central under 'io.opentracing.contrib' but only after a thorough review process. Since I'm relatively new to the opentracing community I have no experience on the best way to go forward from here... |
@sjoerdtalsma @adriancole , maybe we should create a new issue to discuss this? I review the project generally, let's leave the detail discuss later. But at first, jdk level should be fixed. Using |
@sjoerdtalsma <https://github.com/sjoerdtalsma> @adriancole
<https://github.com/adriancole> , maybe we should create a new issue to
discuss this?
One of the reasons we discussed thread local was yuri's question of whether
we can assume thread locals are used or not. as far as I can tell, we
cannot make that assumption. If that steers the api in a particular way,
then the examples were fruitful.
I agree that it is probably helpful to keep weeds or investigations on
different threads as doing so here might be overwhelming. probably best in
the actual repo of the code in question?
|
@wu-sheng @adriancole I welcome feedback on the design and implementation choices by github issues. |
@sjoerdtalsma , transfer to https://github.com/opentracing-contrib, LGTM. @yurishkuro @adriancole @bensigelman , what's your opinion? Maybe accept it, discuss on the repo, and let's see what is going on later? |
@sjoerdtalsma my preference would be to
|
@bensigelman , can you add me to the https://github.com/opentracing-contrib org. I am interested in @sjoerdtalsma's project. |
@sjoerdtalsma , let's me know, after you create a repo, and do a PR |
@bensigelman I agree on all three of your points. |
@bensigelman , maybe you should create a repo for @sjoerdtalsma |
@bhs If transferring a |
Definitely similar in spirit... The SpanClosure actually has an interface to implement, though, since it makes it clear that a SpanClosure/Snapshot may only be activated/deactivated once. |
Yes, that's a good point. I had addressed that in a previous version of this (that I didn't send out) where everything was reference-counted and I've added a new commit that includes a new method, The async code looks like this:
It's worth noting that existing synchronous start()/finish() code can work just as it does today. Stepping back a bit...This PR moves things in the direction of formalizing the difference between |
One question; are we 'okay' now with automatically/implicitly becoming child of the active span ? edit: I am personally okay with either choice as long as it is documented very clearly |
@bhs There may still be an issue if the parent scope is being managed by two separate event handlers, if the event handler creating the |
I'm definitely ok with that. We did so in Dapper and – while it has some drawbacks – it's a way better default than not-becoming the child :) Two thoughts, though:
IMO, if different executors (or whatever) start and finish a Span, they would not share a single SpanClosure activation/deactivation. Either there would be zero or two of those pairs. Note that in my proposal it's still possible to start a Span without meddling with SpanManager at all. |
A little later than I'd hoped, but here is a sincere attempt to address this issue for OT-Java: opentracing/opentracing-java#111 Please add comments there... I'm making special note of it because more people are watching this issue. |
The java implementation scares me, since this kind of complexity is absolutely not needed in a request scoped language such as PHP. Would it make more sense to agree that each language that does not have explicit in process context propagation such as Go, should provide a solution that is idiomatic for this language? |
Since I hear this a lot, not every PHP project is request scoped. PHP can be used just like any other language to build long-running service processes, that e.g. can serve RPCs calls over TCP, even with an event loop like in NodeJS. Take as an example https://github.com/felixfbecker/php-language-server |
@beberlei yes, the propagation mechanism is specific to each language, although it is preferable to keep similar concepts as much as possible, such as "active span". @felixfbecker if you look at a proposed mechanism for Python, it is actually a pluggable approach depending on which async framework in Python is used, because they require different ways of propagating the context via event loops. I assume the same will be true in PHP. For request-scoped PHP the implementation of active span source could be just a global variable. |
For request-scoped PHP that is not using async functions with callbacks (like async curl or async DB queries) and no event loops, it can be a global variable. A runtime also doesn't necessarily need to have a concept of "active span" imo. In Node's event loop continuation-local-storage is super slow to propagate an active span, and it is so much easier to e.g. just pass the span around as the last parameter to functions. |
@felixfbecker without an active span getter, it will be more cumbersome for users to combine their own instrumentation points with those built in to standard libraries (eg. web frameworks, db clients, etc), and to compose those with each other. |
@yurishkuro has the right nuance here. In-process propagation is language dependent, but for languages that share paradigms we should try to avoid unnecessary inconsistency between API implementations. For example, I would not want the ruby implementation to stray too far from the python one. @felixfbecker I think Javascript is a special case, where you do really want an ActiveSpan concept but the currently available mechanisms are insufficient/unofficial enough that it would be concerning to bake support for them into the official API. But if JS were to sort out continuation local storage in an effective manner, we would immediately want it for the reasons @dkuebric pointed out. PHP concerns me because I'm ignorant, and don't understand the various possible execution contexts and how they are exposed to the PHP runtime, and to what degree the tracing you need to do is actually mixed up with foreign function calls to C code. |
OpenTracing 1.x leaves the question of in-process context propagation entirely in the hands of the applications themselves, which substantially reduces the usefulness of the standard since different frameworks instrumented with OT do not have a way to exchange context.
Standards for in-process context propagation inevitably have to be specific to individual languages. This is more of a placeholder issue. It is also somewhat similar to #22 (possible part of it), but can be done independently.
The text was updated successfully, but these errors were encountered: