CDI context propagation #167

Open
FroMage opened this issue Jul 9, 2019 · 32 comments
Labels
enhancement New feature or request

Comments

@FroMage
Contributor

FroMage commented Jul 9, 2019

I didn't see anywhere in the spec where we define how CDI should propagate its context with CP. In particular, the current implementation of CDI context propagation turns out to have confusing semantics, where it does not capture the request context itself, but only the values currently added to it.

My opinion was that it was obvious that when we want the request context to be captured, it has to be the entire request context (which is a mutable Map), which means being able to see all modifications to it that happened before. Not just the current values defined in the request context.

So it was implemented as capturing values currently active at this point, which is wrong IMO and pretty useless, unfortunately. So perhaps we need to clarify this in the spec?

To explain why we want to propagate the entire mutable context, here's an imperative example:

// nothing in the current request context
a(); // causes lazy beans to be added to the request context
b(); // can use the added request-context beans

It's pretty obvious that the request context (like the application context) is a mutable Map that gets mutated, and "happens-before" modifications are visible to later instructions. It's a context that follows the current request.

Now here's the same application in a reactive fashion:

// nothing in the current request context
return threadContext.withContextPropagation(CompletableFuture.completedFuture(null))
 .thenAccept(v-> a()) // causes lazy beans to be added to the request context
 .thenAccept(v -> b()); // can use the added request-context beans

To any reactive user, it's pretty obvious that this should work exactly the same as the imperative case. Except with the current implementation of CDI context propagation, the request context is captured before a() adds things to it, and restored empty, so the request-scoped beans that a() created vanish from b()'s request scope.

In practice, this causes the request scope to not really be propagated in a reactive pipeline (since "happens-before" mutations are not always visible). It differs from the imperative code, and does not follow the principle of least surprise (beans vanishing from the request context and potentially coming back to it later are very hard to debug).

It also makes lazy request-scoped beans unusable in reactive applications, since to make them properly follow the request lifecycle we would have to eagerly load them all at the start of the request, which is very bad for performance.

Finally, note that this is not consistent at all with the application scope, which is propagated in its entirety, with the right "happens-before" mutations visible.

I didn't think this would be something the spec should define, since it's CDI-specific, and I thought it was pretty obvious. But perhaps we can make this generic and specify that, while context propagation is ThreadContextProvider-specific, we encourage behaviour to follow "happens-before" semantics for mutable contexts, so that reactive code behaves in the same way as imperative code.

@FroMage FroMage added the enhancement New feature or request label Jul 9, 2019
@FroMage
Contributor Author

FroMage commented Jul 9, 2019

Addendum: the practice of capturing/restoring values inside a request context (as opposed to propagating the request context in its entirety) appears to cause values to never be destroyed, which is a horrible source of bugs, with DB connections obtained more than once per request and never released.

@manovotn
Contributor

manovotn commented Jul 9, 2019

(CDI) Application context is not part of context propagation here, as that context is backed by a concurrent map and all instances are shared across the application. This has implications for users that define such beans.

In contrast, request-scoped beans are stored in a thread-local map. Current propagation is achieved by copying request-scoped instances from the currently active context to the new thread where the context is activated (without any link to the original HTTP request, by the way).

Note that contexts can be manipulated by events, users or frameworks and can end abruptly, leading to destruction of contextual instances during context deactivation and destruction.
Sharing the whole underlying storage would mean that the storage (map) needs to be thread-safe, which translates into somewhat slower performance. It would also be risky: when any thread ends its request context, it will attempt to destroy all instances despite them being referenced, and possibly still used, in other threads.

As for what you are trying to achieve (the "happens-before" sequence of events), this seems like a more general problem that could affect more contexts than just CDI. I might be missing something but...
What you effectively ask for is to capture a snapshot that is dynamic, which somewhat defeats the design that was made here. I think MP CP tried to make sure that the same context can be applied repeatedly with the same results, which is what you are seeing. E.g.:

return threadContext.withContextPropagation(CompletableFuture.completedFuture(null))

this creates the immutable snapshot, in your example with no req. bean active

.thenAccept(v-> a()) // causes lazy beans to be added to the request context

this applies the snapshot (makes sure there is req. context active but with no beans) and runs something on it

.thenAccept(v -> b()); // can use the added request-context beans

Here you are running on the same snapshot, i.e. it is "reset" to whatever state was captured in the snapshot. If you want to preserve the state, you would need to capture a new snapshot after a() was invoked and run on top of that one instead.

@FroMage
Contributor Author

FroMage commented Jul 9, 2019

Well, remember CP was created to alleviate the lack of a pipeline-local (the equivalent of a thread-local for reactive pipelines), so taking a CDI snapshot rather than propagating the entire request scope makes it completely unusable. Again, please consider the actual use-cases rather than theoretical aspects around concurrent execution, which is not what we're trying to solve here.

@manovotn
Contributor

manovotn commented Jul 9, 2019

That would mean we were wrong from the start, because even the javadoc of ThreadContextSnapshot says "An immutable snapshot of a particular type of thread context."

@manovotn
Contributor

manovotn commented Jul 9, 2019

The snapshot being immutable makes a lot of sense IMO.
But what you propose could be implemented either via some config or via some different API where basically any subsequent 'action' would capture a snapshot from the previous step.

I don't really see this as a fault of CDI context propagation specifically. Like I said, other contexts are likely to suffer from problems similar to the ones you described here for CDI, simply due to the immutable nature of snapshots.

@njr-11
Contributor

njr-11 commented Jul 9, 2019

The intent of documenting ThreadContextSnapshot instances as immutable is reflected in the two sentences after it in the JavaDoc,

 * <p>The captured context represented by this snapshot can be applied to
 * any number of threads, including concurrently.</p>
 *
 * <p>Any state that is associated with context applied to a thread should
 * be kept, not within the snapshot, but within the distinct
 * <code>ThreadContextController</code> instance it creates each time it is applied
 * to a thread.</p>

If we consider what would happen if a ThreadContextSnapshot were to instead store state (about how to restore a previous context) and then the snapshot was applied to two threads at once, that information would be overwritten and it would become impossible to properly restore one of the threads to its previous state after the task/action completes. This was the original reason for the language stating immutable.
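
To make that split concrete, here is a minimal sketch of a provider-side snapshot that keeps the restore state in the controller rather than in the snapshot. MyContext and its swapCurrent method are hypothetical stand-ins for whatever thread-local state a provider manages (not Weld or spec API), and the imports assume the released org.eclipse.microprofile.context.spi package names:

import org.eclipse.microprofile.context.spi.ThreadContextController;
import org.eclipse.microprofile.context.spi.ThreadContextSnapshot;

// Hedged sketch: the snapshot holds only the immutably captured value; the
// "what do I restore afterwards" state lives in the controller created by
// begin(), so the same snapshot can be applied on many threads at once.
public class MyContextSnapshot implements ThreadContextSnapshot {

    private final MyContext captured; // fixed at capture time, never mutated

    public MyContextSnapshot(MyContext captured) {
        this.captured = captured;
    }

    @Override
    public ThreadContextController begin() {
        // remember what was on *this* thread so it can be restored when the
        // task/action completes; this state is scoped to each application of the snapshot
        MyContext previous = MyContext.swapCurrent(captured);
        return () -> MyContext.swapCurrent(previous);
    }
}

// Hypothetical thread-local context used only for this sketch.
class MyContext {
    private static final ThreadLocal<MyContext> CURRENT = new ThreadLocal<>();

    static MyContext swapCurrent(MyContext next) {
        MyContext previous = CURRENT.get();
        CURRENT.set(next);
        return previous;
    }
}

Because the captured value never changes and the previous value is scoped to each begin() call, two threads applying the same snapshot concurrently cannot clobber each other's restore information.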

That said, it is still quite valuable for most context types to have the context itself behave as immutable. A good example is Security context. When you create a completion stage via MP Context Propagation with Security context propagation enabled, you are intending for the action to run as the user (USER1) that is currently on the thread. If you happen to log in a different user (USER2) after that point (or within a prior stage to which you also propagated context which runs first), you still want the stage that you created under USER1 to run as USER1, not a different user.

Here's an example of the above,

// running as USER1
stage2 = stage1.thenRun(() -> { ... do stuff, then log in as USER2 & do more stuff });
stage3 = stage2.thenRun(() -> { /* this better be running as USER1 */ });

There are other context types, however, such as Transaction that could be viewed as semi-immutable. Transaction context can be considered immutable in that the snapshot is of a particular transaction ID which never changes. It is that exact transaction id which is re-established on the thread where the completion stage action runs. The transaction's state, however, can vary, and individual stages have the ability to mark it rollback-only or outright commit it or roll it back, and it is up to the application developer to ensure to do the right thing and coordinate between stages. This sort of context propagation might be thought of as an immutable pointer to the actual transaction, whose state can change.

If I'm understanding the debate here, it's about which model CDI context propagation ought to follow.

@FroMage
Contributor Author

FroMage commented Jul 11, 2019

Right. Spot on with the Transaction context.

The fundamental property of the CDI Request Context, like its Application Context, is that it's a mutable map, where lazy values can appear, and which must be destroyed at the end of its lifecycle. Those two contexts only differ in the span of their lifecycle, who can view them (everyone vs a single request), and how they are implemented (but that's irrelevant to their behaviour).

At the moment, the CDI ThreadContextProvider uses snapshots of values to capture/restore not the request context, but its current state, which breaks lazy values and destroy semantics (there's never an end for them since they can be captured). I don't think this is the right way to do it, and it breaks so many things and semantics that it's hard to argue it should be.

Perhaps it's indeed the snapshot immutability text that needs to be changed to explain those two models?

@njr-11
Contributor

njr-11 commented Jul 11, 2019

I was reading through what we currently have documented for CDI context, as well as experimenting with our current implementation (based on Weld) to see how it behaves in this area.

The only discussion of the meaning of CDI context appears on the JavaDoc for the CDI constant on ThreadContext,

     * CDI context propagation includes request, session and conversation contexts.
     * When CDI context is propagated, all of the above mentioned contexts that are
     * currently active will be available to the contextualized task with preserved 
     * state. <pre><code>
    ManagedExecutor exec = ManagedExecutor.builder()
        .propagated(ThreadContext.CDI, ThreadContext.APPLICATION)
        .build();
    
    &#64;Inject
    MyRequestScopedBean requestBean;
    
    &#64;GET
    public void foo() {
        exec.supplyAsync(() -&gt; {
            String state = reqBean.getState();
            // do some work with the request state
        }).thenApply(more -&gt; {
            // request state also available in future stages
        });
    }

It uses the somewhat ambiguous wording "preserved state" rather than a more precise term like "snapshot of state as of the point in time when context was captured" or "shared state across all operations participating in the request". The comments in the code example that follow do seem to hint at preserving state across multiple completion stage actions, but aren't explicit enough. There is sufficient ambiguity for an implementation to interpret it either way, and we should definitely clarify this section of doc as well as the language around snapshots.

Our immediate concern should be around whether our implementations are doing the best thing for users.

When I look into the current behavior of our Weld-based implementation of CDI context, I'm seeing similar results to what @FroMage was pointing out. In two separate requests I tried the following,

Request 1:

        requestBean.getState();
        CompletableFuture<Void> cf = appCDIExecutor.runAsync(() -> {
            requestBean.setState(requestBean.getState() + ",1");
        }).thenRunAsync(() -> {
            requestBean.setState(requestBean.getState() + ",2");
        }).thenRunAsync(() -> {
            requestBean.setState(requestBean.getState() + ",3");
        });
        cf.join();

        System.out.println(requestBean.getState()); // prints DEFAULT_VALUE,1,2,3

Request 2

        CompletableFuture<Void> cf = appCDIExecutor.runAsync(() -> {
            requestBean.setState(requestBean.getState() + ",A");
        }).thenRunAsync(() -> {
            requestBean.setState(requestBean.getState() + ",B");
        }).thenRunAsync(() -> {
            requestBean.setState(requestBean.getState() + ",C");
        });
        cf.join();

        System.out.println(requestBean.getState());

The second request (bean not used first) is outputting "DEFAULT_VALUE" rather than "DEFAULT_VALUE,A,B,C".

In my opinion the behavior of request 2 above is counterintuitive, is a less useful behavior, and likely to make writing reactive applications more difficult. @manovotn do you know if it would be possible with Weld to achieve the opposite behavior and directly access the request from the other threads rather than operating on a copy of it? Are there other side-effects here that I'm not thinking of? Or other reasons why it would be undesirable to switch it?

@Emily-Jiang
Member

What @manovotn said is absolutely true from the CDI point of view (immutable copy propagation). For context propagation, a copy was made rather than sharing. I do see that what @FroMage and @njr-11 are trying to demonstrate looks less useful, as the chain reverts back to the beginning.

I am wondering whether Weld can be updated to propagate the context in this fashion:

Now: for 1->2->3, it seems to do 1->2 and then 1->3.
New: for 1->2->3, can we make it 1->2, then (with a new request context created) 2->3 instead?
What do you think, @manovotn?

@manovotn
Contributor

@njr-11 directly accessing the request context won't work because it operates on thread locals. Also, the original task is backed by the actual HTTP request, which serves as storage for beans (and we have a local cache to speed things up for repeated queries); any subsequent propagation means you are on a thread that no longer has access to the HTTP request and has no way of linking back to it. In practice this means we cannot update the actual storage even if we wanted to.

So how it works:
Original thread -> uses the HTTP request as storage + a cache on top
Any thread after propagation -> a new request context is created, with a plain Map as storage and no links to the HTTP request

It goes along the same lines for the session and conversation scopes, just with the HTTP session instead.
Any change you make in other threads won't be able to update the HTTP request/session, hence leading to inconsistent stages. Aside from that, sharing the whole maps presents the danger of the map being cleared from one thread while others still use it, because the request ended there for whatever reason.

Why don't we instead discuss having ThreadContext capture a new context after each 'action' and use it for the following one? That seems to be what you are after and would work even with the current immutable design, while still allowing immutable snapshots to be applied repeatedly if we decide to allow that.
E.g. this would achieve what @FroMage and @njr-11 both mentioned, right?
return threadContext.withContextPropagation(CompletableFuture.completedFuture(null)) //captures state 1
.thenAccept(v-> a()) // runs with state 1 and at the end captures state 2
.thenAccept(v -> b()); // runs with state 2 and at the end captures state 3

I think both our impls already override the futures and can manipulate them in some way?

@Emily-Jiang
Member

@manovotn I think propagating the context after each action meets my expectation as well, since we would be fixing the problem by relaying the context forward instead of going back to the starting point.

@manovotn
Contributor

Note that what I suggested above doesn't solve the problem with things like @PreDestroy callbacks, which currently 'don't work' with Weld (or ArC in Quarkus). Only the initiating thread can invoke them upon context destruction. All subsequent/parallel threads then only deactivate contexts but never destroy them (so they never invoke @PreDestroy), as that would lead to awkward situations and inconsistent stages (a @PreDestroy callback invoked on an instance that you can still use in another thread).

I am mentioning this because it is what @FroMage originally used when coming up with this issue.
Even if propagation works the way that was suggested here, the problem with @PreDestroy would still exist.

@njr-11
Contributor

njr-11 commented Jul 12, 2019

Thanks for the replies - I was worried about the possibility of additional complications.

Unfortunately, the approach of re-capturing context between stages isn't workable because the completion stage programming model allows for stages to depend upon multiple other stages. The linear model where each stage follows exactly one stage is an oversimplification. For example, in a scenario where an actionD depends on two prior stages,

stageB = stageA.thenRunAsync(actionB);
stageC = stageA.thenRunAsync(actionC);

It would be wrong for

stageB.runAfterBoth(stageC, actionD);

to apply the CDI context for actionD differently than,

stageC.runAfterBoth(stageB, actionD);

We cannot have the former re-capturing CDI context from actionB and the latter from actionC.

@FroMage
Contributor Author

FroMage commented Jul 15, 2019

Yes, I'm not sure how we can turn an implementation of CDI scope context propagation from an immutable context copy to a shared mutable context without just implementing context propagation as a shared mutable context.

As far as the spec is concerned, we could clarify that both styles of context propagation are possible (mutable or immutable).

As far as the implementation of CDI context propagation, the issue is that the current behaviour does not work as expected, because:

  • you can have multiple versions of the same request bean loaded during a request
  • none of those are destroyed (their @PreDestroy lifecycle method is never called)

Because of that, we're going to change the implementation of ArC context propagation to propagate the request context in its entirety, as it mutates, and tell users not to attempt to modify it in parallel, or else to take care of synchronisation themselves.

We can make the request context itself thread-safe (just as the application context is thread-safe), but as the CDI team pointed out, this won't magically make user request beans thread-safe themselves, whereas user application beans are traditionally written to be thread-safe.

So it's not enough to make the request context thread-safe, but it at least allows users to access it from parallel threads if they want to (which we don't encourage).

Now, as far as the Weld context propagation is concerned, I'm not sure what's best. I don't think the current behaviour works well in either parallel or reactive settings, so I'm not sure it should stay as-is. But that's not necessarily a spec issue, unless we start specifying what context providers should do, which I'm not sure is a good idea.

@aguibert
Contributor

From a user perspective I am in favor of the behavior where we snapshot the pointer to the context rather than snapshotting the CDI contextual instances that have been activated at the time the snapshot is taken. Behaviorally, this code example @njr-11 posted earlier:

        CompletableFuture<Void> cf = appCDIExecutor.runAsync(() -> {
            requestBean.setState(requestBean.getState() + ",A");
        }).thenRunAsync(() -> {
            requestBean.setState(requestBean.getState() + ",B");
        }).thenRunAsync(() -> {
            requestBean.setState(requestBean.getState() + ",C");
        });
        cf.join();

I think it should give the result: DEFAULT_VALUE,A,B,C


From an implementation side, I don't think this is possible given the current Weld APIs. In order for this to work, I think we would need an API that can set/get the entire BoundBeanStore (the object containing the collection of ContextualInstances). This could be as simple as making AbstractBoundContext.getBeanStore() and AbstractBoundContext.setBeanStore(BeanStore) public methods.

I was able to hack into the set/getBeanStore methods via setAccessible(true) and the above test case gave us the desired behavior. However, I'm not sure if this will have any undesirable side-effects. CC @manovotn

@manovotn
Contributor

The current APIs in Weld simply don't allow that, as they weren't designed that way in the first place.
We probably misunderstood each other's expectations there.

As for the approach @aguibert tried, that's a step closer, here are some thoughts from the top of my head:

  • Operating on the Bound context and storages only covers some cases; for instance, a request context activated through the interceptor (CDI spec link) will be unbound
    • Not sure we care about this scenario here, I'd say we don't really
  • What "bound" really means is that it is usually linked to the actual HTTP request/session, which operates as storage; sharing that means you allow potential parallel access to it, and I have no idea if that's safe?
  • Bean stores are caches which hold maps of beans for easy access and then write-through to the HTTP request - all these maps would have to be ConcurrentHashMap at all times which brings overhead (most apps don't do or care for context propagation at all)
    • Even with that you aren't really thread-safe though; that wouldn't matter for sequential execution like Stephane brought up, but it sure does for parallel execution if you, for instance, attempt to put a bean into storage from two threads

To be fair, I am not sure how to adapt Weld to this. There are tons of complexities and optimizations and several layers of caching. And MP CP not being clear on what use cases are supported doesn't help either. I can see the sequential scenario (the reactive use case Stephane brought up here) as sensible and doable at some cost. But other use cases seem too crazy and dangerous if we were to share the whole storage for propagation...

@njr-11
Contributor

njr-11 commented Jul 16, 2019

* Bean stores are caches which hold maps of beans for easy access and then write-through to the HTTP request - all these maps would have to be `ConcurrentHashMap` at all times which brings overhead (most apps don't do or care for context propagation at all)

It's certainly understandable that we would want to avoid the overhead when context propagation isn't used. What about using the request for context capture (which is always synchronous) to lazily replace that particular request scope's HashMap with a ConcurrentHashMap?

* Even with that you aren't really thread-safe though; that wouldn't matter for sequential execution like Stephane brought up but it sure does for parallel if you for instance attempt to put bean into storage from two threads

If I understand correctly, you mean two threads operating on the same request context both lazily access a bean for the first time, causing both to create a new instance; one overwrites the other's attempt to store it, and both continue using their own copy unaware of the other. Could this be addressed by replacing the map.get / check null / map.put(newInstance) sequence with map.putIfAbsent(key, newInstance)? Or with map.computeIfAbsent(key, functionToCreateNewInstance) if creating two instances and discarding one is itself a problem? Both of these operations are available on the standard Map API as of Java 8 and would work regardless of whether the underlying map is concurrent or not (a rough sketch of both options follows below).

Would this be doable? I'm not familiar with the CDI impl, so I don't know how pervasive these changes would be.
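
For illustration, here is a rough sketch of the two alternatives, using a plain map keyed by a bean identifier as a stand-in for the request-scope bean store; createInstance stands in for bean instantiation (including @PostConstruct) and none of this is actual Weld code:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hedged sketch, not Weld code.
public class LazyBeanStoreSketch {

    private final Map<String, Object> store = new ConcurrentHashMap<>();

    // Option 1: create eagerly, publish atomically. Only one instance ends up
    // in the store, but the losing thread has already created (and run
    // @PostConstruct on) a second instance that it then has to discard.
    public Object getOrCreateWithPutIfAbsent(String beanId, Function<String, Object> createInstance) {
        Object created = createInstance.apply(beanId);
        Object existing = store.putIfAbsent(beanId, created);
        return existing != null ? existing : created;
    }

    // Option 2: create inside the map operation, so only one instance is ever
    // created. Caveat raised below: if creation recursively needs to store
    // another bean in the same map, ConcurrentHashMap.computeIfAbsent is not
    // reentrant.
    public Object getOrCreateWithComputeIfAbsent(String beanId, Function<String, Object> createInstance) {
        return store.computeIfAbsent(beanId, createInstance);
    }
}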

@manovotn
Contributor

a new instance and one overwrites the other's attempt to store and both continue using their own copy unaware of the other

Eventually the map would have the correct one, but you have already created two beans which might communicate with other beans (due to dependencies) and which might have @PostConstruct callbacks that have already been triggered twice.

map.putIfAbsent(key, newInstance)

That was my first thought as well but unfortunately, that won't do it. You have still already created two instances. And if you try to use computeIfAbsent() then there is the problem that creating a bean might trigger creation of its dependencies (another request-scoped bean) which will, yet again, need to be stored. And ConcurrentHashMap isn't reentrant.

@aguibert
Contributor

Today on the weekly hangout @manovotn, @njr-11, and I discussed a possible solution for this problem that combines a few ideas:

  • Currently we store all ContextualInstances that have been registered at the point the context snapshot is taken. This works, but it does not account for more ContextualInstances being lazily registered as future stages execute.
  • One idea was to eagerly discover all ContextualInstances and register them when a snapshot is taken. This could lead to server performance issues because it would trigger all beans in the app to be initialized and have their @PostConstruct methods called (which could be very expensive)

Proposed solution

Eagerly register handles to all ContextualInstances (e.g. Future<ContextualInstance>) so that initialization/@PostConstruct is not triggered for all beans at the time of snapshot. If new beans are accessed during subsequent stages, the handles would be populated. Beans that had stored handles but were never accessed would not cause significant performance overhead.
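
A very rough sketch of what such a handle might look like; the name ContextualInstanceHandle and the supplier wiring are illustrative only, not Weld API:

import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hedged sketch: the snapshot registers one holder per bean instead of the
// instance itself, so nothing is created (and no @PostConstruct runs) until a
// propagated stage actually touches the bean.
public class ContextualInstanceHandle<T> {

    private final AtomicReference<T> instance = new AtomicReference<>();
    private final Supplier<T> creator; // creates the bean and runs @PostConstruct on demand

    public ContextualInstanceHandle(Supplier<T> creator) {
        this.creator = creator;
    }

    public T get() {
        T existing = instance.get();
        if (existing != null) {
            return existing;
        }
        T created = creator.get();
        // first publisher wins; a losing thread would still have created an
        // extra instance, which is the same race discussed elsewhere in this thread
        return instance.compareAndSet(null, created) ? created : instance.get();
    }
}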

@manovotn
Contributor

Reading over this again, here are a few things coming to my mind:

  1. Assuming we had this solution, it would still not solve the issue with @PreDestroy not working correctly, right?
    1.1 Any thread into which you propagate instances still lacks the information about the HTTP request, so you would need to invoke @PreDestroy when the original thread ends the request, which may or may not be correct
  2. The lazy value holder would need to be a thread-safe construct that we pass around, but the population of the map with them won't need to be, as that can be done eagerly on the "original thread"
    2.1 As a side note, Weld already has a LazyValueHolder construct that could(?) be reused, as it has a thread-safe get()
  3. This approach has the overhead of creating tons of objects with no actual use, especially when no context propagation happens
    3.1 E.g. for an application with a possible maximum of 100 request beans, you will have to have 100 lazy value holders despite the fact that in a given request the user might need just 0-3 beans from that context
    3.2 We probably couldn't reuse holders in between requests, so that's even worse if many requests keep coming in - the scaling of this solution will be bad
  4. Implementing this would require not only changing the storages to account for lazy values but also some changes to how generated proxies get hold of actual instances
  5. Also, one more thing regarding storing beans as HTTP request params; I found some bits and pieces in the JSF 2.3 spec (chapters 5.9.2, 5.6.3 and 6.1.4) - you are able to access the map of beans either via EL or through the external context (externalContext.getRequestMap()).
    5.1 These are stored just in the form of an ID (naming scheme and bean ID) to Object map; this approach would then store lazy value holders there as well, leading to many more attribute set/remove calls for every single request
    5.2 I am trying to poke around in the specs to figure out whether we are in any way obliged to store contextual instances directly there, but so far I haven't found anything like that (in fact CDI doesn't even say you should do that, it was just designed that way) - I don't think this should be changed, I am just exploring whether we must store them in some particular format (plain instance versus instance in a "holder")

@FroMage
Contributor Author

FroMage commented Jul 19, 2019

1.1 Any thread into which you propagate instances still lacks the information about the HTTP request, so you would need to invoke @PreDestroy when the original thread ends the request, which may or may not be correct

Only the creator of the "request" can know when it ends. In the case of HTTP requests, only the container knows when it will end (sync or async), and only it can terminate it and notify of the ability to destroy things attached to it. It also has that responsibility: it must do it.

So trust that it does and stop worrying about it ;)

Also, any thread still using the request when it has been terminated should expect bad consequences. This is not something that should be supported.

@mkouba

mkouba commented Jul 19, 2019

CRAZY IDEA: for the reactive use case we could also COPY the contextual instances from one stage to the next one (i.e. something like 1->2, 2->3, 3->4 mentioned above) AND also propagate a consumer of newly created bean instances so that WHEN the original request context should be destroyed we could destroy the original context + iterate over all newly created beans and destroy them appropriately.

In theory, this should be much easier to implement.
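
As a minimal sketch of that bookkeeping (all names illustrative, not Weld API): each propagated stage reports the instances it creates to a shared registry, and the thread owning the original request drains the registry when the request context is destroyed.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Hedged sketch, not Weld API: propagated stages call registrar() for every
// bean instance they create; the request-owning thread destroys them all
// (invoking @PreDestroy) when the original request context is destroyed.
public class PropagatedInstanceRegistry<T> {

    private final Queue<T> createdInPropagatedStages = new ConcurrentLinkedQueue<>();

    // handed to each propagated stage along with the copied instances
    public Consumer<T> registrar() {
        return createdInPropagatedStages::add;
    }

    // called by the original thread when the request context is destroyed
    public void destroyAll(Consumer<T> destroyWithPreDestroy) {
        T instance;
        while ((instance = createdInPropagatedStages.poll()) != null) {
            destroyWithPreDestroy.accept(instance);
        }
    }
}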

@aguibert
Contributor

@mkouba I like that idea, and it seems implementable without requiring any changes to Weld, which is nice

@njr-11
Contributor

njr-11 commented Jul 22, 2019

COPY the contextual instances from one stage to the next one (i.e. something like 1->2, 2->3, 3->4 mentioned above)

This was discussed a bit earlier. The problem is that the CompletionStage programming model allows more than just a linear progression. One completion stage can depend upon the completion of a couple of different stages. For example, you might have 1->2; 1->3; (2,3)->4; 1->5; (4,5)->6

I'm not too concerned about the destroy aspects of this and agree with the comment from @FroMage about it being a user error if they arrange stages that might run after the scope has ended.

@manovotn
Contributor

Yes, Martin stated that this would be for the 'reactive' approach, which is the sequential one (or linear, if you will).

@njr-11
Contributor

njr-11 commented Jul 23, 2019

Yes, Martin stated that this would be for the 'reactive' approach, which is the sequential one (or linear, if you will).

The approach where the completion of one stage depends on the completion of a couple of other stages is every bit as reactive. I am not in favor of a solution that only works properly for a favored subset of reactive patterns when the programming model itself (CompletionStage) provides and promotes other patterns that would be busted by that solution.

@njr-11
Contributor

njr-11 commented Jul 23, 2019

Given that there doesn't appear to be any solution to this in the near term, I'm thinking of taking the following approach for our product to prevent users from getting into unexpected situations: continue using the provided Weld interfaces as-is, but detect when a bean is accessed for the first time within a request by a completion stage action, and fail the stage with an error indicating to the user that the bean must be accessed prior to capturing the context/creating the completion stage.

This will hopefully keep applications from writing code that gets them into trouble with CDI context propagation, while allowing for a better solution in the future, which we can continue working on under this issue.

@manovotn
Contributor

but detect when a bean is accessed for the first time within a request by a completion stage action, and fail the stage with an error indicating to the user that the bean must be accessed prior to capturing the context/creating the completion stage.

That's every bit as user-hostile as leaving it in its current state. Users don't necessarily know all the beans they use and their dependencies through and through (they may come from a 3rd party lib, for instance). They may not even have the ability to access some of them directly because of that.

With that mindset we might as well just do the eager creation approach and say you shouldn't rely on @PostConstruct or initializer methods. You would get sensible propagation (since you would have a complete set of beans passed around), though at the cost of limitations and some overhead on instance creation, which may or may not be huge depending on how many beans you have there and how bestial they are.

@njr-11
Contributor

njr-11 commented Jul 23, 2019

It may just be a difference in philosophies. Although the user does not like to experience limitations or failures, I believe we are doing more of a service to users in alerting them upfront and even limiting function in order to prevent the possibility of data inconsistency that can happen without their awareness.

I like the eager creation approach and the associated idea with storing references that Andy proposed, and I see the approach of raising errors in the near term as a step toward being able to adopt the eager creation/references approach in the long term. I get the impression that it will require some in-depth changes in Weld. Being able to do something more limited in the near term (detect & raise an error about inconsistent data), while remaining fully compatible with future direction seems like a good strategy to take so that we can get GA compatible implementations out there for users to start taking advantage of for the scenarios that are currently stable/possible.

@beikov

beikov commented Jul 31, 2019

I wanted to share my use case and implementation strategy as I have attempted to implement this myself some time ago for parallel data loading.

I usually have named CDI producer methods that represent some contextual data, e.g. the currently logged-in user, or a list of data in the language of the user.

@Named("currentUser")
@Produces
@RequestScoped
public User getUser(@Named("currentUserId") Integer userId) {
  return entityManager.find(User.class, userId);
}

@Named("selectableLanguages")
@Produces
@RequestScoped
public List<Language> getUser(@Named("currentUser") User user) {
  return Language.getLanguages(user.getLocale());
}

I enjoy these beans being lazily loaded, as I only do heavy processing when a bean is actually used, and at the same time I avoid doing the heavy processing multiple times, since the beans are only constructed once and then shared.

The only downside to this approach is that the first time a bean is accessed, e.g. in the UI, the rendering is blocked waiting for the heavy processing to finish. So I thought that, since I usually know most of the beans that are going to be needed, I could try to pre-load these beans in parallel before actually starting rendering. This is when I started experimenting with context propagation.

I implemented something that collected all bean names necessary for a request and, prior to executing the rendering, loaded all the beans in parallel, causing data to be populated in the bean store so that the rendering part could just take the data without having to wait any further.

I wanted to contribute the context propagation parts as an extension to DeltaSpike, but unfortunately it wasn't accepted, probably due to the hacks it used to get it working:
beikov/deltaspike@3a15d6c

The basic idea is to capture context references that can be attached to/detached from the current thread, such that the original bean store is mutated. I think this implements what @FroMage is expecting, which is also what I am expecting.
I'm not 100% sure I solved the thread-safety issues, but I tried really hard, using a BoundBeanStore wrapper that used a lock for the request context: beikov/deltaspike@3a15d6c#diff-a4a98500d63ce220b5e1d71004fcbff0R92

I hope my use case and DeltaSpike implementation helps you and I would really like to see context propagation done with a shared mutable bean store to enable the mentioned use cases.

@manovotn
Contributor

@beikov thanks for sharing the use case and the original DS contribution.
Glancing at the solution, I don't think it would be thread-safe, because while you synchronize the access, it doesn't solve the problem of two threads creating the contextual bean instance (triggering @PostConstruct callbacks) and then both queuing up for storage, one of them uselessly. The BeanStore itself does offer some locking; however, that is only used for the Application context ATM, as the request/session/conversation contexts weren't designed to be thread-safe.

Also note that BoundBeanStore is actually just a cache. Down the line the beans are stored in the HTTP request/session. You can see that, for instance, if you browse a bit around RequestBeanStore or, more precisely, AttributeBeanStore.

Furthermore it would need some way (config key?) to switch behaviour so that users who don't care about context propagation won't have to run on thread safe structures and concurrent collections which are generally slower than their non-thread-safe counterparts.

If you are brave enough and have some time on your hands, you can try and compose something for Weld. PRs are most welcome ;-)

@beikov

beikov commented Jul 31, 2019

Good to know that this doesn't solve the issue fully. My use case fortunately isn't affected by that as it doesn't initialize request scoped beans concurrently.

Furthermore it would need some way (config key?) to switch behaviour so that users who don't care about context propagation won't have to run on thread safe structures and concurrent collections which are generally slower than their non-thread-safe counterparts.

Up until scheduling the first runnable, the bean storage can stay unsynchronized though. You will only have to replace the bean storage with a synchronized version later and so AFAICS you can avoid the cost for users that don't require the request scope to be propagated or mutable.
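
A small sketch of that lazy switch, assuming capture always happens synchronously on the request thread before any other thread can see the store (names illustrative, not Weld API):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: stay on a plain HashMap until the first context capture, then
// swap in a synchronized view so only requests that actually propagate context
// pay the cost.
public class LazilySharedBeanStore {

    private Map<String, Object> store = new HashMap<>();
    private boolean shared;

    // runs on the request thread as part of capturing a snapshot, before any
    // other thread has obtained a reference to the store
    public Map<String, Object> shareForPropagation() {
        if (!shared) {
            store = Collections.synchronizedMap(store);
            shared = true;
        }
        return store;
    }

    public Map<String, Object> current() {
        return store;
    }
}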

If you are brave enough and have some time on your hands, you can try and compose something for Weld. PRs are most welcome ;-)

I would be brave enough, but I'm lacking time currently. I still hope my use case and experiment encourage someone to work on this.
