don't use safe caller for event delegation to subscribers #861
Merged
AFAIR, the safeCaller does not use a single thread but a pool of threads. Is it safe to reduce this to a single thread, or might we run into other issues?
I also remember a feature where we assigned events to bindings and made sure that if one binding exhausts all threads of a pool, it does not negatively impact the others. I cannot find this in the code right now, so I assume it was not part of the internal event handling itself.
Events should be handled in order.
Using multiple threads for the event handling can result in a different event order.
Assume changed events "1->2", "2->3".
If you are using different threads, how do you ensure that the submitted executions do not change order because one thread gets ahead of the other?
E.g. you submit the event handling for "1->2" to the executor, and after that you submit the event handling for "2->3".
The executor uses thread1 for "1->2" and thread2 for "2->3".
Now, sometimes thread1 runs before thread2, other times thread2 runs before thread1 (or they run at the same time on different CPUs, or execution switches between them, etc.).
The event order would no longer be deterministic.
So, regardless of how it is done, it should be done in order.
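The ordering argument can be illustrated with a small sketch (names are illustrative, not openHAB code): a single-threaded executor processes submitted events strictly in submission order, whereas a multi-threaded pool gives no such guarantee.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OrderedDispatch {
    // Dispatches the given events on a single-threaded executor and
    // returns them in the order in which they were actually handled.
    static List<String> dispatch(List<String> events) throws InterruptedException {
        List<String> delivered = new CopyOnWriteArrayList<>();
        // One worker thread: tasks run strictly in submission order.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        for (String event : events) {
            executor.submit(() -> delivered.add(event));
        }
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        return delivered;
    }

    public static void main(String[] args) throws InterruptedException {
        // Handling order matches submission order: [1->2, 2->3]
        System.out.println(dispatch(List.of("1->2", "2->3")));
    }
}
```

With `Executors.newFixedThreadPool(2)` instead, "2->3" could be handled before "1->2", which is exactly the non-determinism described above.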
IIRC this was also discussed some years ago.
As written above, we could interrupt the event subscriber if the limit has been exceeded.
But if you realize that an add-on consumes too many resources, it would probably be better to fix that add-on (and notice the bad behaviour).
Just to put one more data point out there...
On my test system (a ~9-year-old AMD CPU with 6 cores, each running at 3.7 GHz) I max out one core at ~90 events per second. I produce this load with 5 systeminfo things set to update all channels (except disk, because I don't know the performance penalty of that function) at a refresh rate of interval_high=1, interval_medium=4.
The top CPU-consuming threads are:
Overall, openHAB is consuming a little less than 2 CPUs' worth of load.
@maggu2810 We never had the requirement to respect a specific order, and AFAIR older OSGi specs didn't ask for ordered delivery either. I remember discussions where we found out that Equinox used single-threaded delivery while Felix used a thread pool instead.
It seems that OSGi 7 has now introduced a specific setting for defining whether ordering is required or not.
Sure, but the idea was that its effect is constrained to its own functionality and does not cause others to misbehave at the same time. I remember situations where a binding took too long consuming events, which resulted in late delivery of events to other bindings; those then showed very odd behavior, and it was difficult to figure out which binding was the culprit. Much better if only the binding that actually has the bug behaves badly.
IMO, this PR is a dramatic improvement over the current implementation. If there are no issues other than the inability to scale beyond the rates I've outlined above, can't this be merged? It resolves a very critical issue in the current snapshots.
If there's a desire to change the implementation to scale better, can't that be done in another PR?
I had a look at the code, and the changes in this PR don't impact how commands are sent to ThingHandlers using the SafeCaller:
openhab-core/bundles/org.openhab.core.thing/src/main/java/org/eclipse/smarthome/core/thing/internal/profiles/ProfileCallbackImpl.java
Lines 71 to 75 in e875826
So commands will still be handled asynchronously, which was my main concern. Commands will thus continue to be handled even if a couple of ThingHandlers hijack threads, e.g. due to network timeouts or when waiting on locks.
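The "safe caller" idea referenced here can be sketched roughly as follows: invoke a handler on a separate thread and watch it against a timeout, so a hanging handler cannot block the caller. This is a conceptual illustration under that assumption, not the actual openHAB SafeCaller API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SafeCallSketch {
    private static final ExecutorService POOL = Executors.newCachedThreadPool();

    // Runs the task asynchronously; a watcher task logs if the call
    // exceeds the given timeout instead of blocking the caller.
    static Future<?> callAsync(Runnable task, long timeoutMs) {
        Future<?> future = POOL.submit(task);
        POOL.submit(() -> {
            try {
                future.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                System.err.println("Handler exceeded " + timeoutMs + " ms");
            } catch (Exception ignored) {
                // interrupted or handler threw; nothing to report here
            }
        });
        return future;
    }

    public static void main(String[] args) throws Exception {
        Future<?> f = callAsync(() -> System.out.println("command handled"), 1000);
        f.get();
        POOL.shutdown();
    }
}
```

Because the handler runs on its own pool thread, a handler stuck on a network timeout or a lock only ties up that thread, while the caller returns immediately.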
FTR, this is how I generated the event load:
https://github.com/mhilbush/openhab2-addons/tree/load-generator-binding/bundles/org.openhab.binding.loadgenerator
I was just about to ask @mhilbush to add some command handling to the load generator (with e.g. 300 ms processing time in the handler).
But @wborn's observation that this has actually moved to the ProfileCallback should answer it: this should no longer be related to the event dispatching itself. I consider my second concern clarified then.
Thank you all for the hard work and your patience with me!
No worries. It's a change to a critical area, so I get the need for diligence.
I wrote the load generator in a hurry, so there's room for improvement. As I plan to use it in the future, and possibly extend it, what were you looking for here?
A feature that makes the load generator not only create status updates but also commands, and then has the handleCommand method in the ThingHandler block for e.g. 300 ms to simulate some heavy work (like accessing external devices). This would better simulate a setup where users can directly observe whether there is a delay in event processing: if switching on a device suddenly takes a few seconds because the command hangs in a single-threaded queue, it is instantly visible and annoying to the user. But as mentioned above, this PR should not have an issue with this behavior, so such an addition would only be interesting for future tests in other contexts.
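A minimal sketch of the suggested load-generator addition, assuming a handler whose handleCommand blocks for a configurable delay to simulate slow device access. The class and parameter names here are hypothetical, not part of the actual load-generator binding or the openHAB ThingHandler API.

```java
// Hypothetical handler for load testing: every command takes a fixed
// amount of time, standing in for a slow call to an external device.
public class BlockingCommandHandler {
    private final long processingDelayMs;

    public BlockingCommandHandler(long processingDelayMs) {
        this.processingDelayMs = processingDelayMs;
    }

    public void handleCommand(String channel, String command) {
        try {
            // Simulate heavy work, e.g. a 300 ms network round trip.
            Thread.sleep(processingDelayMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println("Handled " + command + " on " + channel);
    }

    public static void main(String[] args) {
        new BlockingCommandHandler(300).handleCommand("switch", "ON");
    }
}
```

Driving many such commands through a single-threaded dispatcher would make any queuing delay directly observable as slow device switching, which is exactly the user-visible symptom described above.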