-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce a thread pinning mechanism #309
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
Pull Request Test Coverage Report for Build 11953648017Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love that we're finally integrating this into Celerity itself, and that you managed to do so without involving hwloc!
It would be neat to also pin threads for non-system benchmarks - AFAICT this should be as simple as keeping a Edit: thread_pinner
instance in every benchmark_context
class.benchmark_context
is intantiated for every benchmark run, so the pinner must live outside that to avoid including the pinning / thread migration itself in the measurements.
653a31a
to
749bc6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff! A couple of notes.
Also, I remember we talked about whether we should detect whether the application thread is already pinned, and not do anything in that case. Did you end up deciding against that?
7ab11fa
to
7a3293e
Compare
7a3293e
to
c09ae63
Compare
c09ae63
to
7aefc24
Compare
Check-perf-impact results: (239fbb90e2d616f85d4c17b897fc3c28) 🚀 Significant speedup (<0.80x) in some microbenchmark results: 17 individual benchmarks affected Relative execution time per category: (mean of relative medians)
|
7aefc24
to
1f6c223
Compare
All updates should now be in, including the previously discussed (but not implemented) feature which prevents the application thread from being pinned if it has already had its affinity mask modified by something else (+test). |
1f6c223
to
11a76ef
Compare
…th everything pinned
11a76ef
to
6b39952
Compare
This is a followup to #303.
As the backend executor requires low-latency communication with the submission threads, it is advantageous for these to be close in terms of cache hierarchy. On larger systems, we've observed another 50%+ performance increase due to this change in extreme situations.
In terms of interface design, the goal was to provide a very simple entry point (
pin_this_thread
), that is safe to use from any thread at any time, and does not require polluting any other modules with state related to thread pinning.Any pinning of the user thread also needs to be safe in terms of restoration, and the whole mechanism needs to work in the presence of process-level affinity masking (e.g. from MPI).
This PR also: