Introduce a thread pinning mechanism #309

Merged
merged 7 commits into from
Nov 21, 2024

PeterTh
Contributor

@PeterTh PeterTh commented Nov 18, 2024

This is a followup to #303.

The backend executor requires low-latency communication with the submission threads, so it is advantageous for these threads to be close to each other in the cache hierarchy. In extreme cases on larger systems, we've observed a further performance increase of more than 50% from this change.

In terms of interface design, the goal was to provide a very simple entry point (pin_this_thread) that is safe to use from any thread at any time and does not require polluting other modules with state related to thread pinning.
Any pinning of the user thread also needs to be safe in terms of restoration, and the whole mechanism needs to work in the presence of process-level affinity masking (e.g. from MPI).
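
To illustrate the restoration and affinity-masking requirements, here is a minimal Linux-only sketch. It is not the code from this PR: `scoped_thread_pin` and `allowed_core_count` are hypothetical names, and the approach (an RAII guard around glibc's `pthread_getaffinity_np` / `pthread_setaffinity_np`) is an assumption about one way to meet those goals. The guard only ever selects a core that the thread's current mask already permits, so a restriction imposed from outside (for example by an MPI launcher) is never widened, and the previous mask is put back when the guard goes out of scope.

```cpp
// Linux/glibc only. Build with: g++ -std=c++17 -pthread
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // required for the CPU_* macros and pthread_*affinity_np
#endif
#include <pthread.h>
#include <sched.h>

#include <cassert>
#include <optional>

// Hypothetical RAII guard (not the PR's interface): pins the calling thread to
// the n-th core allowed by its current affinity mask and restores the previous
// mask on destruction. Must be destroyed on the thread that created it.
class scoped_thread_pin {
  public:
    explicit scoped_thread_pin(const int nth_allowed_core) {
        assert(nth_allowed_core >= 0);
        cpu_set_t current;
        CPU_ZERO(&current);
        if(pthread_getaffinity_np(pthread_self(), sizeof(current), &current) != 0) return;

        // Walk only the cores the current mask allows, so an externally
        // imposed restriction is respected rather than overridden.
        int seen = 0;
        for(int core = 0; core < CPU_SETSIZE; ++core) {
            if(!CPU_ISSET(core, &current)) continue;
            if(seen++ != nth_allowed_core) continue;
            cpu_set_t target;
            CPU_ZERO(&target);
            CPU_SET(core, &target);
            if(pthread_setaffinity_np(pthread_self(), sizeof(target), &target) == 0) {
                m_previous = current; // remember what to restore
                m_core = core;
            }
            return;
        }
        // Fewer allowed cores than requested: leave the thread unpinned.
    }

    scoped_thread_pin(const scoped_thread_pin&) = delete;
    scoped_thread_pin& operator=(const scoped_thread_pin&) = delete;

    ~scoped_thread_pin() {
        if(m_previous.has_value()) pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &*m_previous);
    }

    bool is_pinned() const { return m_previous.has_value(); }
    int core() const { return m_core; }

  private:
    std::optional<cpu_set_t> m_previous;
    int m_core = -1;
};

// Number of cores the calling thread may currently run on, or -1 on error.
inline int allowed_core_count() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if(pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0) return -1;
    return CPU_COUNT(&set);
}
```

Used as `{ scoped_thread_pin pin(0); /* latency-sensitive work */ }`, the thread runs on a single core inside the braces and regains its original mask afterwards.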

This PR also:

  • Removes the ad-hoc core number check and replaces it with a better mechanism
  • Adds extensive unit testing for this tool
  • Pins the threads of non-runtime (micro-)benchmarks, to hopefully increase result consistency
  • Takes this opportunity to improve documentation: unifies the installation documentation and fills in its missing parts, and provides full configuration documentation in a separate file

@coveralls

coveralls commented Nov 18, 2024

Pull Request Test Coverage Report for Build 11953648017

Details

  • 181 of 193 (93.78%) changed or added relevant lines in 8 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.2%) to 94.802%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/affinity.cc | 41 | 42 | 97.62% |
| src/platform_specific/affinity.unix.cc | 115 | 126 | 91.27% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 11912395682: | -0.2% |
| Covered Lines: | 6898 |
| Relevant Lines: | 7011 |

💛 - Coveralls

Contributor

@fknorr fknorr left a comment

I love that we're finally integrating this into Celerity itself, and that you managed to do so without involving hwloc!

It would be neat to also pin threads for non-system benchmarks - AFAICT this should be as simple as keeping a thread_pinner instance in every benchmark_context class. Edit: benchmark_context is instantiated for every benchmark run, so the pinner must live outside of it to avoid including the pinning / thread migration itself in the measurements.

@psalz psalz added this to the 0.7.0 milestone Nov 19, 2024

Member

@psalz psalz left a comment

Good stuff! A couple of notes.

Also, I remember we discussed detecting whether the application thread is already pinned, and doing nothing in that case. Did you end up deciding against that?


Check-perf-impact results: (239fbb90e2d616f85d4c17b897fc3c28)

🚀 Significant speedup (<0.80x) in some microbenchmark results: 17 individual benchmarks affected

Relative execution time per category: (mean of relative medians)

  • command-graph : 0.98x
  • graph-nodes : 1.03x
  • grid : 0.99x
  • instruction-graph : 0.98x
  • scheduler : 0.89x 🚀
  • system : 0.82x 🚀
  • task-graph : 0.97x
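
For readers unfamiliar with the "mean of relative medians" metric, the sketch below shows one plausible way to compute it. It assumes that each benchmark's relative time is the median of the new samples divided by the median of the baseline samples, and that a category's value is the arithmetic mean of those ratios; the CI script's actual computation is not shown in this conversation, and `benchmark_samples`, `median` and `mean_of_relative_medians` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a non-empty sample set (takes a copy because it sorts).
inline double median(std::vector<double> samples) {
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    return n % 2 == 1 ? samples[n / 2] : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}

// Timings of one benchmark before (baseline) and after (candidate) a change.
struct benchmark_samples {
    std::vector<double> baseline;
    std::vector<double> candidate;
};

// Assumed definition: per-benchmark ratio of medians, averaged over all
// benchmarks of a non-empty category. Values below 1.0 mean the candidate is faster.
inline double mean_of_relative_medians(const std::vector<benchmark_samples>& category) {
    double sum = 0.0;
    for(const auto& b : category) {
        sum += median(b.candidate) / median(b.baseline);
    }
    return sum / static_cast<double>(category.size());
}
```

Under this reading, a category value of 0.82x means the benchmarks in that category took, on average, 82% of their previous median time.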

@PeterTh
Contributor Author

PeterTh commented Nov 20, 2024

All updates should now be in, including the previously discussed (but until now unimplemented) feature that prevents the application thread from being pinned if its affinity mask has already been modified by something else (plus a test).
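
The "do not pin if something else already changed the mask" behaviour can be sketched as follows. This is an illustration under assumptions, not the merged implementation: it supposes that a baseline mask was captured earlier (for example at startup) and treats any difference between that baseline and the thread's current mask as an external modification. The names `get_this_thread_mask`, `pin_unless_modified` and `pin_result` are hypothetical.

```cpp
// Linux/glibc only. Build with: g++ -std=c++17 -pthread
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // required for the CPU_* macros and pthread_*affinity_np
#endif
#include <pthread.h>
#include <sched.h>

// Reads the calling thread's affinity mask; returns false on failure.
inline bool get_this_thread_mask(cpu_set_t& out) {
    CPU_ZERO(&out);
    return pthread_getaffinity_np(pthread_self(), sizeof(out), &out) == 0;
}

enum class pin_result { pinned, skipped_mask_modified, failed };

// Pins the calling thread to `core` unless its mask no longer matches
// `baseline`, i.e. unless someone else has already changed its affinity.
// `baseline` is assumed to have been captured by the caller beforehand.
inline pin_result pin_unless_modified(const cpu_set_t& baseline, const int core) {
    cpu_set_t current;
    if(!get_this_thread_mask(current)) return pin_result::failed;

    // A differing mask means the user or another library made a choice
    // already; respect it and do nothing.
    if(!CPU_EQUAL(&baseline, &current)) return pin_result::skipped_mask_modified;

    // Never pin to a core the baseline mask does not allow.
    if(core < 0 || core >= CPU_SETSIZE || !CPU_ISSET(core, &baseline)) return pin_result::failed;

    cpu_set_t target;
    CPU_ZERO(&target);
    CPU_SET(core, &target);
    return pthread_setaffinity_np(pthread_self(), sizeof(target), &target) == 0 ? pin_result::pinned : pin_result::failed;
}
```

One limitation of this heuristic: if the baseline already contains a single core, an external pin to that same core is indistinguishable from no modification at all.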

@PeterTh PeterTh merged commit 28eadd6 into master Nov 21, 2024
17 checks passed
@fknorr fknorr deleted the thread-pinning branch November 21, 2024 13:39
4 participants