-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Programatically filter out tool-induced fences from the sampler tool utility #194
Conversation
Adding tool-invoked_fence function. Note that fence function is is doing global fencing, and that the end_parallel_* will need a devID if doing per-device fencing in the future.
…device ID for fence (which could be obtained through the space handle).
…device ID for fence (which could be obtained through the space handle). Fixing with formatting.
…terface function. Still need to get device ID for fence (which could be obtained through the space handle). Fixing with formatting.
This is a example run and output on MacBook using g++ (default) and Kokkos serial host backend for stream, without and with Kokkos Tools global fencing. Kokkos Tools Global Fence turned off (no auto-fencing)
Kokkos Tools Global Fence (auto-fencing) on
|
// using Kokkos::Tools::Experimental; | ||
// using mytpi_type = Kokkos::Tools::Experimental::ToolProgrammingInterface; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove all commented lines you are not using.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -82,9 +125,7 @@ void kokkosp_init_library(const int loadSeq, const uint64_t interfaceVer, | |||
printf("KokkosP: Next library to call: %s\n", nextLibrary); | |||
printf("KokkosP: Loading child library ..\n"); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's up with all these whitespace changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix, making sure the code file is indeed getting processed through clang-format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@masterleinad Does the new file committed address the issues you have with whitespace changes?
Can you please summarize and interpret the results? |
Co-authored-by: Daniel Arndt <[email protected]>
@masterleinad The result shows output of the test of the proposed new sampler with optimization to fencing, in the case that global fencing is on (KOKKOS_TOOLS_GLOBALFENCES=1) and the case that global fences is off (KOKKOS_TOOLS_GLOBALFENCES=0). The output shows that given the sampling skip rate is 105, only 46 kernel invocations out of the 1000 have a Kokkos Tool child event for the Kokkos Tools connector specified (here simple-kernel-timer) dispatched. One can intrepret this as the sampler behaving correctly, in both cases. Perhaps a better test case here is the output from the version of the sampler in Kokkos Tools |
Removing all commented code (which is no longer needed) as requested.
For reference, here is an output of a build and compilation on a MacBook Pro (2.9 GHz Quad-Core Intel Core i7)
|
I am aware of the build test with Kokkos that's in the CI that is failing. I don't know if that is the reason for the problem. I am putting this back into a draft PR to figure things out from my end. |
Note that the device identifier used in the |
@masterleinad Thanks for that. I will take a look at that part of the interface. |
More with hashmap and nestedID coming
mistakenly made this change in this PR but it is part of PR kokkos#194
…track of nested (child) kernel ID on sample.
Fix #179.