kp_sampler_skip.cpp: put begin for callee check before fence
This improves performance when there is no callee for kokkosp_begin_parallel_for, because the fence is then skipped entirely.

kokkosp_begin_parallel_scan and kokkosp_begin_parallel_reduce already do this correctly.
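
The reordering can be illustrated with a minimal, self-contained sketch. It is not the code in kp_sampler_skip.cpp: the names beginForCallee, tool_globFence and invoke_ktools_fence mirror the diff, but their definitions here are stand-ins, and sample_begin_parallel_for, countingCallee, fenceCount and calleeCount are hypothetical helpers added only to make the effect observable.

```cpp
#include <cassert>
#include <cstdint>

// Assumed callback shape, modeled on the call site in the diff.
using beginFunction = void (*)(const char*, const uint32_t, uint64_t*);

// Stand-ins for the sampler's globals (assumptions, not the real definitions).
static beginFunction beginForCallee = nullptr;
static bool tool_globFence = true;

// Hypothetical counters so the sketch can show how often each path runs.
static int fenceCount = 0;
static int calleeCount = 0;

// Stand-in fence: the real one synchronizes devices; this one only counts.
static void invoke_ktools_fence(uint32_t /*devID*/) { ++fenceCount; }

// Hypothetical child tool callback that records each invocation.
static void countingCallee(const char*, const uint32_t, uint64_t* kID) {
  assert(kID != nullptr);
  *kID = static_cast<uint64_t>(++calleeCount);
}

// Ordering after the commit: check for a callee first, fence only if one exists.
static void sample_begin_parallel_for(const char* name, const uint32_t devID) {
  if (nullptr != beginForCallee) {
    if (tool_globFence) {
      invoke_ktools_fence(0);
    }
    uint64_t nestedkID = 0;
    (*beginForCallee)(name, devID, &nestedkID);
  }
}
```

With no callee registered, a sampled kernel now costs a single pointer comparison instead of a global fence; with a callee registered, the fence still runs before the child tool is invoked.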
vlkale authored May 11, 2024
1 parent dd39767 commit 69221e5
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions common/kokkos-sampler/kp_sampler_skip.cpp
@@ -200,10 +200,11 @@ void kokkosp_begin_parallel_for(const char* name, const uint32_t devID,
       std::cout << "KokkosP: sample " << *kID
                 << " calling child-begin function...\n";
     }
-    if (tool_globFence) {
-      invoke_ktools_fence(0);
-    }
+
     if (NULL != beginForCallee) {
+      if (tool_globFence) {
+        invoke_ktools_fence(0);
+      }
       uint64_t nestedkID = 0;
       (*beginForCallee)(name, devID, &nestedkID);
       if (tool_verbosity > 0) {