Use NVTX filtering to limit NCU profile collection
Summary: Previously, we used `--replay-mode range`, but that did not give us per-kernel metrics, so it was changed to `--replay-mode kernel` (the default). However, that can cause us to profile many kernels outside the ones in the desired benchmark. It appears we can instead use NVTX filtering to solve this problem.

Relevant docs: https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#nvtx-filtering

I also tacked on a minor change to the ncu invocation, adding `--import-source yes`. This makes it easier to analyze the traces on a different machine from the one doing the profiling.

Reviewed By: chenyang78
Differential Revision: D58711358
fbshipit-source-id: 28aec4f71a736c7427b1886335297ece4a2a54a8
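
For illustration only, here is a minimal sketch of how an ncu invocation combining the options mentioned in the summary might be assembled. It is not the code from this commit: the helper name `build_ncu_command`, the script `run_benchmark.py`, the range name `my_benchmark_range`, and the output name `profile_out` are all hypothetical. The exact filter expression accepted by `--nvtx-include` depends on the NVTX range type and domain, so check it against the NVTX filtering docs linked in the summary.

```python
import shlex
from typing import List, Sequence


def build_ncu_command(
    benchmark_cmd: Sequence[str],
    nvtx_include: str,
    output_path: str,
) -> List[str]:
    """Assemble an ncu command line that only profiles kernels in an NVTX range.

    All argument values are supplied by the caller; nothing here is taken
    from the actual commit.
    """
    return [
        "ncu",
        "--nvtx",                          # enable NVTX support in ncu
        "--nvtx-include", nvtx_include,    # restrict collection to this range
        "--replay-mode", "kernel",         # per-kernel metrics (the default)
        "--import-source", "yes",          # embed source files in the report
        "-o", output_path,                 # report file name
        *benchmark_cmd,                    # the program being profiled
    ]


# Hypothetical usage: names below are placeholders, not from the commit.
cmd = build_ncu_command(
    ["python", "run_benchmark.py"],
    nvtx_include="my_benchmark_range",
    output_path="profile_out",
)
print(shlex.join(cmd))
```

For the filter to match anything, the benchmark itself would need to wrap the region of interest in an NVTX range with the same name, for example via `torch.cuda.nvtx.range_push` and `torch.cuda.nvtx.range_pop` in a PyTorch program.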
1 parent 62e2609, commit b2b4158
Showing 2 changed files with 34 additions and 23 deletions.