-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BN FWD INFER OCL to HIP #3106
base: develop
Are you sure you want to change the base?
BN FWD INFER OCL to HIP #3106
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main concern is missed half
data type test. Other comments are mostly C++ related and not very critical.
Also clang-format should be applied to the files. And can you take a look into this draft https://gist.github.com/CAHEK7/9ed75ca55acf85f92efd89eb2446a94d and adopt it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks fine, except a few tiny typos.
Only test and perf data are required to get it merged.
Could you adjust verification algorithm and thresholds and attach perf measurements data?
@sgundabo PR description is missing some important info. Please provide:
|
We can't do anything with it, except deprecating HIP and get back to OCL. |
@sgundabo could you move this PR to ready for review? |
@junliume we are waiting for the perf metrics |
Raw Perf Data HW info: gfx90a |
It can be safely merged since it does not affect production code. I just don't want to lose this PR. |
Restarted CI for sanity checks |
This PR focuses on converting the Batch Norm Forward Inference Spatial and Per Activation kernels from OpenCL to HIP. This conversion is a part of the broader initiative to translate all OpenCL kernels within MIOpen, as the OpenCL backend has been deprecated.
Ensuring correctness:
The PR includes a GTEST that compares the output of the OpenCL kernel with the HIP implementation. The test cases are derived from the existing batch norm forward inference kernel GTEST.
Ensuring GPU Performance parity:
The GTEST also measures the minimum, maximum, mean, median, and standard deviation of the kernel execution time across five runs and records the data in a CSV file. This data is used to create graphs that illustrate the average performance improvement of the HIP implementation over OpenCL. An average performance gain greater than one is considered favorable.
TODO: Collect perf metrics on a wider variety of test cases on a gfx90a, and ensure parity.
Ensuring Host side Performance parity:
As the OpenCL backend support is deprecated in MIOpen, the assumption is that this decision was made while being aware of the compilation overhead of HIP kernels over OpenCL.