-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add compatibility with HIP #289
Conversation
…his branch. Committer: Hanno Spreeuw <[email protected]>
…re why it works, but in the shell where you build make sure to execute 'export HIPCXX=hipcc'.
…out CUDA, not about HIP. For now, it is just completely commented out.
…nplace-perl.sh script. So this commit fixes almost all errors appearing from 'make -C build' from within cudawrappers-usage-example-locally-installed. First 'typedef uint32_t cuuint32_t;' is needed because there is no HIP equivalent of 'cuuint32_t'. We will probably need to 'shield' this with an #ifdef __HIP__ since this statement also appears in cuda.h. Next, there are a number of casts from const void* to void* because of differences in API between CUDA and HIP, i.e. differences in the types or the arguments. So you could consider these temporary fixes until these API differences have been removed; corresponding issues have been submitted to the ROCM repos. Two snippets of code involving context - includes Ctx - have been commented out for now, since I am not sure yet how the HIP equivalent of CUDA context works. 'hipFunction_attribute attribute' raised an error, while 'hipFuncAttribute attribute' worked, so this is perhaps a flaw in the perl script. There were also a few istances where the error messages - from 'make -C build' - suggested a replacement, while the perl script should have taken care of this. Since they worked, these suggestions were copied into this commit. Finally, I did not yet find a HIP equivalent of cuStreamBatchMemOp, so the code snippet involving this command was commented out, for now.
…he line 'typedef uint32_t cuuint32_t;'. It is 'protected' by '#ifdef __HIP__....#endif' because the same statement is in cuda.h. After discussion with csbnw: The code snippets with calls to cuStreamBatchMemOp and cuStreamGetCtx will remain commented out (and should probably be removed), since csbnw does not see any use for them. The error from the code snippet with void memcpyHtoHAsync(...){checkCudaCall(hipMemcpyAsync....)} was sorted out by loostrum: the direction of the copy is inferred when cuMemcpyAsync is used, but hipMemcpyAsync apparently cannot do this and the _obj stream argument is misinterpreted as a directional argument. Adding hipMemcpyDefault - as a third argument - fixes this. The other modifications are pure formatting comming from 'pre-commit run -a'.
…Found HIP replacement for CU_CTX_SCHED_BLOCKING_SYNC; inserting hipDeviceScheduleBlockingSync. Commented out all tests we could not easily replace with its HIP equivalents.
…this is to be debugged Because the test source files do not end in .hip, set_source_file_properties has to be used to enable compilation with hip instead of the host c++ compiler
… memory types, these are cuda-only for now
remaining: the FFT calls in HIP are declared __host__ __device__, so we must do the same however, doing so results in calls to host-only code (e.g. the FFT destructor), so a more intricate fix is required
…are __host__ __device__
* Added cu::Function::occupancyMaxActiveBlocksPerMultiprocessor() * Let target_embed_source inline local header files (#263) * Fix working directory for linking source into object file (#264) * Update readthedocs configuration (#265) * Fix target_embed_source when input file is a relative path (#266) * Removed deprecated cu::Context::setSharedMemConfig (#267) * Extend api (#268) * Add Device::getUUID * Add initial nvml wrapper * Fix target_embed_source (#269) * Fix target embed source paths (#270) * Fix path given to ld in target_embed_source * Fix missing arg in string replace * Fix CI (#271) * Add runtime subpackage * Fix symbols for test_nvrtc * [pre-commit.ci] pre-commit autoupdate (#259) * Update clang-format: v17.0.6 → v18.1.4 * Replace cmakelang with clang-format and cmake-format * Add nvrtc::findIncludePath (#272) * Make nvrtc::findIncludePath inline (#273) * Fix string replacements (#275) * Fix inline_local_includes (#276) * Make sure to select the correct INCLUDE_PATH (#277) * Fix typo (#278) * [pre-commit.ci] pre-commit autoupdate (#274) updates: - [github.com/pre-commit/mirrors-clang-format: v18.1.4 → v18.1.5](pre-commit/mirrors-clang-format@v18.1.4...v18.1.5) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Add inlined headers as dependencies in target_embed_source (#279) * Add inlined headers as dependencies in target_embed_source This ensure that the embedded sources are rebuild when a header is changed * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cmake/cudawrappers-helper.cmake Co-authored-by: Bram Veenboer <[email protected]> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bram Veenboer <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add HIP version of occupancyMaxActiveBlocksPerMultiprocessor * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: John Romein <[email protected]> Co-authored-by: john-romein <[email protected]> Co-authored-by: Bram Veenboer <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
add the HIP_PLATFORM_AMD__ define so check on that as well for typedefs
I have updated cuda-pcie to support the current state of this branch. This repository serves as an example of how HIP support can be added to a repository that previously only supported CUDA (using cudawrappers). First, add a CMake option to the project:
You set the
Maybe the Getting cudawrappers is the same as before, but for now, we choose the
Let's assume we build a simple example, which is linked to cudawrappers:
The only thing left to do is to make sure that HIP is enabled for
There doesn't seem to be a way to set language properties to targets so inheriting from To me, this is about as user-friendly as it gets. No changes are needed to @loostrum and @john-romein, do you agree that this is the way we want to support HIP compatibility (initially)? |
Very nice overview, I think this is indeed the way to go for now. |
I think it's a good idea to add |
* Add Hanno Spreeuw * Add HIP keyword
Description
Add support for HIP, C++ Heterogeneous-Compute Interface for Portability. From the user's perspective, a few API breaking changes related to the
cu::Context
class had to be made when using cudawrappers in CUDA mode (which is the default). To enable the HIP mode, setCUDAWRAPPERS_BACKEND=HIP
.This PR is still a draft, as some minor things still need to be cleaned up and/or added. Also, the documentation and changelog still needs to be updated. Moreover, we may want to test integration with existing applications in more detail and make changes if needed.
Related issues:
Instructions to review the pull request