Additional backends #26
With the announcement that OLCF Frontier will use AMD CPUs and GPUs, we should try to get AMD support into our workflow. As on-node programming models, we can use HIP (an open source CUDA-like model that can compile to both CUDA and ROCm), which can be produced almost automatically from CUDA using hipify-clang, or OpenMP 5 offload. Note that HIP does not currently support run-time compilation. HIP nominally compiles to CUDA with negligible overhead, but the CUDA toolchain needs to be installed to do so.
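For a sense of how mechanical the port is, here is a minimal, illustrative HIP sketch of an axpy kernel (not taken from libCEED). The device code is identical to its CUDA counterpart; only the host-side runtime calls differ, which is exactly the renaming hipify-clang automates:

```cpp
#include <hip/hip_runtime.h>

// Device code is byte-for-byte what the CUDA kernel would be;
// hipify-clang mostly renames host calls (cudaMalloc -> hipMalloc, ...).
__global__ void axpy(int n, double a, const double *x, double *y) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) y[i] += a * x[i];
}

int main() {
  const int n = 1 << 20;
  double *x, *y;
  hipMalloc(&x, n * sizeof(double));
  hipMalloc(&y, n * sizeof(double));
  // ... initialize x and y on the device ...
  hipLaunchKernelGGL(axpy, dim3((n + 255) / 256), dim3(256), 0, 0,
                     n, 2.0, x, y);
  hipDeviceSynchronize();
  hipFree(x);
  hipFree(y);
  return 0;
}
```

Compiled with hipcc, the same source targets either ROCm or, with the CUDA toolchain installed, NVIDIA GPUs.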
OCCA:HIP supports run-time compilation.
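For illustration, a minimal sketch of what that run-time compilation looks like through OCCA's C++ API; the OKL kernel and property strings here are hypothetical examples, and only the mode string would change between HIP and CUDA:

```cpp
#include <occa.hpp>

// Kernel source in OCCA's OKL dialect; it is JIT-compiled at run time
// for whichever backend the device was created with.
const char *addVectorsSource = R"(
  @kernel void addVectors(const int n, const double *x, double *y) {
    for (int i = 0; i < n; ++i; @tile(256, @outer, @inner)) {
      y[i] += x[i];
    }
  }
)";

int main() {
  // Swapping 'HIP' for 'CUDA' is the only change between the two backends.
  occa::device device("{mode: 'HIP', device_id: 0}");
  occa::kernel addVectors =
      device.buildKernelFromString(addVectorsSource, "addVectors");

  const int n = 1 << 20;
  occa::memory x = device.malloc<double>(n);
  occa::memory y = device.malloc<double>(n);
  // ... fill x and y ...
  addVectors(n, x, y);  // launch the JIT-compiled kernel
  device.finish();
  return 0;
}
```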
Our OCCA backend is in serious need of a performance overhaul, so it would be great to include OCCA:HIP as well.
Yes, I don't think anything special needs to be done for /gpu/occa/hip versus /gpu/occa/cuda, though the OCCA backend needs attention. My comment on run-time compilation was with regard to @YohannDudouit's native CUDA implementation. I'm also curious about observed differences in performance characteristics between the Radeon Instinct and V100.
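For context, backend selection in libCEED comes down to the resource string passed to CeedInit, so /gpu/occa/hip would be requested exactly like /gpu/occa/cuda. A minimal sketch, using the resource spellings from this thread:

```cpp
#include <ceed.h>

int main(int argc, char **argv) {
  Ceed ceed;
  // The resource string is the only thing that selects the backend.
  const char *resource = (argc > 1) ? argv[1] : "/gpu/occa/cuda";
  CeedInit(resource, &ceed);
  // ... create CeedVectors, CeedOperators, etc. ...
  CeedDestroy(&ceed);
  return 0;
}
```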
You should follow up with Noel Chalmers. I believe he has run libP experiments with the Radeon Instinct.
Thanks. @noelchalmers, can you share any experiments?
Hi everyone. I'll try to chip in what I know on some of the points in this thread.
Thanks, @noelchalmers. Are there any public clouds with Radeon Instinct (for continuous integration, etc.)?
I just realized that you were referring to NVRTC when you mentioned run-time compilation. No, HIP currently doesn't support any nvrtc* API calls. I'm not aware of any plans to add these features, but I will ask around. What HIP does support is loading compiled binaries using hipModuleLoad, which is analogous to cuModuleLoad, and finding/launching kernels from that binary. I don't know of any public clouds I can point to using MI-25s or MI-60s yet. Maybe for some CI tests you could try compiling on some Vegas in a GPU Eater session? Not ideal, certainly.
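To make the hipModuleLoad path concrete, here is a hedged sketch mirroring the CUDA driver API's cuModuleLoad / cuModuleGetFunction / cuLaunchKernel flow; the code object file name and kernel name are placeholders:

```cpp
#include <hip/hip_runtime.h>

// Load a pre-compiled code object and launch a kernel from it.
void launch_from_binary(int n, double *d_x) {
  hipModule_t module;
  hipFunction_t kernel;
  hipModuleLoad(&module, "kernels.hsaco");        // placeholder file name
  hipModuleGetFunction(&kernel, module, "scale"); // placeholder kernel name

  void *args[] = {&n, &d_x};
  hipModuleLaunchKernel(kernel,
                        (n + 255) / 256, 1, 1,  // grid
                        256, 1, 1,              // block
                        0, nullptr,             // shared mem bytes, stream
                        args, nullptr);
  hipModuleUnload(module);
}
```

The missing piece relative to NVRTC is generating that code object at run time; with this API the binary has to be compiled ahead of time.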
Thanks. It looks like GPU Eater doesn't support docker-machine or Kubernetes, so CI integration would be custom and/or not autoscaling, but it's something, so thanks.
Yet another C++ layer, this one providing a single source for CPU, OpenCL, and HIP/CUDA: https://github.com/illuhad/hipSYCL
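For a sense of the single-source model, a minimal SYCL sketch in the standard SYCL 1.2.1 style (illustrative, not hipSYCL-specific; hipSYCL would compile this same source for CPU, CUDA, or ROCm):

```cpp
#include <CL/sycl.hpp>
#include <vector>

int main() {
  const size_t n = 1 << 20;
  std::vector<double> x(n, 1.0), y(n, 2.0);
  {
    cl::sycl::queue q;  // runtime picks a device for the chosen backend
    cl::sycl::buffer<double> bx(x.data(), cl::sycl::range<1>(n));
    cl::sycl::buffer<double> by(y.data(), cl::sycl::range<1>(n));
    q.submit([&](cl::sycl::handler &h) {
      auto ax = bx.get_access<cl::sycl::access::mode::read>(h);
      auto ay = by.get_access<cl::sycl::access::mode::read_write>(h);
      // The same kernel source serves every backend.
      h.parallel_for<class axpy_kernel>(
          cl::sycl::range<1>(n),
          [=](cl::sycl::id<1> i) { ay[i] += 2.0 * ax[i]; });
    });
  }  // buffers synchronize back into x and y here
  return 0;
}
```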
While I still don't see it on the docs website, …
I'll close this open-ended issue. There is an improved occa backend coming in #1043. I think at this point we can make new issues for specific backend requests. |