Add OpenCL runtime support #24

jpsamaroo · 2019-12-25T02:33:06Z

Abstract runtime functionality by HSA (TODO: OCL)
Use Requires to load OpenCL bindings
Allow choosing runtime via environment variables

Closes #20, closes #23
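
For readers unfamiliar with the pattern, here is a minimal sketch of choosing a runtime from an environment variable and reaching runtime-specific code through dispatch. Every name in it is hypothetical: the variable `EXAMPLE_GPU_RUNTIME`, the accepted values, and the backend types are assumptions made for illustration and are not taken from this PR. Loading the OpenCL bindings lazily through Requires is left out so that the sketch runs without extra packages.

```julia
# Hypothetical names throughout: the variable name and the accepted values
# are assumptions for illustration, not the ones used by this PR.
const RUNTIME_ENV_VAR = "EXAMPLE_GPU_RUNTIME"

abstract type AbstractRuntime end
struct HSARuntimeBackend <: AbstractRuntime end
struct OCLRuntimeBackend <: AbstractRuntime end

# Map the environment setting to a backend, defaulting to HSA when unset.
function select_runtime(env::AbstractDict = ENV)
    choice = uppercase(get(env, RUNTIME_ENV_VAR, "HSA"))
    if choice == "HSA"
        return HSARuntimeBackend()
    elseif choice == "OCL"
        return OCLRuntimeBackend()
    else
        error("Unknown runtime $(repr(choice)); expected \"HSA\" or \"OCL\"")
    end
end

# Runtime-specific behavior is then selected by dispatch on the backend type.
runtime_name(::HSARuntimeBackend) = "HSA"
runtime_name(::OCLRuntimeBackend) = "OpenCL"
```

Passing a plain `Dict` instead of `ENV` makes the selection easy to exercise in tests, for example `select_runtime(Dict("EXAMPLE_GPU_RUNTIME" => "ocl"))`.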

vchuravy · 2019-12-25T09:49:11Z

Whoooo!

jpsamaroo · 2019-12-27T19:45:07Z

So far I've gotten kernels to launch through OpenCL, but they currently segfault on the GPU, so this shouldn't be expected to work right now. As I understand it, we aren't extracting the correct device-side pointer from cl.Buffer when we convert it to ROCDeviceArray (and then to FakeDeviceArray). I'll probably need to figure out how to allocate OpenCL buffers that mirror what we do with hsa_memory_alloc in finegrained mode; then we should be able to extract (somehow) a pointer that works from both host and device and pass that in.

Key note for reviewers: we (and LLVM) expect our array arguments to be of type ROCDeviceArray during compilation, so our kernels extract the actual buffer pointer from that struct. OpenCL apparently just passes pointers to raw buffers (as in C) instead of using nested structs, so we need to trick OpenCL into writing our ROCDeviceArray structs directly into the kernarg buffer. This part is working thanks to some code in OpenCL.jl which automatically handles isbits structs, so it's now on us to ensure that the right device-accessible pointer is embedded into the struct.
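
To make the "struct written directly into the kernarg buffer" idea concrete, here is a minimal host-only sketch. `ExampleDeviceArray`, `write_kernarg!`, and `demo_kernarg` are hypothetical names, and the struct's fields are an assumption rather than the real ROCDeviceArray layout. A host pointer stands in for the device pointer, since obtaining the right device-accessible pointer is exactly the open problem here.

```julia
# Hypothetical stand-in for ROCDeviceArray: an isbits struct carrying the
# array shape and a raw pointer. The real type's fields may differ.
struct ExampleDeviceArray{T,N}
    shape::NTuple{N,Int}
    ptr::Ptr{T}
end

# Copy the raw bytes of an isbits value into a byte buffer at `offset`.
# This is what passing a struct by value amounts to for a kernarg buffer.
function write_kernarg!(buf::Vector{UInt8}, offset::Int, arg::T) where {T}
    isbitstype(T) || throw(ArgumentError("kernel arguments must be isbits"))
    offset + sizeof(T) <= length(buf) || throw(BoundsError(buf, offset + sizeof(T)))
    GC.@preserve buf begin
        unsafe_store!(Ptr{T}(pointer(buf) + offset), arg)
    end
    return offset + sizeof(T)  # offset at which the next argument would go
end

# Write one array argument and read it back to check the embedded pointer.
function demo_kernarg()
    host = zeros(Float32, 4)
    kernarg = zeros(UInt8, 64)
    GC.@preserve host kernarg begin
        host_ptr = pointer(host)  # stand-in for a device-accessible pointer
        arr = ExampleDeviceArray{Float32,1}((4,), host_ptr)
        next = write_kernarg!(kernarg, 0, arr)
        roundtrip = unsafe_load(Ptr{ExampleDeviceArray{Float32,1}}(pointer(kernarg)))
        return next, roundtrip, host_ptr
    end
end
```

The struct that comes back from the buffer carries whatever pointer was embedded on the host, so a kernel that dereferences it only works if that pointer is valid on the device.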

jpsamaroo · 2019-12-27T20:15:33Z

Note to self: If we do implement a hacky (slow) workaround for getting the device pointer, we should also provide a shortcut via clSVMAlloc, which supposedly does exactly what we do with HSARuntime. This of course requires OpenCL 2.0, but that's reasonable to expect if one wants the best performance.

jpsamaroo · 2019-12-28T14:59:17Z

Now I've got kernels running without segfaults (see the new test/opencl.jl test script), but it appears that the C array never gets written to. If anyone has an idea for why this is happening, I'm all ears!

vchuravy · 2019-12-28T16:35:19Z

> If anyone has an idea for why this is happening, I'm all ears!

Do you need to synchronize the memory?

jpsamaroo · 2019-12-28T20:14:52Z

It doesn't seem like that's the issue since we wait on the kernel's event, and even adding in a sync_workgroup() call to the kernel doesn't seem to do anything.

jpsamaroo · 2020-01-06T16:27:24Z

If anyone has a working ROCm debugger setup, it would be great if we could see what instructions the GPU is actually executing (including memory addresses). I suspect we aren't writing to the correct location.

jpsamaroo force-pushed the jps/opencl branch from 35e9965 to d16d782 Compare December 25, 2019 02:38

Add OpenCL runtime support

5b12d9f

Abstract runtime functionality by HSA (TODO: OCL) Use Requires to load OpenCL bindings Allow choosing runtime via environment variables

jpsamaroo force-pushed the jps/opencl branch from d16d782 to 5b12d9f Compare December 25, 2019 03:35

Implement OCL runtime

9472ef8

jpsamaroo added 5 commits December 28, 2019 07:43

Add hacky workaround for getting a cl.Buffer's device pointer

3b4491f

Add test for OpenCL runtime support

58a6bbe

Add OpenCL to test deps

f363a03

Add LinearAlgebra to test deps

ec477a6

Remove silliness in test/opencl.jl kernel

8271f2c

jpsamaroo mentioned this pull request Jan 29, 2020

Abstract runtime support #26

Merged

jpsamaroo changed the title ~~[WIP] Add OpenCL runtime support~~ Add OpenCL runtime support May 12, 2020

jpsamaroo marked this pull request as draft May 12, 2020 12:19

jpsamaroo added enhancement help wanted labels May 16, 2020
