Bindings for cuPy pointers #4459

Merged on Feb 13, 2025 (4 commits)

anagainaru (Contributor) commented Feb 8, 2025

Allow Python to pass GPU pointers to ADIOS.

On the writer side, pass the pointer to the CuPy array in the call to Put.

On the reader side, call Shape(adios2.MemorySpace.GPU) to get the dimensions in the order the GPU expects, then call SetSelection with that shape and allocate GPU memory using the same dimensions.

Example: examples/hello/bpStepsWriteReadCuda/bpStepsWriteReadCuda.py

Running the example:

$ python examples/hello/bpStepsWriteReadCuda/bpStepsWriteReadCuda.py
Array allocation:  <CUDA Device 0>
Bytes required to store the gpu array 24
Bytes allocated on the device memory pool 512
Bytes used on the device memory pool 512
Blocks allocated on the pinned memory pool (The allocated pinned memory is released just after the transfer is complete) 1
Write to file StepsWriteReadCuPy.bp: (2, 3) data from GPU and (2, 3) data from CPU
Step 0: read GPU data
 [[0. 1. 2.]
 [3. 4. 5.]]
Step 0: read CPU data
 [[0. 1. 2.]
 [3. 4. 5.]]
Step 1: read GPU data
 [[ 0.  2.  4.]
 [ 6.  8. 10.]]
Step 1: read CPU data
 [[1. 2. 3.]
 [4. 5. 6.]]

Inspecting the resulting ADIOS file with bpls reads the data on the CPU, so the dimensions of the GPU array appear flipped:

$ bpls StepsWriteReadCuPy.bp/ -d gpuArray -n 2
  float    gpuArray  2*{3, 2}
    (0,0,0)    0 1
    (0,1,0)    2 3
    (0,2,0)    4 5
    (1,0,0)    0 2
    (1,1,0)    4 6
    (1,2,0)    8 10
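
The listing above is consistent with the same six values per step being stored once and read back with the dimension order reversed, rather than with a transposed copy of the data. The following is a minimal pure-Python sketch of that reading, assuming a flat row-major buffer; the helper `view_as` is hypothetical and is not part of ADIOS2 or CuPy.

```python
def view_as(flat, shape):
    """Split a flat row-major buffer into the rows of a 2-D shape."""
    rows, cols = shape
    assert rows * cols == len(flat), "shape must match buffer length"
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# Step 0 values of gpuArray, in the order they are assumed to sit in memory
flat = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

gpu_view = view_as(flat, (2, 3))  # shape used when writing from Python
cpu_view = view_as(flat, (3, 2))  # reversed shape, as bpls reports it

print(gpu_view)  # [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
print(cpu_view)  # [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
```

The second view reproduces the step 0 rows printed by bpls (0 1, 2 3, 4 5); a true transpose would instead give 0 3, 1 4, 2 5.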

anagainaru force-pushed the cupyExample branch 4 times, most recently from 1e4818e to 52b533b, on February 12, 2025 19:55
anagainaru marked this pull request as ready for review on February 12, 2025 19:56
anagainaru requested a review from pnorbert on February 12, 2025 19:56
anagainaru (Contributor, Author) commented Feb 13, 2025

I am leaving this message here so it's not lost when I re-trigger the CI (@eisenhauer).

The ASan run fails with the following error:

99% tests passed, 2 tests failed out of 819
Total Test time (real) = 135.23 sec
The following tests FAILED:
	350 - Remote.BPWriteReadADIOS2stdio.GetRemote (Failed)
	351 - Remote.BPWriteMemorySelectionRead.GetRemote (Failed)

eisenhauer (Member) commented

At this point, I think this may happen when the server doesn't start up in time during testing, i.e. it likely doesn't represent a problem outside the test scenario. I'll submit a PR shortly that makes clients notice the connection failure, wait for a second, then retry the connection. With luck, that will take care of this heisenbug.
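
The retry idea described above (notice the failure, wait, try again) can be sketched in Python as follows. This is only an illustration of the pattern, not the ADIOS2 remote client code; the names `connect_with_retry` and `flaky_connect` and the parameters `retries`, `delay`, and `sleep` are hypothetical.

```python
import time


def connect_with_retry(connect, retries=3, delay=1.0, sleep=time.sleep):
    """Call connect(); on ConnectionError wait `delay` seconds and retry."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return connect()
        except ConnectionError as err:
            last_error = err
            if attempt < retries:
                sleep(delay)  # give the server time to come up
    raise last_error


# Simulated server that only accepts the third connection attempt
attempts = []


def flaky_connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionRefusedError("server not up yet")
    return "connected"


waits = []  # record the waits instead of actually sleeping
result = connect_with_retry(flaky_connect, retries=5, delay=1.0, sleep=waits.append)
print(result, len(attempts), waits)  # connected 3 [1.0, 1.0]
```

Passing `sleep` as a parameter keeps the sketch quick to run in a test; a real client would presumably use the default and block for the delay.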

anagainaru merged commit bf3fee2 into ornladios:master on Feb 13, 2025
39 checks passed
anagainaru deleted the cupyExample branch on February 13, 2025 20:36