Threaded Cupy Warmup/Initialization Error #84

Open
wlruys opened this issue Aug 28, 2021 · 0 comments

wlruys commented Aug 28, 2021

With cupy 9.3.0 and cudatoolkit 11.2.2, I occasionally see the following error when running Parla on the TSQR demo app.

It could be something wrong with my env, but I'm logging it here. While it's a rare error for any single Parla instance, it happens fairly often on larger MPI runs.

Unexpected exception in Task handling
Traceback (most recent call last):
  File ".../miniconda3/lib/python3.8/site-packages/parla/task_runtime.py", line 515, in run
    component.initialize_thread()
  File ".../miniconda3/lib/python3.8/site-packages/parla/cuda.py", line 250, in initialize_thread
    cupy.asnumpy(cupy.sqrt(a))
  File ".../miniconda3/lib/python3.8/site-packages/cupy/__init__.py", line 773, in asnumpy
    return a.get(stream=stream, order=order)
  File "cupy/_core/core.pyx", line 1567, in cupy._core.core.ndarray.get
  File "cupy/_core/core.pyx", line 1636, in cupy._core.core.ndarray.get
  File "cupy/_core/core.pyx", line 1644, in cupy._core.core.ndarray.get
  File "cupy/cuda/memory.pyx", line 551, in cupy.cuda.memory.MemoryPointer.copy_to_host_async
  File "cupy_backends/cuda/api/runtime.pyx", line 693, in cupy_backends.cuda.api.runtime.memcpyAsync
  File "cupy_backends/cuda/api/runtime.pyx", line 273, in cupy_backends.cuda.api.runtime.check_status
cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorInvalidValue: invalid argument
Unexpected exception in Task handling

I don't see how 'a' could fail to exist after a sync, yet the GPU-to-CPU copy is what fails.
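For anyone trying to reproduce this outside Parla, here is a rough sketch of a standalone harness that runs a warmup once per thread and collects exceptions. It is not Parla's code: `cupy_warmup` and `run_in_threads` are hypothetical names, and the warmup body is only a guess built around the `cupy.asnumpy(cupy.sqrt(a))` call from the traceback (the array contents, device selection, and sync placement are assumptions). It needs CuPy and a GPU for the real warmup, and it may well not trigger the error given how rare it is per instance.

```python
import threading


def cupy_warmup(device_id=0):
    """Assumed stand-in for the warmup in the traceback (not Parla's actual code)."""
    import cupy  # imported lazily so the harness itself runs without CuPy

    with cupy.cuda.Device(device_id):
        a = cupy.asarray([2.0])  # assumed contents; the issue only names 'a'
        cupy.cuda.get_current_stream().synchronize()
        # This is the call that raises cudaErrorInvalidValue in the report.
        return cupy.asnumpy(cupy.sqrt(a))


def run_in_threads(warmup, n_threads=8):
    """Call warmup() once in each of n_threads threads; return (index, exception) pairs."""
    errors = []
    lock = threading.Lock()

    def worker(idx):
        try:
            warmup()
        except Exception as exc:  # keep the failure instead of losing it in the thread
            with lock:
                errors.append((idx, exc))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

On a GPU node, `run_in_threads(cupy_warmup, n_threads=8)` returns an empty list when every thread warms up cleanly. Passing the warmup as a callable keeps the harness usable with any other initialization routine.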

wlruys changed the title from "Threaded Cupy Warmpu/Initialization Error" to "Threaded Cupy Warmup/Initialization Error" on Aug 28, 2021