This repository was archived by the owner on May 27, 2021. It is now read-only.
Can't access GPUs, get "ERROR: CUDA error: invalid device context (code 201, ERROR_INVALID_CONTEXT)" #620
Closed
Hi
I just got access to a nice machine with plenty of GPUs, but they don't seem to be available to Julia:
$ nvidia-smi
Fri Apr 3 14:44:44 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64 Driver Version: 440.64 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 208... On | 00000000:04:00.0 Off | N/A |
| 27% 24C P8 1W / 250W | 1108MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 208... On | 00000000:05:00.0 Off | N/A |
| 27% 24C P8 21W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce RTX 208... On | 00000000:06:00.0 Off | N/A |
| 27% 24C P8 20W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce RTX 208... On | 00000000:07:00.0 Off | N/A |
| 27% 25C P8 1W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 GeForce RTX 208... On | 00000000:08:00.0 Off | N/A |
| 27% 24C P8 20W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 GeForce RTX 208... On | 00000000:0B:00.0 Off | N/A |
| 27% 25C P8 19W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 GeForce RTX 208... On | 00000000:0C:00.0 Off | N/A |
| 27% 25C P8 19W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 GeForce RTX 208... On | 00000000:0D:00.0 Off | N/A |
| 27% 23C P8 18W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 8 GeForce RTX 208... On | 00000000:0E:00.0 Off | N/A |
| 27% 26C P8 21W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 9 GeForce RTX 208... On | 00000000:0F:00.0 Off | N/A |
| 27% 25C P8 1W / 250W | 11MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 19629 C /opt/conda/bin/python 787MiB |
| 0 26698 C ...geHD/userHome/rmz/julia-1.4.0/bin/julia 310MiB |
+-----------------------------------------------------------------------------+
... so there should be plenty of hardware available. Having read a few other error reports about similar issues, I also tested this:
apt list | grep -i cupti
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
libcupti-dev/bionic 9.1.85-3ubuntu1 amd64
libcupti-doc/bionic 9.1.85-3ubuntu1 all
libcupti9.1/bionic 9.1.85-3ubuntu1 amd64
... but back to the main story and the error messages:
$ julia
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.4.0 (2020-03-21)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
julia> using CuArrays
┌ Warning: Incompatibility detected between CUDA and LLVM 8.0+; disabling debug info emission for CUDA kernels
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:114
[ Info: CUDAnative.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)
julia> cu([1,2,3])
ERROR: CUDA error: invalid device context (code 201, ERROR_INVALID_CONTEXT)
Stacktrace:
[1] throw_api_error(::CUDAdrv.cudaError_enum) at /storageHD/userHome/rmz/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
[2] macro expansion at /storageHD/userHome/rmz/.julia/packages/CUDAdrv/b1mvw/src/error.jl:144 [inlined]
[3] cuMemAlloc_v2 at /storageHD/userHome/rmz/.julia/packages/CUDAdrv/b1mvw/src/libcuda.jl:313 [inlined]
[4] alloc(::Type{CUDAdrv.Mem.DeviceBuffer}, ::Int32) at /storageHD/userHome/rmz/.julia/packages/CUDAdrv/b1mvw/src/memory.jl:70
[5] macro expansion at /storageHD/userHome/rmz/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:228 [inlined]
[6] macro expansion at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory.jl:61 [inlined]
[7] macro expansion at ./util.jl:234 [inlined]
[8] actual_alloc(::Int32) at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory.jl:60
[9] actual_alloc at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory/binned.jl:55 [inlined]
[10] macro expansion at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory/binned.jl:198 [inlined]
[11] macro expansion at /storageHD/userHome/rmz/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:228 [inlined]
[12] pool_alloc(::Int32, ::Int32) at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory/binned.jl:197
[13] (::CuArrays.BinnedPool.var"#12#13"{Int32,Int32,Set{CuArrays.BinnedPool.Block},Array{CuArrays.BinnedPool.Block,1}})() at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory/binned.jl:293
[14] lock(::CuArrays.BinnedPool.var"#12#13"{Int32,Int32,Set{CuArrays.BinnedPool.Block},Array{CuArrays.BinnedPool.Block,1}}, ::ReentrantLock) at ./lock.jl:161
[15] alloc(::Int32) at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory/binned.jl:292
[16] macro expansion at /storageHD/userHome/rmz/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:228 [inlined]
[17] macro expansion at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory.jl:159 [inlined]
[18] macro expansion at ./util.jl:234 [inlined]
[19] alloc at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/memory.jl:158 [inlined]
[20] CuArray{Float32,1,P} where P(::UndefInitializer, ::Tuple{Int32}) at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/array.jl:92
[21] CuArray at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/array.jl:100 [inlined]
[22] similar at ./abstractarray.jl:671 [inlined]
[23] convert at /storageHD/userHome/rmz/.julia/packages/GPUArrays/1wgPO/src/construction.jl:80 [inlined]
[24] adapt_storage at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/array.jl:239 [inlined]
[25] adapt_structure at /storageHD/userHome/rmz/.julia/packages/Adapt/m5jFF/src/Adapt.jl:9 [inlined]
[26] adapt at /storageHD/userHome/rmz/.julia/packages/Adapt/m5jFF/src/Adapt.jl:6 [inlined]
[27] cu(::Array{Int32,1}) at /storageHD/userHome/rmz/.julia/packages/CuArrays/A6GUx/src/array.jl:314
[28] top-level scope at REPL[2]:1
julia> using CUDAdrv; CUDAdrv.CuDevice(0)
CuDevice(0): GeForce RTX 2080 Ti
Following the advice to set the JULIA_CUDA_VERBOSE flag, I get this result:
$ JULIA_CUDA_VERBOSE=true julia
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.4.0 (2020-03-21)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
julia> using CuArrays
┌ Warning: Incompatibility detected between CUDA and LLVM 8.0+; disabling debug info emission for CUDA kernels
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:114
┌ Error: CUDAnative.jl failed to initialize
│ exception =
│ Your CUDA installation does not provide libcudadevrt
│ Stacktrace:
│ [1] error(::String) at ./error.jl:33
│ [2] __init__() at /storageHD/userHome/rmz/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:146
│ [3] _include_from_serialized(::String, ::Array{Any,1}) at ./loading.jl:697
│ [4] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:781
│ [5] _tryrequire_from_serialized(::Base.PkgId, ::UInt64, ::String) at ./loading.jl:712
│ [6] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:770
│ [7] _require(::Base.PkgId) at ./loading.jl:1006
│ [8] require(::Base.PkgId) at ./loading.jl:927
│ [9] require(::Module, ::Symbol) at ./loading.jl:922
│ [10] eval(::Module, ::Any) at ./boot.jl:331
│ [11] eval_user_input(::Any, ::REPL.REPLBackend) at /buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:86
│ [12] macro expansion at /buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:118 [inlined]
│ [13] (::REPL.var"#26#27"{REPL.REPLBackend})() at ./task.jl:358
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:190
┌ Warning: CuArrays.jl did not initialize because CUDAdrv.jl or CUDAnative.jl failed to
└ @ CuArrays ~/.julia/packages/CuArrays/A6GUx/src/CuArrays.jl:64
julia>
... do you have any suggestions about what I should do next? It seems like the text:
exception =
│ Your CUDA installation does not provide libcudadevrt
... is at the crux of the problem, but I don't know how to fix it.
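One way to narrow this down is to check whether a libcudadevrt file exists anywhere the CUDA toolkit usually installs its libraries. The following is a minimal sketch, not a confirmed fix: the function name `find_cudadevrt` is hypothetical, and the default directories are assumptions about common Linux CUDA layouts that may not match this machine.

```shell
#!/bin/sh
# Search the given directories for libcudadevrt (typically shipped by the
# CUDA toolkit as the static library libcudadevrt.a).
# With no arguments, fall back to a few default directories; these defaults
# are assumptions about common Linux layouts, not verified for this machine.
find_cudadevrt() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/local/cuda/lib64 /usr/lib/x86_64-linux-gnu /usr/lib/cuda/lib64
    fi
    found=1
    for dir in "$@"; do
        # Skip directories that do not exist.
        [ -d "$dir" ] || continue
        for f in "$dir"/libcudadevrt.*; do
            # An unmatched glob stays literal, so confirm the file exists.
            [ -e "$f" ] || continue
            echo "$f"
            found=0
        done
    done
    # Exit status 0 if at least one file was printed, 1 otherwise.
    return "$found"
}

find_cudadevrt || echo "libcudadevrt not found in the default directories"
```

If nothing is printed, the toolkit may be missing or installed somewhere else; passing that directory as an argument, for example `find_cudadevrt /path/to/cuda/lib64`, checks it directly. Note also that the apt listing above shows 9.1.85 packages while nvidia-smi reports CUDA 10.2, so the installed toolkit and the driver may not match.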