"test.cpp", line 22: error: global or namespace scope variables such as "Wrapper<T>::my_count [with T=int32_t]" (declared at line 26) cannot be accessed from device code
function "Wrapper<T>::~Wrapper [with T=int32_t]" is implicitly a device function because it is called from device function "cuda::std::__4::__tuple_leaf<_Ip, _Hp, cuda::std::__4::__tuple_leaf_specialization::__default>::~__tuple_leaf [with _Ip=0UL, _Hp=Wrapper<int32_t>]" (declared implicitly)
--my_count;
^
This is a new failure, first observed after commit 35df3a9.
The mismatch algorithm should only inspect existing elements of a container, not create new ones. In this test, Wrapper has a constructor and destructor that access a static-storage object. After this commit, a Wrapper object gets created on the device, which causes NVC++ stdpar to fail.
It looks like the recent change to the header file mismatch.h in commit 35df3a9 may be causing the creation of a cuda::std::tuple object in which one of the element types is Wrapper<int32_t> (in the smaller test case).
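To make the expectation above concrete, here is a minimal host-only sketch of the property being described: running mismatch over existing elements should not construct any new Wrapper objects. This is not the original test.cpp. The Wrapper layout, the extra total_constructed counter, and the probe_mismatch helper are all assumptions made for illustration, and the sketch uses the sequential std::mismatch rather than the stdpar or Thrust code path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the Wrapper type in the report (not the original test.cpp).
template <typename T>
struct Wrapper {
    static int my_count;           // live objects; static storage touched by ctor/dtor
    static int total_constructed;  // assumption: extra counter so temporaries are detectable
    T value;

    explicit Wrapper(T v) : value(v) { ++my_count; ++total_constructed; }
    Wrapper(const Wrapper& other) : value(other.value) { ++my_count; ++total_constructed; }
    Wrapper& operator=(const Wrapper&) = default;
    ~Wrapper() { --my_count; }

    friend bool operator==(const Wrapper& a, const Wrapper& b) { return a.value == b.value; }
};

template <typename T>
int Wrapper<T>::my_count = 0;
template <typename T>
int Wrapper<T>::total_constructed = 0;

// Result of one probe: where the ranges first differ, and how many
// Wrapper objects had been constructed before and after the algorithm ran.
struct MismatchProbe {
    std::ptrdiff_t first_mismatch;
    int constructed_before;
    int constructed_after;
};

// Hypothetical helper: builds two ranges that differ at index 5 and runs
// std::mismatch over them, recording the construction counter around the call.
inline MismatchProbe probe_mismatch() {
    using W = Wrapper<std::int32_t>;
    std::vector<W> a, b;
    a.reserve(8);
    b.reserve(8);
    for (std::int32_t i = 0; i < 8; ++i) {
        a.emplace_back(i);
        b.emplace_back(i == 5 ? -1 : i);
    }

    const int before = W::total_constructed;
    auto result = std::mismatch(a.begin(), a.end(), b.begin());
    const int after = W::total_constructed;

    return {result.first - a.begin(), before, after};
}
```

Calling probe_mismatch() from a host program and comparing constructed_before with constructed_after checks that the algorithm only read the existing elements. In the failing configuration described above, the analogous construction would happen in device code, where touching my_count is rejected at compile time.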
How to Reproduce
nvc++ -stdpar -Ofast --c++17 -c test.cpp
Expected behavior
"test.cpp", line 22: error: global or namespace scope variables such as "Wrapper<T>::my_count [with T=int32_t]" (declared at line 26) cannot be accessed from device code
function "Wrapper<T>::~Wrapper [with T=int32_t]" is implicitly a device function because it is called from device function "cuda::std::__4::__tuple_leaf<_Ip, _Hp, cuda::std::__4::__tuple_leaf_specialization::__default>::~__tuple_leaf [with _Ip=0UL, _Hp=Wrapper<int32_t>]" (declared implicitly)
--my_count;
^
Reproduction link
No response
Operating System
No response
nvidia-smi output
No response
NVCC version
No response
I can reproduce the issue in a slightly different form (calling thrust::mismatch and compiling with nvcc), but the issue persists even if I revert 35df3a9 :/ To solve this, we need a larger re-engineering of Thrust iterators so that they never pull a copy or construct a temporary anywhere.
I'm using a dev version of nvc++ that tracks the CCCL main branch. In contrast, nvc++ release versions use CCCL release branches, so they don't include this commit.
Is this a duplicate?
Type of Bug
Compile-time Error
Component
Thrust