Build error with CUDA 11.2.1 #366

Open
lahwaacz opened this issue Mar 19, 2021 · 2 comments

@lahwaacz

The build fails with CUDA 11.2.1 on Arch Linux. nvcc_wrapper is available in /usr/bin/ as part of the trilinos package (Trilinos also bundles Kokkos, but I'm not building with Kokkos).

CMake output:

$ mkdir build
$ cd build
$ cmake .. -DOmega_h_USE_CUDA=on -DCMAKE_CXX_COMPILER=nvcc_wrapper
-- The CXX compiler identification is GNU 10.2.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/nvcc_wrapper - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- CMAKE_VERSION: 3.19.7
-- Omega_h_VERSION: 9.33.2
-- USE_XSDK_DEFAULTS: OFF
-- BUILD_TESTING: OFF
-- BUILD_SHARED_LIBS: ON
-- CMAKE_INSTALL_PREFIX: /usr/local
-- Omega_h_CHECK_BOUNDS: OFF
-- Omega_h_THROW: OFF
-- Omega_h_DATA: 
-- Omega_h_USE_EGADS: OFF
-- EGADS_PREFIX: 
-- Omega_h_USE_Kokkos: OFF
-- Kokkos_PREFIX: 
-- Omega_h_USE_CUDA_AWARE_MPI: OFF
-- Omega_h_VALGRIND: 
-- Omega_h_EXAMPLES: OFF
-- Omega_h_USE_MPI: OFF
-- Omega_h_USE_ZLIB: ON
-- ZLIB_PREFIX: 
-- Found ZLIB: /usr/lib/libz.so (found version "1.2.11") 
-- Omega_h_USE_Kokkos: OFF
-- Omega_h_USE_libMeshb: OFF
-- Omega_h_USE_Gmsh: OFF
-- Omega_h_USE_Gmodel: OFF
-- Omega_h_USE_SEACASExodus: OFF
-- Omega_h_USE_pybind11: OFF
-- Omega_h_USE_OpenMP: OFF
-- Omega_h_USE_CUDA: on
-- The CUDA compiler identification is NVIDIA 11.2.142
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /opt/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Omega_h_USE_DOLFIN: OFF
-- Omega_h_SEMVER = 9.33.2-sha.91330909+000110000000001
-- Configuring done
-- Generating done
-- Build files have been written to: /home/lahwaacz/Bbox/pg/cpp/3rd party/omega_h/build

Build output:

$ make
...
$ make -j1
[  1%] Building CUDA object src/CMakeFiles/omega_h.dir/Omega_h_int_scan.cpp.o
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/scan.h(578): error: array of reference is not allowed
          detected during:
            instantiation of class "thrust::cuda_cub::__scan::DoNothing<T> [with T=const Omega_h::LO &]" 
(784): here
            instantiation of "OutputIt thrust::cuda_cub::inclusive_scan_n(thrust::cuda_cub::execution_policy<Derived> &, InputIt, Size, OutputIt, ScanOp) [with Derived=thrust::cuda_cub::par_t, InputIt=thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, Size=std::ptrdiff_t, OutputIt=Omega_h::LO *, ScanOp=thrust::maximum<Omega_h::LO>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/transform_scan.h(72): here
            instantiation of "OutputIt thrust::cuda_cub::transform_inclusive_scan(thrust::cuda_cub::execution_policy<Derived> &, InputIt, InputIt, OutputIt, TransformOp, ScanOp) [with Derived=thrust::cuda_cub::par_t, InputIt=Omega_h::LO *, OutputIt=Omega_h::LO *, TransformOp=thrust::identity<Omega_h::LO>, ScanOp=thrust::maximum<Omega_h::LO>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/detail/transform_scan.inl(47): here
            instantiation of "OutputIterator thrust::transform_inclusive_scan(const thrust::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, OutputIterator, UnaryFunction, AssociativeOperator) [with DerivedPolicy=thrust::cuda_cub::par_t, InputIterator=Omega_h::LO *, OutputIterator=Omega_h::LO *, UnaryFunction=thrust::identity<Omega_h::LO>, AssociativeOperator=thrust::maximum<Omega_h::LO>]" 
/home/lahwaacz/Bbox/pg/cpp/3rd party/omega_h/src/Omega_h_scan.hpp(84): here
            instantiation of "OutputIterator Omega_h::transform_inclusive_scan(InputIterator, InputIterator, OutputIterator, BinaryOp, UnaryOp) [with InputIterator=Omega_h::LO *, OutputIterator=Omega_h::LO *, BinaryOp=Omega_h::maximum<Omega_h::LO>, UnaryOp=Omega_h::identity<Omega_h::LO>]" 
/home/lahwaacz/Bbox/pg/cpp/3rd party/omega_h/src/Omega_h_int_scan.cpp(32): here

/opt/cuda/bin/../targets/x86_64-linux/include/cub/block/block_load.cuh(974): error: array of reference is not allowed
          detected during:
            instantiation of class "cub::BlockLoad<InputT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::LoadInternal<cub::BLOCK_LOAD_WARP_TRANSPOSE_TIMESLICED, DUMMY> [with InputT=const Omega_h::LO &, BLOCK_DIM_X=128, ITEMS_PER_THREAD=12, ALGORITHM=cub::BLOCK_LOAD_WARP_TRANSPOSE_TIMESLICED, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520, DUMMY=0]" 
(1015): here
            instantiation of class "cub::BlockLoad<InputT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH> [with InputT=const Omega_h::LO &, BLOCK_DIM_X=128, ITEMS_PER_THREAD=12, ALGORITHM=cub::BLOCK_LOAD_WARP_TRANSPOSE_TIMESLICED, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/scan.h(264): here
            instantiation of union "thrust::cuda_cub::__scan::ScanAgent<InputIt, OutputIt, ScanOp, Size, T, Inclusive>::PtxPlan<Arch>::TempStorage [with InputIt=thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, OutputIt=Omega_h::LO *, ScanOp=thrust::maximum<Omega_h::LO>, Size=thrust::detail::int32_t, T=const Omega_h::LO &, Inclusive=thrust::detail::true_type, Arch=thrust::cuda_cub::core::sm52]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/core/util.h(202): here
            instantiation of class "thrust::cuda_cub::core::temp_storage_size_impl<Agent, thrust::detail::true_type> [with Agent=thrust::cuda_cub::core::specialize_plan<thrust::cuda_cub::__scan::ScanAgent<thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, Omega_h::LO *, thrust::maximum<Omega_h::LO>, thrust::detail::int32_t, const Omega_h::LO &, thrust::detail::true_type>::PtxPlan, thrust::cuda_cub::core::sm60>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/core/util.h(207): here
            instantiation of class "thrust::cuda_cub::core::temp_storage_size<Agent> [with Agent=thrust::cuda_cub::core::specialize_plan<thrust::cuda_cub::__scan::ScanAgent<thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, Omega_h::LO *, thrust::maximum<Omega_h::LO>, thrust::detail::int32_t, const Omega_h::LO &, thrust::detail::true_type>::PtxPlan, thrust::cuda_cub::core::sm60>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/core/util.h(225): here
            [ 4 instantiation contexts not shown ]
            instantiation of "OutputIt thrust::cuda_cub::__scan::scan<Inclusive,Derived,InputIt,OutputIt,Size,ScanOp,AddInitToExclusiveScan>(thrust::cuda_cub::execution_policy<Derived> &, InputIt, OutputIt, Size, ScanOp, AddInitToExclusiveScan) [with Inclusive=thrust::detail::true_type, Derived=thrust::cuda_cub::par_t, InputIt=thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, OutputIt=Omega_h::LO *, Size=std::ptrdiff_t, ScanOp=thrust::maximum<Omega_h::LO>, AddInitToExclusiveScan=thrust::cuda_cub::__scan::DoNothing<const Omega_h::LO &>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/scan.h(784): here
            instantiation of "OutputIt thrust::cuda_cub::inclusive_scan_n(thrust::cuda_cub::execution_policy<Derived> &, InputIt, Size, OutputIt, ScanOp) [with Derived=thrust::cuda_cub::par_t, InputIt=thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, Size=std::ptrdiff_t, OutputIt=Omega_h::LO *, ScanOp=thrust::maximum<Omega_h::LO>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/transform_scan.h(72): here
            instantiation of "OutputIt thrust::cuda_cub::transform_inclusive_scan(thrust::cuda_cub::execution_policy<Derived> &, InputIt, InputIt, OutputIt, TransformOp, ScanOp) [with Derived=thrust::cuda_cub::par_t, InputIt=Omega_h::LO *, OutputIt=Omega_h::LO *, TransformOp=thrust::identity<Omega_h::LO>, ScanOp=thrust::maximum<Omega_h::LO>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/detail/transform_scan.inl(47): here
            instantiation of "OutputIterator thrust::transform_inclusive_scan(const thrust::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, OutputIterator, UnaryFunction, AssociativeOperator) [with DerivedPolicy=thrust::cuda_cub::par_t, InputIterator=Omega_h::LO *, OutputIterator=Omega_h::LO *, UnaryFunction=thrust::identity<Omega_h::LO>, AssociativeOperator=thrust::maximum<Omega_h::LO>]" 
/home/lahwaacz/Bbox/pg/cpp/3rd party/omega_h/src/Omega_h_scan.hpp(84): here
            instantiation of "OutputIterator Omega_h::transform_inclusive_scan(InputIterator, InputIterator, OutputIterator, BinaryOp, UnaryOp) [with InputIterator=Omega_h::LO *, OutputIterator=Omega_h::LO *, BinaryOp=Omega_h::maximum<Omega_h::LO>, UnaryOp=Omega_h::identity<Omega_h::LO>]" 
/home/lahwaacz/Bbox/pg/cpp/3rd party/omega_h/src/Omega_h_int_scan.cpp(32): here

/opt/cuda/bin/../targets/x86_64-linux/include/cub/block/block_load.cuh(984): error: array of reference is not allowed
          detected during:
            instantiation of class "cub::BlockLoad<InputT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::LoadInternal<cub::BLOCK_LOAD_WARP_TRANSPOSE_TIMESLICED, DUMMY> [with InputT=const Omega_h::LO &, BLOCK_DIM_X=128, ITEMS_PER_THREAD=12, ALGORITHM=cub::BLOCK_LOAD_WARP_TRANSPOSE_TIMESLICED, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520, DUMMY=0]" 
(1015): here
            instantiation of class "cub::BlockLoad<InputT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH> [with InputT=const Omega_h::LO &, BLOCK_DIM_X=128, ITEMS_PER_THREAD=12, ALGORITHM=cub::BLOCK_LOAD_WARP_TRANSPOSE_TIMESLICED, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/scan.h(264): here
            instantiation of union "thrust::cuda_cub::__scan::ScanAgent<InputIt, OutputIt, ScanOp, Size, T, Inclusive>::PtxPlan<Arch>::TempStorage [with InputIt=thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, OutputIt=Omega_h::LO *, ScanOp=thrust::maximum<Omega_h::LO>, Size=thrust::detail::int32_t, T=const Omega_h::LO &, Inclusive=thrust::detail::true_type, Arch=thrust::cuda_cub::core::sm52]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/core/util.h(202): here
            instantiation of class "thrust::cuda_cub::core::temp_storage_size_impl<Agent, thrust::detail::true_type> [with Agent=thrust::cuda_cub::core::specialize_plan<thrust::cuda_cub::__scan::ScanAgent<thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, Omega_h::LO *, thrust::maximum<Omega_h::LO>, thrust::detail::int32_t, const Omega_h::LO &, thrust::detail::true_type>::PtxPlan, thrust::cuda_cub::core::sm60>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/core/util.h(207): here
            instantiation of class "thrust::cuda_cub::core::temp_storage_size<Agent> [with Agent=thrust::cuda_cub::core::specialize_plan<thrust::cuda_cub::__scan::ScanAgent<thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, Omega_h::LO *, thrust::maximum<Omega_h::LO>, thrust::detail::int32_t, const Omega_h::LO &, thrust::detail::true_type>::PtxPlan, thrust::cuda_cub::core::sm60>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/core/util.h(225): here
            [ 4 instantiation contexts not shown ]
            instantiation of "OutputIt thrust::cuda_cub::__scan::scan<Inclusive,Derived,InputIt,OutputIt,Size,ScanOp,AddInitToExclusiveScan>(thrust::cuda_cub::execution_policy<Derived> &, InputIt, OutputIt, Size, ScanOp, AddInitToExclusiveScan) [with Inclusive=thrust::detail::true_type, Derived=thrust::cuda_cub::par_t, InputIt=thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, OutputIt=Omega_h::LO *, Size=std::ptrdiff_t, ScanOp=thrust::maximum<Omega_h::LO>, AddInitToExclusiveScan=thrust::cuda_cub::__scan::DoNothing<const Omega_h::LO &>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/scan.h(784): here
            instantiation of "OutputIt thrust::cuda_cub::inclusive_scan_n(thrust::cuda_cub::execution_policy<Derived> &, InputIt, Size, OutputIt, ScanOp) [with Derived=thrust::cuda_cub::par_t, InputIt=thrust::cuda_cub::transform_input_iterator_t<const Omega_h::LO &, Omega_h::LO *, thrust::identity<Omega_h::LO>>, Size=std::ptrdiff_t, OutputIt=Omega_h::LO *, ScanOp=thrust::maximum<Omega_h::LO>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/transform_scan.h(72): here
            instantiation of "OutputIt thrust::cuda_cub::transform_inclusive_scan(thrust::cuda_cub::execution_policy<Derived> &, InputIt, InputIt, OutputIt, TransformOp, ScanOp) [with Derived=thrust::cuda_cub::par_t, InputIt=Omega_h::LO *, OutputIt=Omega_h::LO *, TransformOp=thrust::identity<Omega_h::LO>, ScanOp=thrust::maximum<Omega_h::LO>]" 
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/detail/transform_scan.inl(47): here
            instantiation of "OutputIterator thrust::transform_inclusive_scan(const thrust::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, OutputIterator, UnaryFunction, AssociativeOperator) [with DerivedPolicy=thrust::cuda_cub::par_t, InputIterator=Omega_h::LO *, OutputIterator=Omega_h::LO *, UnaryFunction=thrust::identity<Omega_h::LO>, AssociativeOperator=thrust::maximum<Omega_h::LO>]" 
/home/lahwaacz/Bbox/pg/cpp/3rd party/omega_h/src/Omega_h_scan.hpp(84): here
            instantiation of "OutputIterator Omega_h::transform_inclusive_scan(InputIterator, InputIterator, OutputIterator, BinaryOp, UnaryOp) [with InputIterator=Omega_h::LO *, OutputIterator=Omega_h::LO *, BinaryOp=Omega_h::maximum<Omega_h::LO>, UnaryOp=Omega_h::identity<Omega_h::LO>]" 
/home/lahwaacz/Bbox/pg/cpp/3rd party/omega_h/src/Omega_h_int_scan.cpp(32): here

The full output is much longer, so I've copy-pasted just the first 3 actual errors.
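
For readers triaging the log: all three errors share the template argument `const Omega_h::LO &`, which suggests that a value type deduced as a reference ends up being used as an array element type. Below is a minimal sketch of that class of error in plain C++, independent of Thrust and CUB. The names `BrokenStorage`, `FixedStorage`, and `sum_items` are hypothetical and exist only for illustration; this is not the actual Thrust code or its upstream fix.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a library-internal buffer: an array of T.
// Instantiating it with a reference type is ill-formed, because C++
// forbids arrays of references (the diagnostic seen in the log).
template <typename T, std::size_t N>
struct BrokenStorage {
  T items[N];
};

// One common remedy: strip the reference and cv-qualifiers before using
// the deduced type as an array element type.
template <typename T, std::size_t N>
struct FixedStorage {
  using value_type = typename std::remove_cv<
      typename std::remove_reference<T>::type>::type;
  value_type items[N];
};

// BrokenStorage<const int&, 4> broken;  // would fail: array of references

static_assert(
    std::is_same<FixedStorage<const int&, 4>::value_type, int>::value,
    "reference and const are stripped from the element type");

// Sums the buffer to show that the fixed storage holds plain values.
inline int sum_items(const FixedStorage<const int&, 4>& s) {
  int total = 0;
  for (int v : s.items) total += v;
  return total;
}
```

Uncommenting the `BrokenStorage<const int&, 4>` line should reproduce the same kind of diagnostic with an ordinary host compiler, which can help confirm that the failure depends on the deduced type rather than on the GPU architecture flags.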

@cwsmith
Contributor

cwsmith commented Jun 4, 2021

I too created an issue for this: SCOREC#14

@ibaned reported this to the Thrust team, and it is supposed to be fixed in CUDA 11.3.

@ibaned
Collaborator

ibaned commented Jun 4, 2021

Fundamentally, CUDA 11.2 is unusable for us. Please use either earlier or newer versions.
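
A project that wants to turn this advice into an early, readable failure could add a compile-time guard. The sketch below is an assumption about how such a guard might look and is not part of Omega_h; `is_affected_cuda` is a hypothetical helper, while `__CUDACC_VER_MAJOR__` and `__CUDACC_VER_MINOR__` are the version macros that nvcc defines.

```cpp
// Hypothetical helper: true for the CUDA 11.2.x toolkits discussed in
// this thread, false for every other version.
constexpr bool is_affected_cuda(int major, int minor) {
  return major == 11 && minor == 2;
}

// The check only runs when the file is compiled by nvcc, which defines
// these macros; a plain host compiler skips it.
#if defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
static_assert(
    !is_affected_cuda(__CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__),
    "CUDA 11.2 fails to build this code; use an earlier or newer toolkit");
#endif
```

With a guard like this, the build would stop with a single message instead of the long template instantiation trace shown above.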
