
compatibility issue between the PyTorch version, CUDA version, and possibly the GCC version #120

Open
PavelApis opened this issue Oct 25, 2024 · 0 comments

PavelApis commented Oct 25, 2024

Hi!

I'd be very grateful for any help!

I'm running into an error where a CUDA kernel fails to compile with nvcc. It looks like there's an incompatibility between the PyTorch, CUDA, and GCC versions.

I've got (the snippet after this list is how I'm checking the CUDA side):
nvcc: 11.5, V11.5.119
gcc: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
torch: 1.12.0+cu116
python: 3.10.15
ninja: 1.10.2
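
Quick check of which toolkit each side is using (a small, project-independent snippet): torch.version.cuda reports 11.6 for this wheel, while the system nvcc is 11.5.

import torch
from torch.utils.cpp_extension import CUDA_HOME

# Which CUDA version the installed torch wheel was built against,
# and which toolkit root the JIT build will invoke nvcc from.
print(torch.__version__)   # 1.12.0+cu116
print(torch.version.cuda)  # 11.6
print(CUDA_HOME)           # apparently /usr here (the build calls /usr/bin/nvcc)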

Since I'm running this on a shared server, I don't have root permissions, so I can't downgrade the nvcc version or install python3.10-dev.

Switching to torch 1.12.0+cu113 or to gcc 9/10 doesn't help either.
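
For context, the only non-root knob I can see for changing which host compiler nvcc uses is its -ccbin flag, which can be passed through extra_cuda_cflags when loading the extension. A minimal sketch of that (the gcc-10 path below is hypothetical; I'd have to build such a compiler in user space):

import os
from torch.utils.cpp_extension import load

module_path = os.path.dirname(__file__)

fused = load(
    "fused",
    sources=[
        os.path.join(module_path, "fused_bias_act.cpp"),
        os.path.join(module_path, "fused_bias_act_kernel.cu"),
    ],
    # Hypothetical path: point nvcc at a specific host g++ instead of the
    # default /usr/bin/g++ (GCC 11).
    extra_cuda_cflags=["-ccbin=/home/username/tools/gcc-10/bin/g++"],
    verbose=True,
)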

File ~/my_project/notebook/../models/psp/stylegan2/op/fused_act.py:8
      6 from torch.utils.cpp_extension import load
      7 module_path = os.path.dirname(__file__)
----> 8 fused = load(
      9     "fused",
     10     sources=[
     11         os.path.join(module_path, "fused_bias_act.cpp"),
     12         os.path.join(module_path, "fused_bias_act_kernel.cu"),
     13     ],
     14 )
     16 class FusedLeakyReLUFunctionBackward(Function):
     17     @staticmethod
     18     def forward(ctx, grad_output, out, negative_slope, scale):

File ~/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:1202, in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1111 def load(name,
   1112          sources: Union[str, List[str]],
   1113          extra_cflags=None,
   (...)
   1121          is_standalone=False,
   1122          keep_intermediates=True):
   1123     r'''
   1124     Loads a PyTorch C++ extension just-in-time (JIT).
   1125 
   (...)
   1200                 verbose=True)
   1201     '''
-> 1202     return _jit_compile(
   1203         name,
   1204         [sources] if isinstance(sources, str) else sources,
   1205         extra_cflags,
   1206         extra_cuda_cflags,
   1207         extra_ldflags,
   1208         extra_include_paths,
   1209         build_directory or _get_build_directory(name, verbose),
   1210         verbose,
   1211         with_cuda,
   1212         is_python_module,
   1213         is_standalone,
   1214         keep_intermediates=keep_intermediates)

File ~/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:1425, in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1421                 hipified_sources.add(hipify_result[s_abs]["hipified_path"] if s_abs in hipify_result else s_abs)
   1423             sources = list(hipified_sources)
-> 1425         _write_ninja_file_and_build_library(
   1426             name=name,
   1427             sources=sources,
   1428             extra_cflags=extra_cflags or [],
   1429             extra_cuda_cflags=extra_cuda_cflags or [],
   1430             extra_ldflags=extra_ldflags or [],
   1431             extra_include_paths=extra_include_paths or [],
   1432             build_directory=build_directory,
   1433             verbose=verbose,
   1434             with_cuda=with_cuda,
   1435             is_standalone=is_standalone)
   1436 finally:
   1437     baton.release()

File ~/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:1537, in _write_ninja_file_and_build_library(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_standalone)
   1535 if verbose:
   1536     print(f'Building extension module {name}...')
-> 1537 _run_ninja_build(
   1538     build_directory,
   1539     verbose,
   1540     error_prefix=f"Error building extension '{name}'")

File ~/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:1824, in _run_ninja_build(build_directory, verbose, error_prefix)
   1822 if hasattr(error, 'output') and error.output:  # type: ignore[union-attr]
   1823     message += f": {error.output.decode(*SUBPROCESS_DECODE_ARGS)}"  # type: ignore[union-attr]
-> 1824 raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused': [1/2] /usr/bin/nvcc  -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -isystem /home/username/.local/lib/python3.10/site-packages/torch/include -isystem /home/username/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/username/.local/lib/python3.10/site-packages/torch/include/TH -isystem /home/username/.local/lib/python3.10/site-packages/torch/include/THC -isystem /home/username/anaconda3/envs/sfe/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /home/username/my_project/models/psp/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o 
FAILED: fused_bias_act_kernel.cuda.o 
/usr/bin/nvcc  -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -isystem /home/username/.local/lib/python3.10/site-packages/torch/include -isystem /home/username/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/username/.local/lib/python3.10/site-packages/torch/include/TH -isystem /home/username/.local/lib/python3.10/site-packages/torch/include/THC -isystem /home/username/anaconda3/envs/sfe/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /home/username/my_project/models/psp/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o 
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
  435 |         function(_Functor&& __f)
      |                                                                                                                                                 ^ 
/usr/include/c++/11/bits/std_function.h:435:145: note:         ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
  530 |         operator=(_Functor&& __f)
      |                                                                                                                                                  ^ 
/usr/include/c++/11/bits/std_function.h:530:146: note:         ‘_ArgTypes’
ninja: build stopped: subcommand failed.
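
For what it's worth, the failure happens while nvcc is parsing GCC 11's libstdc++ headers (std_function.h), which are pulled in through the torch headers, so I'd expect any JIT-built CUDA extension to fail the same way on this setup. A minimal, project-independent reproduction attempt (a sketch using load_inline; module and function names are placeholders):

import torch
from torch.utils.cpp_extension import load_inline

# Builds a trivial CUDA extension to check whether the failure is in the
# toolchain (nvcc + libstdc++ headers) rather than in the stylegan2 op sources.
cuda_src = r"""
__global__ void noop_kernel() {}

void noop() {
    noop_kernel<<<1, 1>>>();
}
"""
cpp_src = "void noop();"

mod = load_inline(
    name="toolchain_check",
    cpp_sources=cpp_src,
    cuda_sources=cuda_src,
    functions=["noop"],
    verbose=True,
)
mod.noop()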