sam 2 install failed "libcudnn.so.9: cannot open shared object file: No such file or directory" but libcudnn.so.9 is installed #129

Closed
loeweh opened this issue Aug 3, 2024 · 2 comments


loeweh commented Aug 3, 2024

Hi,

This is a follow-up to issue #112. I cannot reopen that issue because I am not a repo collaborator (see https://stackoverflow.com/questions/21333654/how-to-re-open-an-issue-in-github).

I followed your advice. Some other libraries were then not found, so I tried the same fix for each of them:

    export LD_LIBRARY_PATH=./venv/lib/python3.10/site-packages/nvidia/cudnn/lib:${LD_LIBRARY_PATH}
    pip install -e .
    find . -name libcupti.so.12
    export LD_LIBRARY_PATH=./venv/lib/python3.10/site-packages/nvidia/cuda_cupti/lib:${LD_LIBRARY_PATH}
    pip install -e .
    find . -name libnccl.so.2
    export LD_LIBRARY_PATH=./venv/lib/python3.10/site-packages/nvidia/nccl/lib/:${LD_LIBRARY_PATH}
    find . -name libnccl.so.2
    pip install -e .
    sudo dnf install g++
    pip install -e .
    find . -iname python.h
    cd ./venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/
    ln -sf python.h Python.h
    cd -
    pip install -e .
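
A side note on the exports above: they add one relative path per missing library, and a relative entry in LD_LIBRARY_PATH stops resolving as soon as the working directory changes. A minimal sketch that collects every nvidia/*/lib directory under a site-packages folder as absolute paths could look like this. The helper names nvidia_lib_dirs and build_ld_library_path are made up for illustration, and the nvidia/<package>/lib layout is assumed from the paths in the commands above.

```python
from pathlib import Path


def nvidia_lib_dirs(site_packages):
    """Return absolute paths of all nvidia/<package>/lib directories.

    Assumes the layout seen in the commands above, e.g.
    site-packages/nvidia/cudnn/lib (an assumption, not a guarantee).
    """
    root = Path(site_packages) / "nvidia"
    return sorted(str(p.resolve()) for p in root.glob("*/lib") if p.is_dir())


def build_ld_library_path(site_packages, existing=""):
    """Join the discovered directories, keeping any existing value last."""
    parts = nvidia_lib_dirs(site_packages)
    if existing:
        parts.append(existing)
    return ":".join(parts)
```

The result of build_ld_library_path("./venv/lib/python3.10/site-packages", os.environ.get("LD_LIBRARY_PATH", "")) could then be exported in the shell before running pip install -e . again.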

After that, the install kept running until it hit this error:

   /usr/local/cuda-12.2/include/crt/host_config.h:136:2: error: #error -- unsupported GNU version! gcc versions later than 12 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
        136 | #error -- unsupported GNU version! gcc versions later than 12 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
            |  ^~~~~
      In file included from /tmp/pip-build-env-uyr1owwh/overlay/lib64/python3.10/site-packages/torch/include/torch/csrc/Device.h:4,
                       from /tmp/pip-build-env-uyr1owwh/overlay/lib64/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                       from /tmp/pip-build-env-uyr1owwh/overlay/lib64/python3.10/site-packages/torch/include/torch/extension.h:9,
                       from sam2/csrc/connected_components.cu:12:
      /tmp/pip-build-env-uyr1owwh/overlay/lib64/python3.10/site-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
         12 | #include <Python.h>
            |          ^~~~~~~~~~
      compilation terminated.
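
For context on the second error in this log: the `#include <Python.h>` in torch's python_headers.h refers to the CPython C header, which is a different file from torch/python.h, so a symlink inside the torch include tree is unlikely to satisfy it. The log paths also show the build running in a temporary /tmp/pip-build-env-... overlay rather than in ./venv. A minimal sketch, using only the standard library, to check whether the interpreter's own headers are present:

```python
import sysconfig
from pathlib import Path


def python_header_path():
    """Location where this interpreter expects its C header Python.h."""
    return Path(sysconfig.get_paths()["include"]) / "Python.h"


def has_python_headers():
    """True if Python.h exists for the running interpreter."""
    return python_header_path().is_file()


if __name__ == "__main__":
    print(python_header_path(), "found" if has_python_headers() else "missing")
```

If this reports the header as missing, the usual cause is that the distribution's Python development package is not installed (on Fedora this is typically python3-devel).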

It seems that my gcc version (gcc.x86_64 13.3.1-1.fc39) is not supported. Sadly, I cannot downgrade.

Installed Packages
gcc.x86_64                        13.3.1-1.fc39                         @updates
Available Packages
gcc.x86_64                        13.2.1-3.fc39                         fedora  
gcc.x86_64                        13.3.1-1.fc39   
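
The host_config.h message above says that this CUDA 12.2 install rejects gcc versions later than 12. A small sketch to check the host compiler before starting a long build follows. The limit of 12 is taken from that error message only, and the helper names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_gcc_major(version_text):
    """Extract the major version from text such as '13.3.1' or '13'."""
    match = re.search(r"(\d+)(?:\.\d+)*", version_text)
    return int(match.group(1)) if match else None


def gcc_major(gcc="gcc"):
    """Major version of the given gcc binary, or None if it is not on PATH."""
    if shutil.which(gcc) is None:
        return None
    result = subprocess.run(
        [gcc, "-dumpversion"], capture_output=True, text=True, check=False
    )
    return parse_gcc_major(result.stdout)


def host_compiler_ok(major, max_supported=12):
    """max_supported=12 comes from the CUDA 12.2 error message above."""
    return major is not None and major <= max_supported
```

With gcc 13.3.1 installed, host_compiler_ok(gcc_major()) would be False, which matches the failure reported by nvcc.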

Any idea when, or if, gcc 13 will be supported so that I can run SAM 2 on Fedora?
Best Regards

Heiko

@ronghanghu
Contributor

Hi @loeweh, the gcc version on your system appears to be incompatible with building the CUDA kernel.

For video applications, if the compiler version issue cannot be resolved, you can unblock yourself by skipping the SAM 2 kernel building step:

  1. Comment out the line ext_modules=get_extensions(), in https://github.com/facebookresearch/segment-anything-2/blob/57bc94b7391e47e5968004a0698f8bf793a544d1/setup.py#L70
  2. Replace the fill_holes_in_mask_scores function in the lines https://github.com/facebookresearch/segment-anything-2/blob/57bc94b7391e47e5968004a0698f8bf793a544d1/sam2/utils/misc.py#L216-L227
    with a dummy one as follows:

    def fill_holes_in_mask_scores(mask, max_area):
        # do nothing here since we don't have a CUDA kernel for connected components
        return mask

This workaround disables the optional video post-processing step (small hole filling on GPUs) in SAM 2 and should unblock you for video applications (without the post-processing step using this CUDA kernel, it should still work in most cases).

However, the CUDA kernel above is also needed for automatic mask generation in static image applications of SAM 2. If your use case involves automatic mask generation, you would need to manually replace the get_connected_components function in the lines https://github.com/facebookresearch/segment-anything-2/blob/57bc94b7391e47e5968004a0698f8bf793a544d1/sam2/utils/misc.py#L47-L63 with, for example, an OpenCV implementation based on cv2.connectedComponentsWithStats.

@ronghanghu
Contributor

Follow-up: we recently made the CUDA extension build step optional (in #155) as a workaround for this problem.

You can pull the latest code and reinstall via

# run the lines below inside the SAM 2 repo
git pull;
pip uninstall -y SAM-2;
rm -f sam2/*.so;
pip install -e ".[demo]"

This allows using SAM 2 without the CUDA extension (the results should stay the same in most cases; see INSTALL.md for details).
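
To see after reinstalling whether the optional extension was actually built, a quick import check along these lines may help. The module name sam2._C is an assumption about how the compiled extension is exposed, so adjust it if your checkout differs.

```python
import importlib


def extension_available(module_name="sam2._C"):
    """Return True if the given compiled module can be imported.

    The default name sam2._C is an assumption, not confirmed by this thread.
    """
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError):
        # ImportError: module not built or not installed.
        # OSError: a shared library it depends on could not be loaded.
        return False
    return True


if __name__ == "__main__":
    print("CUDA extension available:", extension_available())
```

A result of False is expected when the extension build was skipped, in which case SAM 2 runs without the CUDA post-processing step.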
