Issues using OpenHands w/ PyTorch + CUDA #4230

neubig · 2024-10-06T15:11:47Z

Hi @neubig, you can refer to https://github.com/SmartManoj/Kevin/issues/65

The bug is this: if you install Anaconda in the image (using Docker), the agent needs to run conda init first. conda init requires the user to close and reopen the terminal, but the agent cannot do that. So I tried letting the agent run source ~/.bashrc instead, and then it gets stuck.

# Stage 1: Prepare GPU environment based on NVIDIA CUDA image
FROM nvidia/cuda:12.6.1-devel-ubi8 AS cuda

# Stage 2: Use specified image
FROM ghcr.io/all-hands-ai/runtime:0.9-nikolaik

# Set non-interactive frontend
ENV DEBIAN_FRONTEND=noninteractive

# Install necessary dependencies, including sudo, g++, build-essential, and other common tools
RUN apt-get update && \
    apt-get install -y sudo g++ build-essential wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 git && \
    apt-get clean

# Copy CUDA toolchain and libraries from CUDA image
COPY --from=cuda /usr/local/cuda /usr/local/cuda

# Set CUDA environment variables
ENV PATH="/usr/local/cuda/bin:${PATH}"

# Explicitly initialize LD_LIBRARY_PATH to avoid undefined variable warnings
ENV LD_LIBRARY_PATH="/usr/local/cuda/lib64"

# Download and install specified version of Anaconda
RUN wget https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh -O anaconda.sh && \
    bash anaconda.sh -b -p /opt/conda && \
    rm anaconda.sh

# Set conda environment variables
ENV PATH="/opt/conda/bin:${PATH}"

# Initialize conda
RUN conda init bash

# Create a new conda environment and install PyTorch
RUN conda create -n pytorch_env python=3.9 -y && \
    . /opt/conda/etc/profile.d/conda.sh && \
    conda activate pytorch_env && \
    conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia -y

# Set default conda environment
RUN echo "conda activate pytorch_env" >> ~/.bashrc

# Clean cache to reduce image size
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && conda clean -a -y

# Set working directory
WORKDIR /workspace

# Set default command
CMD ["/bin/bash"]

# Initialize conda
RUN conda init

# Set default conda environment
RUN echo "conda activate pytorch_env" >> ~/.bashrc
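
A possible way around the terminal restart, sketched in Python under stated assumptions: instead of relying on conda init and a fresh shell, a caller can source conda's profile script inside the same bash invocation, which is what the Dockerfile above already does in its own RUN step. The helper name conda_shell_command is hypothetical; the /opt/conda prefix and the pytorch_env name come from the Dockerfile. Whether the OpenHands agent can be steered to run commands this way is not confirmed here.

```python
import shlex


def conda_shell_command(env_name, command, conda_prefix="/opt/conda"):
    """Build an argv that runs `command` inside a conda env in one bash call.

    Hypothetical helper: it sources conda.sh explicitly, so this single
    invocation needs neither `conda init` nor a restarted terminal.
    """
    script = " && ".join([
        f". {shlex.quote(conda_prefix + '/etc/profile.d/conda.sh')}",
        f"conda activate {shlex.quote(env_name)}",
        command,  # passed through as-is; the caller is responsible for quoting
    ])
    return ["bash", "-c", script]


if __name__ == "__main__":
    argv = conda_shell_command("pytorch_env", "python --version")
    print(shlex.join(argv))
```

The returned list can be handed to subprocess.run; nothing is executed by the helper itself.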

Originally posted by @x66ccff in #2178 (comment)


neubig · 2024-10-06T15:23:52Z

Thanks @x66ccff, I opened a separate issue so we can track this.

Just to clarify, are you following our custom sandbox guide?
https://docs.all-hands.dev/modules/usage/how-to/custom-sandbox-guide

And also, is your use case that you would like to develop programs with CUDA+PyTorch?
I'm just trying to better understand the situation so we can recommend the easiest way to fix the issue.

x66ccff · 2024-10-06T15:32:46Z

Just to clarify, are you following our custom sandbox guide? https://docs.all-hands.dev/modules/usage/how-to/custom-sandbox-guide

Hmm, I didn't use config.toml at any point in the process, and I'm not sure whether that is correct. I just built a new Docker image from the Dockerfile above and then used it to start OpenHands.

It seems editing config.toml is only necessary when building from source, not when running with Docker?

docker run -it --pull=always \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=kk-openhands-pytorch \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -v $WORKSPACE_BASE:/opt/workspace_base \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
  ghcr.io/all-hands-ai/openhands:0.9

And also, is your use case that you would like to develop programs with CUDA+PyTorch? I'm just trying to better understand the situation so we can recommend the easiest way to fix the issue.

Yeah, but I'm not very familiar with Docker. Any advice would be appreciated!

neubig · 2024-10-06T15:54:16Z

OK, I think that there might be an easier way for you to do this. Specifically, you could probably start working directly from one of the official pytorch docker images. You can pick which one best matches your expected version of PyTorch and do something like the following:

docker run -it --pull=always \
  -e SANDBOX_BASE_CONTAINER_IMAGE=pytorch/pytorch:2.4.1-cuda11.8-cudnn9-runtime \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -v $WORKSPACE_BASE:/opt/workspace_base \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
  ghcr.io/all-hands-ai/openhands:0.9

Note that I changed SANDBOX_RUNTIME_CONTAINER_IMAGE to SANDBOX_BASE_CONTAINER_IMAGE, which will allow OpenHands to install its necessary software on top of the base image.

That may just work for your purposes, but if you still have issues I can help!
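
To make the difference between the two variables easier to see, here is a minimal Python sketch that assembles the same docker run invocation with either one. The function name openhands_run_argv and its parameters are hypothetical; the flags and variable names are the ones used in this thread. The sketch only builds the argument list and does not check whether a given OpenHands release honors each variable.

```python
import os
import shlex


def openhands_run_argv(image_var, image, workspace, port=3000, tag="0.9", user_id=None):
    """Assemble the `docker run` argv from this thread (hypothetical helper).

    image_var is SANDBOX_BASE_CONTAINER_IMAGE (OpenHands installs its software
    on top of the image, per the comment above) or
    SANDBOX_RUNTIME_CONTAINER_IMAGE. The --name flag is omitted for brevity.
    """
    allowed = ("SANDBOX_BASE_CONTAINER_IMAGE", "SANDBOX_RUNTIME_CONTAINER_IMAGE")
    if image_var not in allowed:
        raise ValueError(f"unexpected variable: {image_var}")
    if user_id is None:
        user_id = os.geteuid()  # effective UID, as printed by `id -u`
    return [
        "docker", "run", "-it", "--pull=always",
        "-e", f"{image_var}={image}",
        "-e", f"SANDBOX_USER_ID={user_id}",
        "-e", f"WORKSPACE_MOUNT_PATH={workspace}",
        "-v", f"{workspace}:/opt/workspace_base",
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        "-p", f"{port}:3000",
        "--add-host", "host.docker.internal:host-gateway",
        f"ghcr.io/all-hands-ai/openhands:{tag}",
    ]


if __name__ == "__main__":
    argv = openhands_run_argv(
        "SANDBOX_BASE_CONTAINER_IMAGE",
        "pytorch/pytorch:2.4.1-cuda11.8-cudnn9-runtime",
        os.environ.get("WORKSPACE_BASE", "/tmp/workspace"),
    )
    print(shlex.join(argv))
```

Printing with shlex.join gives a single, correctly quoted line that can be pasted into a shell.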

x66ccff · 2024-10-07T04:06:17Z

Sigh, I've tried many commands, and every attempt to use PyTorch together with OpenHands failed. I also tried creating a Docker image from a running OpenHands container first; that fails too.

I have tried these commands:

❌

 docker run --gpus all -it --pull=always   \
  -e SANDBOX_BASE_CONTAINER_IMAGE=pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel \
  -e SANDBOX_USER_ID=$(id -u)    \
 -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE  \
   -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock  \
   -p 7127:3000   \
  --add-host host.docker.internal:host-gateway  \
   --name openhands-app-$(date +%Y%m%d%H%M%S)  \
   ghcr.io/all-hands-ai/openhands:0.9

❌

 docker run --gpus all -it --pull=always   \
  -e SANDBOX_BASE_CONTAINER_IMAGE=pytorch/pytorch:2.4.1-cuda12.4-cudnn9-runtime \
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.9-nikolaik \
  -e SANDBOX_USER_ID=$(id -u)    \
 -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE  \
   -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock  \
   -p 7127:3000   \
  --add-host host.docker.internal:host-gateway  \
   --name openhands-app-$(date +%Y%m%d%H%M%S)  \
   ghcr.io/all-hands-ai/openhands:0.9

❌

 docker run --gpus all -it --pull=always   \
  -e SANDBOX_BASE_CONTAINER_IMAGE=pytorch/pytorch:2.4.1-cuda12.4-cudnn9-runtime \
  -e SANDBOX_USER_ID=$(id -u)    \
 -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE  \
   -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock  \
   -p 7127:3000   \
  --add-host host.docker.internal:host-gateway  \
   --name openhands-app-$(date +%Y%m%d%H%M%S)  \
   ghcr.io/all-hands-ai/openhands:0.9.7

❌

 docker run --gpus all -it --pull=always   \
  -e SANDBOX_BASE_CONTAINER_IMAGE=kk_openhands_pytorch_nvidiactk \
  -e SANDBOX_USER_ID=$(id -u)    \
 -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE  \
   -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock  \
   -p 7127:3000   \
  --add-host host.docker.internal:host-gateway  \
   --name openhands-app-$(date +%Y%m%d%H%M%S)  \
   ghcr.io/all-hands-ai/openhands:0.9

❌

docker run -it --gpus all --pull=always \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=kk_openhands_pytorch_nvidiactk \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -v $WORKSPACE_BASE:/opt/workspace_base \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 7127:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
  ghcr.io/all-hands-ai/openhands:0.9

Here I encountered three types of errors:

When using pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel, I run into:

03:50:19 - openhands:ERROR: docker.py:130 - Python executable not found: [Errno 2] No such file or directory: 'docker'
03:50:19 - openhands:ERROR: runtime_build.py:383 - Sandbox image build failed: [Errno 2] No such file or directory: 'docker'
03:50:19 - openhands:ERROR: agent_session.py:194 - Runtime initialization failed: [Errno 2] No such file or directory: 'docker'
03:50:19 - openhands:ERROR: agent_session.py:84 - Error starting session: [Errno 2] No such file or directory: 'docker'

When using kk_openhands_pytorch_nvidiactk, which is a Docker image built with a Dockerfile like this:

# Stage 1: Configure PyTorch and CUDA environment based on PyTorch image
FROM pytorch/pytorch:2.4.1-cuda12.4-cudnn9-runtime AS pytorch

# Set non-interactive frontend
ENV DEBIAN_FRONTEND=noninteractive

# Install necessary dependencies, including sudo, g++, wget, and other common tools
RUN apt-get update && \
    apt-get install -y sudo g++ wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 git && \
    apt-get clean

# Stage 2: Use the all-hands-ai image and copy the PyTorch and CUDA environment
FROM ghcr.io/all-hands-ai/runtime:0.9-nikolaik

# Set non-interactive frontend
ENV DEBIAN_FRONTEND=noninteractive

# Install necessary dependencies, including sudo, g++, build-essential, wget, bzip2, and other common tools
RUN apt-get update && \
    apt-get install -y sudo g++ build-essential wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 git && \
    apt-get clean

# Copy necessary PyTorch environment from the PyTorch image
COPY --from=pytorch /opt/conda /opt/conda

# Set Conda environment variables
ENV PATH="/opt/conda/bin:${PATH}"

# Initialize conda
RUN conda init bash

# Create a new conda environment and install PyTorch (if updates or modifications are needed)
RUN conda create -n pytorch_env python=3.9 -y && \
    . /opt/conda/etc/profile.d/conda.sh && \
    conda activate pytorch_env && \
    conda install pytorch torchvision torchaudio -c pytorch -y

# Add all users to sudoers and allow passwordless sudo
RUN echo "ALL ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

# Set the default conda environment
RUN echo "conda activate pytorch_env" >> ~/.bashrc

# Clean up caches to reduce image size
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && conda clean -a -y

# Set the working directory
WORKDIR /workspace

# Set the default command
CMD ["/bin/bash"]

I ran docker run --rm --gpus all kk_openhands_pytorch nvidia-smi to test this image, and it prints the following successfully:

Mon Oct  7 04:01:29 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:17:00.0 Off |                  N/A |
| 75%   34C    P8             26W /  250W |   20960MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
....

However, when I use this Docker image together with OpenHands, even with the --gpus all option, I always get:

nvidia-smi command not found

If I clone the OpenHands instance, manually install the NVIDIA tools inside the image, and then use that image again with OpenHands, I get:

$ nvidia-smi
Failed to initialize NVML: Unknown Error
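
The two GPU failure modes above (the binary is missing versus the binary exists but cannot reach the driver) can be told apart with a small probe. This is a minimal sketch; probe_gpu is a hypothetical name, and it relies only on nvidia-smi -L, which lists the GPUs the tool can see.

```python
import shutil
import subprocess


def probe_gpu(tool="nvidia-smi"):
    """Return (ok, detail) describing whether `tool` can list GPUs."""
    exe = shutil.which(tool)
    if exe is None:
        # The "command not found" case: the binary is absent from PATH.
        return False, f"{tool}: command not found"
    proc = subprocess.run([exe, "-L"], capture_output=True, text=True)
    if proc.returncode != 0:
        # The binary exists but fails, e.g. with an NVML initialization error.
        return False, (proc.stdout + proc.stderr).strip()
    return True, proc.stdout.strip()


if __name__ == "__main__":
    ok, detail = probe_gpu()
    print("GPU visible" if ok else "GPU not visible", "->", detail)
```

Running it both on the host and inside the sandbox container shows at which layer the GPU disappears.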

neubig · 2024-10-07T04:37:32Z

OK, thanks for the detailed report. We'll try to figure this out ASAP!

xingyaoww · 2024-10-07T12:22:59Z

Are you hitting these errors when using the docker run command?
If so, can you actually try using make run from https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md to launch it instead?

03:50:19 - openhands:ERROR: docker.py:130 - Python executable not found: [Errno 2] No such file or directory: 'docker'
03:50:19 - openhands:ERROR: runtime_build.py:383 - Sandbox image build failed: [Errno 2] No such file or directory: 'docker'
03:50:19 - openhands:ERROR: agent_session.py:194 - Runtime initialization failed: [Errno 2] No such file or directory: 'docker'
03:50:19 - openhands:ERROR: agent_session.py:84 - Error starting session: [Errno 2] No such file or directory: 'docker'

The "docker not found" issue is likely related to docker not being installed "inside" the app container (e.g., ghcr.io/all-hands-ai/openhands).

Related convo here: https://openhands-ai.slack.com/archives/C078L0FUGUX/p1727972321900899?thread_ts=1727917602.417709&cid=C078L0FUGUX

cc @mamoodi if he has any idea
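
For readers unfamiliar with the error text: "[Errno 2] No such file or directory: 'docker'" is what Python reports when a process tries to launch an executable that is not on PATH. The sketch below reproduces that behavior in isolation; try_run is a hypothetical helper and is not taken from the OpenHands code.

```python
import errno
import subprocess


def try_run(argv):
    """Run argv and return (ok, errno_or_None, detail)."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except FileNotFoundError as exc:
        # errno 2 (ENOENT): the executable itself could not be found.
        return False, exc.errno, str(exc)
    return proc.returncode == 0, None, proc.stdout.strip()


if __name__ == "__main__":
    ok, code, detail = try_run(["docker", "--version"])
    if code == errno.ENOENT:
        print("docker CLI is not installed in this environment:", detail)
    else:
        print("docker CLI found:", detail)
```

Run inside the app container, this would show whether the docker CLI is actually available there.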

mamoodi · 2024-10-07T14:37:10Z

Setting the sandbox base container image (SANDBOX_BASE_CONTAINER_IMAGE) is not yet supported through the docker command:
#4220

You must use the development workflow until that is implemented.

mamoodi · 2024-10-10T12:44:45Z

@x66ccff we merged in a change. Can you try running with:
ghcr.io/all-hands-ai/openhands:main
and see if it works now?

With your method 1 ("when using pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel, i will run into"), you shouldn't get those errors anymore.

I'll just update the documentation and close this hopefully.

x66ccff · 2024-10-10T13:00:42Z

@mamoodi hi, i got this error

(openhands) kent@kent-Super-Server:~/_Project/openhands$  docker run --gpus all -it --pull=always   \
  -e SANDBOX_BASE_CONTAINER_IMAGE=pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel \
  -e SANDBOX_USER_ID=$(id -u)    \
 -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE  \
   -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock  \
   -p 7127:3000   \
  --add-host host.docker.internal:host-gateway  \
   --name openhands-app-$(date +%Y%m%d%H%M%S)  \
   ghcr.io/all-hands-ai/openhands:main

main: Pulling from all-hands-ai/openhands
Digest: sha256:d3a4ed8b661b3a0e65830e97411ad18eb7f0e519e0f7442b4fb5c97042d917a6
Status: Image is up to date for ghcr.io/all-hands-ai/openhands:main
Starting OpenHands...
Setting up enduser with id 1000
Docker socket group id: 983
Creating group with id 983
groupadd: group 'docker' already exists

Initial pull

(openhands) kent@kent-Super-Server:~/_Project/openhands$  docker run --gpus all -it --pull=always   \
  -e SANDBOX_BASE_CONTAINER_IMAGE=pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel \
  -e SANDBOX_USER_ID=$(id -u)    \
 -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE  \
   -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock  \
   -p 7127:3000   \
  --add-host host.docker.internal:host-gateway  \
   --name openhands-app-$(date +%Y%m%d%H%M%S)  \
   ghcr.io/all-hands-ai/openhands:main
main: Pulling from all-hands-ai/openhands
09f376ebb190: Already exists
276709cbedc1: Already exists
2e133733af76: Already exists
ded8879d9a79: Already exists
3cf9507408dc: Already exists
0edbbd94250f: Already exists
e494d0176b10: Already exists
5fa558507e5e: Already exists
231ac2d3b56e: Pull complete
53f50c4d33b4: Pull complete
d4013f78c628: Pull complete
09a13691b725: Pull complete
136ca1f6741b: Pull complete
c0af04535154: Pull complete
41e03b1234d4: Pull complete
d963e1867e31: Pull complete
d1a32222dcaf: Pull complete
381ad02f0168: Pull complete
fe1889fba864: Pull complete
4f4fb700ef54: Pull complete
33c03875d330: Pull complete
071cab0c0483: Pull complete
07afac7ced9b: Pull complete
8782a62765f5: Pull complete
1340567b9141: Pull complete
ba310f50634e: Pull complete
a51970467f4d: Pull complete
d8fb29726c90: Pull complete
a0d1c6768975: Pull complete
Digest: sha256:d3a4ed8b661b3a0e65830e97411ad18eb7f0e519e0f7442b4fb5c97042d917a6
Status: Downloaded newer image for ghcr.io/all-hands-ai/openhands:main
Starting OpenHands...
Setting up enduser with id 1000
Docker socket group id: 983
Creating group with id 983
groupadd: group 'docker' already exists

mamoodi · 2024-10-10T13:27:56Z

What's the error? Did you try accessing localhost:3000 now?

x66ccff · 2024-10-10T13:39:57Z

@mamoodi Well, the error is just groupadd: group 'docker' already exists, and the image cannot start.

Even when I try the originally recommended command:

docker run -it --pull=always     -e SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.9-nikolaik     -e SANDBOX_USER_ID=$(id -u)     -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE     -v $WORKSPACE_BASE:/opt/workspace_base     -v /var/run/docker.sock:/var/run/docker.sock     -p 7127:3000     --add-host host.docker.internal:host-gateway     --name openhands-app-$(date +%Y%m%d%H%M%S)     ghcr.io/all-hands-ai/openhands:0.9

0.9: Pulling from all-hands-ai/openhands
Digest: sha256:1488932730c3897bd3aa0b594fae0f19adabf6e12c01ec43325ad07f0ac3e179
Status: Image is up to date for ghcr.io/all-hands-ai/openhands:0.9

✔
This works fine. However, when I use openhands:main I get:

groupadd: group 'docker' already exists

❌

Did you try accessing localhost:3000 now?

Port 3000 is already used by another app, so I use 7127. But I don't think this is the reason, because version 0.9 works fine.

mamoodi · 2024-10-10T14:27:15Z

Hmmm.... I'm unsure. Looking at the bug description, Graham says:
"if you install anaconda in the image (using docker)". Where are you doing that? In the APP image?

x66ccff · 2024-10-10T14:29:09Z

@mamoodi

Hmmm.... I'm unsure. Looking at the bug description, Graham says: "if you install anaconda in the image (using docker)". Where are you doing that? In the APP image?

Yeah.

x66ccff · 2024-10-10T14:30:06Z

I just want the agent to be able to manage conda environments in the OpenHands image.

mamoodi · 2024-10-10T14:57:17Z

Something is going wrong here:

OpenHands/containers/app/entrypoint.sh

Lines 48 to 53 in 2d2d3cc

    
    if getent group $DOCKER_SOCKET_GID; then
      echo "Group with id $DOCKER_SOCKET_GID already exists"
    else
      echo "Creating group with id $DOCKER_SOCKET_GID"
      groupadd -g $DOCKER_SOCKET_GID docker
    fi

It seems like the if getent group $DOCKER_SOCKET_GID check is returning false even though the group exists, so the script tries to create it.

Are you able to exec into the container at all or it's not starting in your docker desktop? Just want to know what:
DOCKER_SOCKET_GID=$(stat -c '%g' /var/run/docker.sock)
getent group $DOCKER_SOCKET_GID;

returns
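
One possible reading of the log, offered as an assumption and not a confirmed diagnosis: getent group 983 looks a group up by GID, so it can legitimately find nothing while a group named docker already exists under a different GID. In that case groupadd -g 983 docker fails on the name, not on the GID. The Python sketch below separates the three cases; plan_docker_group and system_groups are hypothetical helpers.

```python
import grp


def system_groups():
    """Map GID -> group name for the groups visible on this machine."""
    return {g.gr_gid: g.gr_name for g in grp.getgrall()}


def plan_docker_group(gid, name, groups):
    """Classify what `groupadd -g <gid> <name>` would run into.

    groups maps GID -> group name, e.g. the result of system_groups().
    """
    if gid in groups:
        return "gid-exists"   # the branch taken when `getent group <gid>` succeeds
    if name in groups.values():
        return "name-taken"   # groupadd fails: the name exists under another GID
    return "create"           # neither GID nor name is in use


if __name__ == "__main__":
    # 983 is the socket GID from the log above.
    print(plan_docker_group(983, "docker", system_groups()))
```

If the container reported name-taken, the script would need a different group name or a groupmod step; which of those fits the entrypoint is left open here.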

x66ccff · 2024-10-10T15:13:13Z

@mamoodi no, It just won't start.

github-actions · 2024-11-12T01:57:15Z

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions · 2024-11-19T02:02:53Z

This issue was closed because it has been stalled for over 30 days with no activity.

BradKML · 2025-01-01T13:02:17Z

One thing I am trying to crack is getting OpenHands (within the Docker container) to access the GPU on the machine. The GPU often does not show up even when the NVIDIA packages work outside the container (and with --gpus=all), which makes the agent want to start a Docker instance inside the current Docker instance (Docker-ception). Without GPU access, OpenHands would have a hard time doing ML. @x66ccff thanks for all the testing, and I would like @SmartManoj to maybe take a peek at this.
P.S. somewhat related #1020

SmartManoj · 2025-01-01T13:21:03Z

@BradKML Could you pass the device_requests arg here like this and run in development mode?

BradKML · 2025-01-02T03:38:06Z

@SmartManoj how would that fit into containers.run? As in the device_requests parameter with [docker.types.DeviceRequest(device_ids=["0,2"], capabilities=[['gpu']])]? Why would device_ids default to something like ["0,2"]? (The default for a laptop GPU would be 0.)
P.S. why is docker compose not used like THIS or THIS?

    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]

P.S. thanks @neubig for bringing this issue back up

SmartManoj · 2025-01-02T05:17:50Z

@BradKML That's a typo. It should be ['0', '2'] which is not the default. The OP in that question used the 1st and 3rd GPUs only.

count (int): Number or devices to request. Optional.
    Set to -1 to request all available devices.
device_ids (list): List of strings for device IDs. Optional.
    Set either ``count`` or ``device_ids``.

Source

device_requests = [docker.types.DeviceRequest(
    count=-1,                   # Allocate all available GPUs
    capabilities=[['gpu']]      # Specify GPU capability
)]
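
The choice between count and device_ids described in the quoted docs can be sketched as follows. gpu_request_kwargs and build_device_requests are hypothetical helper names; docker.types.DeviceRequest is the class from the Docker SDK for Python that is used above, and it is imported lazily so the first helper works without the SDK installed.

```python
def gpu_request_kwargs(device_ids=None):
    """Keyword arguments for docker.types.DeviceRequest.

    With no IDs, request every GPU (count=-1); otherwise request only the
    listed devices. Per the quoted docs, set either count or device_ids.
    """
    kwargs = {"capabilities": [["gpu"]]}
    if device_ids is None:
        kwargs["count"] = -1
    else:
        # Each ID is its own string: ["0", "2"], not ["0,2"].
        kwargs["device_ids"] = [str(i) for i in device_ids]
    return kwargs


def build_device_requests(device_ids=None):
    """Return a list of DeviceRequest objects for the device_requests argument."""
    from docker.types import DeviceRequest  # needs the Docker SDK for Python

    return [DeviceRequest(**gpu_request_kwargs(device_ids))]


if __name__ == "__main__":
    print(gpu_request_kwargs())        # all GPUs
    print(gpu_request_kwargs([0, 2]))  # first and third GPU only
```

The list returned by build_device_requests is meant to be passed as device_requests=... to containers.run.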

BradKML · 2025-01-02T08:55:26Z

@SmartManoj sorry but then I hit this when I tried to make build in WSL #5529 https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md

SmartManoj · 2025-01-02T09:00:00Z

Are you using Devcontainer like in #5529? SmartManoj#122 (comment)

BradKML · 2025-01-02T09:01:24Z

@SmartManoj I ran make build directly instead of starting a devcontainer, but still got the errors. After commenting out the code it is functional now; proceeding with testing this.

BradKML · 2025-01-02T09:15:43Z

Sorry but got another error issue @SmartManoj https://pastebin.com/bGtBgSV2

SmartManoj · 2025-01-02T09:17:57Z

Could you comment out this line and check?

BradKML · 2025-01-02T09:39:32Z

Will check and see
Take a look at this error, will need to see where I need to import the class https://pastebin.com/sM45mNzc

SmartManoj · 2025-01-02T09:41:16Z

Could you replace it with docker.types.DeviceRequest?

BradKML · 2025-01-02T09:50:26Z

For 2 @SmartManoj this happened https://pastebin.com/5eQfMRMk

SmartManoj · 2025-01-02T09:51:59Z

Oops, it's plural device_requests.

BradKML · 2025-01-02T09:56:44Z

TypeError: Invalid type for device_requests param: expected list but found <class 'docker.types.containers.DeviceRequest'> when I added device_requests=docker.types.DeviceRequest(count=-1,capabilities=[['gpu']]), (note: haven't commented out the other line yet)

SmartManoj · 2025-01-02T09:57:51Z

The value must also be a list: device_requests=[docker.types.DeviceRequest(count=-1,capabilities=[['gpu']])]

BradKML · 2025-01-02T10:02:13Z

@SmartManoj welp made the change and no errors about that, but something else cropped up https://pastebin.com/1BGTNZkU

SmartManoj · 2025-01-02T10:05:45Z

Is this happening in the initial version too?

BradKML · 2025-01-02T10:09:37Z

Which version? That run did not have the _container_port line commented out. After commenting that line out, then loading and restarting the session (the first session is broken), another launch shows that it is functional. Testing whether the container spots the GPU.

Also, the GPU is now detected in the container, so thanks; the change needs to be added to the next version of OpenHands. But I don't get it, @SmartManoj: why comment out that specific line?

SmartManoj · 2025-01-02T10:30:42Z

Known bug #5943 #5964

Fixes All-Hands-AI#4230

BradKML · 2025-01-02T11:33:22Z

@neubig adding that one line is enough to fix the sandbox, I have to thank @SmartManoj for that

neubig · 2025-01-04T23:03:04Z

Thanks @BradKML !

@SmartManoj if you want to send a PR for this we'd welcome one: SmartManoj@a24a00d

Fixes All-Hands-AI#4230 (cherry picked from commit a24a00d)

neubig changed the title ~~Issues using OpenHands w/ Conda~~ Issues using OpenHands w/ PyTorch + CUDA Oct 6, 2024

mamoodi mentioned this issue Oct 9, 2024

Install docker in the OpenHands app image #4283

Merged

1 task

github-actions bot added the Stale Inactive for 30 days label Nov 12, 2024

github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 19, 2024

neubig removed the Stale Inactive for 30 days label Jan 2, 2025

neubig reopened this Jan 2, 2025

SmartManoj added a commit to SmartManoj/Kevin that referenced this issue Jan 2, 2025

Enable GPU in sandbox

a24a00d

Fixes All-Hands-AI#4230

SmartManoj mentioned this issue Jan 2, 2025

[Bug]: UI breaks when sandbox use_host_network = true SmartManoj/Kevin#163

Closed

neubig self-assigned this Jan 4, 2025

SmartManoj added a commit to SmartManoj/Kevin that referenced this issue Jan 5, 2025

Enable GPU in sandbox

d63c580

Fixes All-Hands-AI#4230 (cherry picked from commit a24a00d)

SmartManoj mentioned this issue Jan 5, 2025

feat: Add GPU support #6042

Merged

neubig closed this as completed in #6042 Jan 5, 2025

neubig mentioned this issue Jan 5, 2025

Fix issue #6046: Document GPU usage #6047

Closed
