RuntimeError: Tensor on device meta is not on the expected device cuda:0! #4

Open
bryanlinnan opened this issue Jan 10, 2025 · 7 comments


@bryanlinnan

Hi,
During the model.generate call in the script run_llava_mini.py, I hit this error:

RuntimeError: Tensor on device meta is not on the expected device cuda:0!

For reference, CUDA is 11.6 and the GPU has 8 GB of memory.

My pip list is as follows:
accelerate 0.29.0
addict 2.4.0
aiofiles 23.2.1
annotated-types 0.7.0
anyio 4.8.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.1
async-lru 2.0.4
attrs 24.2.0
babel 2.16.0
bitsandbytes 0.45.0
bleach 6.1.0
cachetools 5.5.0
certifi 2024.12.14
cffi 1.17.1
charset-normalizer 3.3.2
click 8.1.8
comm 0.2.2
debugpy 1.8.5
decord 0.6.0
deepspeed 0.12.6
defusedxml 0.7.1
descartes 1.1.0
docker-pycreds 0.4.0
einops 0.6.1
einops-exts 0.0.4
exceptiongroup 1.2.2
executing 2.1.0
fastapi 0.115.6
fastjsonschema 2.20.0
ffmpy 0.5.0
filelock 3.16.1
fire 0.6.0
fqdn 1.5.1
fsspec 2024.12.0
gitdb 4.0.12
GitPython 3.1.44
gnupg 2.3.1
gradio 5.9.1
gradio_client 1.5.2
h11 0.14.0
hjson 3.1.0
httpcore 1.0.7
httpx 0.28.1
huggingface-hub 0.27.1
idna 3.10
importlib_metadata 8.4.0
ipykernel 6.29.5
ipython 8.27.0
ipywidgets 8.1.5
isoduration 20.11.0
jedi 0.19.1
Jinja2 3.1.4
joblib 1.4.2
json5 0.9.25
jsonpointer 3.0.0
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
jupyter 1.1.1
jupyter_client 8.6.2
jupyter-console 6.6.3
jupyter_core 5.7.2
jupyter-events 0.10.0
jupyter-lsp 2.2.5
jupyter_server 2.14.2
jupyter_server_terminals 0.5.3
jupyterlab 4.2.5
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.3
jupyterlab_widgets 3.0.13
latex2mathml 3.77.0
llava_mini 1.0.0 /workspace/LLaVA-Mini-main
markdown-it-py 3.0.0
markdown2 2.5.2
MarkupSafe 2.1.5
matplotlib-inline 0.1.7
mdurl 0.1.2
mistune 3.0.2
mmcv-full 1.7.1
mmdet 2.28.2
mpmath 1.3.0
nbclient 0.10.0
nbconvert 7.16.4
nbformat 5.10.4
nest-asyncio 1.6.0
netifaces 0.11.0
networkx 3.4.2
ninja 1.11.1.3
notebook 7.2.2
notebook_shim 0.2.4
numpy 1.23.5
nuscenes-devkit 1.1.10
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-ml-py 12.560.30
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.6.85
nvidia-nvtx-cu12 12.1.105
opencv-python 4.10.0.84
orjson 3.10.14
overrides 7.7.0
packaging 24.2
pandas 2.2.3
pandocfilters 1.5.1
parso 0.8.4
peft 0.14.0
pillow 11.1.0
pip 24.2
platformdirs 4.2.2
prometheus_client 0.20.0
prompt_toolkit 3.0.47
protobuf 5.29.3
psutil 6.1.1
pure_eval 0.2.3
py-cpuinfo 9.0.0
pycocotools 2.0.8
pycparser 2.22
pycryptodomex 3.21.0
pydantic 2.10.4
pydantic_core 2.27.2
pydub 0.25.1
Pygments 2.19.1
pynvml 12.0.0
pypcd 0.1.1
pyquaternion 0.9.9
python-dateutil 2.9.0.post0
python-json-logger 2.0.7
python-lzf 0.2.6
python-multipart 0.0.20
pytz 2024.2
PyYAML 6.0.2
pyzmq 26.2.0
referencing 0.35.1
regex 2024.11.6
requests 2.32.3
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.9.4
rpds-py 0.20.0
ruff 0.8.6
safehttpx 0.1.6
safetensors 0.5.2
scikit-learn 1.2.2
scipy 1.15.0
semantic-version 2.10.0
Send2Trash 1.8.3
sentencepiece 0.1.99
sentry-sdk 2.19.2
setproctitle 1.3.4
setuptools 75.1.0
Shapely 1.8.5
shellingham 1.5.4
shortuuid 1.0.13
six 1.17.0
smmap 5.0.2
sniffio 1.3.1
stack-data 0.6.3
starlette 0.41.3
svgwrite 1.4.3
sympy 1.13.3
termcolor 2.4.0
terminado 0.18.1
terminaltables 3.1.10
threadpoolctl 3.5.0
timm 0.6.13
tinycss2 1.3.0
tokenizers 0.19.0
tomli 2.0.1
tomlkit 0.13.2
torch 2.1.2
torchvision 0.16.2
tornado 6.4.1
tqdm 4.66.5
traitlets 5.14.3
transformers 4.43.1
triton 2.1.0
typer 0.15.1
types-python-dateutil 2.9.0.20240821
typing_extensions 4.12.2
tzdata 2024.2
uri-template 1.3.0
urllib3 2.3.0
uvicorn 0.34.0
wandb 0.19.2
wavedrom 2.0.3.post3
wcwidth 0.2.13
webcolors 24.8.0
websocket-client 1.8.0
websockets 14.1
wheel 0.44.0
widgetsnbextension 4.0.13
yapf 0.40.2

Any idea what might be causing this?

@zhangshaolei1998
Collaborator

This problem is most likely caused by insufficient GPU memory: the full-precision model cannot be loaded entirely onto the GPU, which can leave some weights on the meta device instead of on cuda:0.
We have updated run_llava_mini.py; you can try --load-4bit or --load-8bit to see whether it runs within 8 GB of memory.
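
For a rough sense of why 8 GB is tight, here is a back-of-the-envelope sketch. It assumes an ~8B-parameter language backbone (the exact sizes depend on the checkpoint) and ignores the vision encoder, activations, KV cache, and CUDA context overhead:

```python
# Illustrative weight-memory estimate per precision (assumed ~8B-parameter
# backbone; ignores vision tower, activations, KV cache, CUDA overhead).
params = 8e9
for name, bytes_per_param in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"{name}: ~{params * bytes_per_param / 1e9:.0f} GB of weights")
# fp16:  ~16 GB -> cannot fit on an 8 GB GPU
# 8-bit: ~8 GB  -> borderline; may still spill
# 4-bit: ~4 GB  -> leaves headroom for the rest of the pipeline
```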

@bryanlinnan
Author

This problem is most likely caused by insufficient GPU memory... you can try --load-4bit or --load-8bit to see whether it runs within 8 GB of memory.

Thanks for the reply. I replaced run_llava_mini.py with the updated file and then got this error:

ValueError: You can't pass load_in_4bit or load_in_8bit as a kwarg when passing the quantization_config argument at the same time.

@zhangshaolei1998
Collaborator

Thanks for pointing that out!
You should also update llavamini/model/builder.py and use --load-8bit.
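
For context, that ValueError is raised when recent transformers versions receive load_in_8bit or load_in_4bit as bare kwargs while a quantization_config is also passed. Below is a minimal sketch of the conflict-free pattern; it is not the repo's actual builder.py, and the model id is only illustrative:

```python
# Minimal sketch: route the 8-bit/4-bit choice through a single
# BitsAndBytesConfig instead of also passing the bare load_in_8bit /
# load_in_4bit kwargs. Not the repo's actual builder.py.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)  # or load_in_4bit=True

model = AutoModelForCausalLM.from_pretrained(
    "ICTNLP/llava-mini-llama-3.1-8b",   # illustrative model id
    quantization_config=quant_config,   # do NOT also pass load_in_8bit=True here
    torch_dtype=torch.float16,
    device_map="auto",                  # requires accelerate
)
```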

@bryanlinnan
Author

Thanks for pointing that out! You should also update llavamini/model/builder.py and use --load-8bit.

Yes, that worked. Thanks again for the great open-source work!

@IamShubhamGupto

Can we also do 8-bit/4-bit inference with the web UI demo? @zhangshaolei1998

@zhangshaolei1998
Collaborator

Can we also do 8-bit/4-bit inference with the web UI demo?

@IamShubhamGupto --load-8bit is also available for the web UI demo; add it when starting the model worker.

@shubhamgupto

Would it be possible to add support for the Jetson Orin Nano runtime? I tried running the model with 4-bit quantization after modifying the interface slightly, and the OS seems to crash, probably because of OOM.

It would be really interesting to see this run on edge devices.
