Is it possible to run inference with OVIS 1.6 on a single 4090 GPU? #22
Comments
The same question here. Is there a quantitative way of reasoning about the memory requirements?
Same issue.
Offload some layers of the visual tokenizer to the CPU using a device map. It works on my 4090 with arguments of 41 and 6; see the sketch below.
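A minimal sketch of that kind of CPU offload, assuming the top-level module names `llm`, `vte`, and `visual_tokenizer` for the Ovis1.6 checkpoint (verify them with `model.named_modules()` on the real model; the 41/6 split mentioned above would correspond to finer per-layer entries rather than this coarse one):

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical device map: keep the language model and visual token embedding
# on GPU 0 and offload the visual tokenizer to the CPU. The module names are
# assumptions about the Ovis1.6 checkpoint; accelerate expects every top-level
# module to be assigned a device. Per-layer splits are also possible, e.g.
# {"llm.model.layers.41": "cpu"} to offload an individual LLM layer.
device_map = {
    "llm": 0,                   # language model on GPU 0
    "vte": 0,                   # visual token embedding on GPU 0
    "visual_tokenizer": "cpu",  # visual tokenizer offloaded to CPU
}

model = AutoModelForCausalLM.from_pretrained(
    "AIDC-AI/Ovis1.6-Gemma2-9B",
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,  # matches the HF demo snippet
    device_map=device_map,
    trust_remote_code=True,      # Ovis ships custom modeling code
)
```

Offloading trades VRAM for speed: the offloaded modules run on the CPU (or have their weights copied to the GPU per forward pass), so inference is slower but fits within 24 GB.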
I ran their HF demo snippet (https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B) on a 3090 without issues. Ubuntu, ~500 MB of VRAM in use before loading the model, ~21.7 GB during inference. And it is very good!
I am trying to run it on a single 3090. I also added some details here: #31. Thanks to FennelFetish's comment, the issue mentioned has been solved.
Same here.
Could anyone please advise whether it is possible to run inference with OVIS 1.6 on a single 4090 GPU? After loading, the model appears to consume approximately 20 GB of VRAM. I attempted inference, but the demo exited due to insufficient memory. Are there any solutions to this issue?