Cannot load model in 4bit with VideoLLaMA2.1-7B-AV model #151

mertonmeng opened this issue Jan 28, 2025 · 0 comments

Hi team,

I am having trouble loading the VideoLLaMA2.1-7B-AV model in 4-bit.
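For reference, this is essentially the call that triggers it, a minimal sketch reconstructed from the traceback below (the Hugging Face model id is my assumption; locally I point model_path at my checkpoint directory):

```python
# Minimal reproduction sketch, mirroring inference_demo.py from the traceback below.
# The model id is assumed for illustration.
from videollama2 import model_init

model_path = "DAMO-NLP-SG/VideoLLaMA2.1-7B-AV"
model, processor, tokenizer = model_init(model_path, load_4bit=True)  # raises RuntimeError
```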
Below is the error:

```
Exception has occurred: RuntimeError
Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment.  If you were attempting to deepcopy a module, this may be because of a torch.nn.utils.weight_norm usage, see https://github.com/pytorch/pytorch/pull/103001
  File "/home/ubuntu/workspace/VideoLLaMA2/videollama2/model/__init__.py", line 182, in load_pretrained_model
    model = Videollama2Qwen2ForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, config=config, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/workspace/VideoLLaMA2/videollama2/__init__.py", line 17, in model_init
    tokenizer, model, processor, context_len = load_pretrained_model(model_path, None, model_name, **kwargs)
                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/workspace/VideoLLaMA2/inference_demo.py", line 10, in inference
    model, processor, tokenizer = model_init(model_path, load_4bit=True)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/workspace/VideoLLaMA2/inference_demo.py", line 66, in <module>
    inference(args)
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment.  If you were attempting to deepcopy a module, this may be because of a torch.nn.utils.weight_norm usage, see https://github.com/pytorch/pytorch/pull/103001
```
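As far as I can tell, the failure comes from deepcopying a submodule that uses torch.nn.utils.weight_norm (the PyTorch PR linked in the message discusses this): weight_norm replaces the module's weight with a computed, non-leaf tensor, and deepcopy only supports graph leaves. A minimal sketch that raises the same exception in isolation:

```python
# Sketch: reproduce the deepcopy failure on a weight_norm module in isolation.
import copy

import torch.nn as nn
from torch.nn.utils import weight_norm

layer = weight_norm(nn.Linear(4, 4))  # 'weight' becomes a computed, non-leaf tensor
copy.deepcopy(layer)  # RuntimeError: Only Tensors created explicitly by the user ...
```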

I'd appreciate any suggestions.

Thanks!
