Hi VideoLLaMA Team,

I am facing issues when loading the base models in 4-bit precision. The following lines (https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/__init__.py#L171-L172) try to load the mm_projector_weights, which are stored in 16-bit precision, into a model whose layers have already been quantized to 4-bit, leading to these errors:
RuntimeError: Error(s) in loading state_dict for Videollama2MistralForCausalLM:
size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
size mismatch for model.mm_projector.readout.2.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
How can we load the 16-bit mm_projector_weights into a model that is quantized to 4-bit?
According to the BitsAndBytesConfig documentation, all linear layers are replaced by FP4/NF4 layers when load_4bit is set, which is why the size-mismatch error is reported.
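For illustration, this is roughly what a 4-bit load sets up via Hugging Face transformers; the model name below is only a placeholder, not the exact call made by VideoLLaMA2:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Roughly what load_4bit configures; the exact config used by VideoLLaMA2 may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Placeholder backbone; VideoLLaMA2 wraps a Mistral-based LM.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb_config,
    device_map="auto",
)

# Every nn.Linear is now a bitsandbytes Linear4bit whose weight is a packed 4-bit buffer:
# a 4096x4096 fp16 matrix (16,777,216 values) is packed two values per byte into
# 8,388,608 bytes, i.e. shape [8388608, 1] -- the shape reported in the error above.
```

Because the mm_projector's readout layers are ordinary linear layers, they get quantized along with everything else, so the fp16 checkpoint tensors of shape [4096, 4096] no longer match.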
A temporary workaround is to initialize an unquantized model, load the projector weights into it, and save the full model weights. The saved checkpoint can then be loaded successfully with load_4bit=True.
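A minimal sketch of that workaround, assuming the standard Hugging Face APIs; the paths, the mm_projector.bin file name, and the import location of Videollama2MistralForCausalLM are assumptions rather than verbatim repository code:

```python
import os
import torch
# Import path assumed; Videollama2MistralForCausalLM is the class named in the error.
from videollama2.model import Videollama2MistralForCausalLM

model_path = "path/to/VideoLLaMA2-base"                        # hypothetical local checkpoint
projector_path = os.path.join(model_path, "mm_projector.bin")  # assumed file name

# 1) Build the model WITHOUT quantization, so the linear shapes match the fp16 checkpoint.
model = Videollama2MistralForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

# 2) Load the 16-bit projector weights into the unquantized model.
mm_projector_weights = torch.load(projector_path, map_location="cpu")
model.load_state_dict(mm_projector_weights, strict=False)

# 3) Save the merged weights as a single checkpoint.
model.save_pretrained("VideoLLaMA2-merged")

# 4) Reload the merged checkpoint with load_4bit=True; quantization then happens
#    from a complete state dict, so there is no fp16-vs-packed-4-bit mismatch.
```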