
Error while loading Mixtral based SFT MoE model VideoLLaMA2-8x7B: SafetensorError: Error while deserializing header: InvalidHeaderDeserialization #77

Open

ApoorvFrontera opened this issue Aug 20, 2024 · 4 comments

@ApoorvFrontera
Hi Team,

When I load the Mixtral-based SFT MoE model 'DAMO-NLP-SG/VideoLLaMA2-8x7B' with the inference code provided in the README.md, the following error is raised:

Traceback (most recent call last):
  File "/home/admin/apoorv/development/VideoLLaMA2/playground.py", line 33, in <module>
    inference()
  File "/home/admin/apoorv/development/VideoLLaMA2/playground.py", line 27, in inference
    model, processor, tokenizer = model_init(model_path)
  File "/home/admin/apoorv/development/VideoLLaMA2/videollama2/__init__.py", line 17, in model_init
    tokenizer, model, processor, context_len = load_pretrained_model(model_path, None, model_name, **kwargs)
  File "/home/admin/apoorv/development/VideoLLaMA2/videollama2/model/__init__.py", line 180, in load_pretrained_model
    model = Videollama2MixtralForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, config=config, **kwargs)
  File "/home/admin/.conda/envs/vl2/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3838, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/admin/.conda/envs/vl2/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4278, in _load_pretrained_model
    state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
  File "/home/admin/.conda/envs/vl2/lib/python3.10/site-packages/transformers/modeling_utils.py", line 516, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization

I looked into this and found related issues where the root cause was that the weights were saved incorrectly and contained empty dictionaries, which safetensors cannot handle:
huggingface/transformers#27397 (comment)

To fix this, some changes likely need to be made on your side before saving the model checkpoint. Related discussions:

  1. SafetensorError: Error while deserializing header: InvalidHeaderDeserialization when open .safetensor model huggingface/transformers#27397
  2. https://discuss.huggingface.co/t/safetensors-rust-safetenserror-while-deserializing-header-invalidheaderdeserialization/68831
  3. Fine-tuning example doesn't work for custom datasets slai-labs/get-beam#94
  4. https://huggingface.co/TheBloke/deepseek-coder-33B-instruct-AWQ/discussions/1
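To illustrate the kind of pre-save filtering those threads suggest, here is a sketch of a hypothetical helper that drops non-tensor entries (such as the empty dictionaries behind this error) from a state dict before it is handed to `safetensors.torch.save_file`. The tensor check is a duck-typing assumption, not the actual transformers implementation.

```python
def filter_savable(state_dict):
    """Drop entries safetensors cannot serialize (empty dicts, None, ...).

    Hypothetical helper: keeps only tensor-like values, identified here
    by duck typing on `shape` and `dtype` attributes. Apply to the state
    dict before calling safetensors.torch.save_file(clean, path).
    """
    clean = {}
    dropped = []
    for key, value in state_dict.items():
        if hasattr(value, "shape") and hasattr(value, "dtype"):
            clean[key] = value
        else:
            dropped.append(key)  # e.g. an empty dict left in the checkpoint
    if dropped:
        print(f"dropped non-tensor entries: {dropped}")
    return clean
```

This is only a sketch of the failure mode: if the released shards already contain such entries, the fix has to happen when the checkpoint is saved, not on the loading side.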
@ApoorvFrontera
Author

Hi Team,
Any response or help on this would be very useful.

Thanks in advance.

@HugoDelhaye

I am having the same issue.

@williamium3000

Same issue here.

@leoisufa

leoisufa commented Nov 2, 2024

Same issue.
