There should be a single "error" about lm_head.weight, since the model uses weight tying for the embedding and output layers; both safetensors and regular loading do this.
The problem is that when using safetensors the embedding weights appear to be missing, which breaks both the embedding layer and the output layer.
I should have been clearer about that in the bug report (sorry about that).
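The tying behavior can be illustrated with a small pure-Python sketch (a toy model with made-up dimensions, not the transformers implementation): the LM head reuses the embedding matrix, so the checkpoint only needs to store one tensor, and if that tensor is missing both layers are affected.

```python
# Toy illustration of weight tying (pure Python, not the transformers
# implementation): the LM head reuses the embedding matrix, so the
# checkpoint needs only one stored tensor -- and if that tensor is
# missing, both the embedding and the output projection break.
vocab, dim = 4, 3
embed_weight = [[float(i * dim + j) for j in range(dim)] for i in range(vocab)]
lm_head_weight = embed_weight  # tied: same object, no separate checkpoint entry

def embed(token_id):
    return embed_weight[token_id]  # embedding lookup: one row per token

def logits(hidden):
    # Output projection: dot product against every (tied) embedding row.
    return [sum(h * w for h, w in zip(hidden, row)) for row in lm_head_weight]

assert lm_head_weight is embed_weight  # a single shared buffer
assert logits(embed(2)) == [23.0, 86.0, 149.0, 212.0]
```

This is why a checkpoint that stores only the embedding matrix is still complete: the head is literally the same buffer.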
System Info
transformers version: 4.46.2
Who can help?
@ArthurZucker
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
mobilellm = AutoModelForCausalLM.from_pretrained("facebook/MobileLLM-125M", trust_remote_code=True)
will output
Some weights of MobileLLMForCausalLM were not initialized from the model checkpoint at facebook/MobileLLM-125M and are newly initialized: ['model.embed_tokens.weight']
and those weights will be random. When using use_safetensors=False, everything seems to work as expected.
Expected behavior
Loading with safetensors should work the same as loading without them.