Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` #146

ThuongTNguyen · 2023-12-09T00:25:00Z

Description

I use a script similar to cola.sh to train and/or evaluate a model for sequence classification.
There are two possible parameters for model state files init_model and pre_trained.
I want and expect the model to be loaded with weights from pre_trained when provided while vocabulary is loaded based on init_model if init_model is one of the provided pretrained models.
However, the model parameters are actually loaded using init_model only. That's because pre_trained flag doesn't have an effect in this fucntion, although I expect pre_trained should override init_model.

Steps to reproduce

Set init_model to deberta-v3-base
Set pre_trained to $PATH_TO_MY_MODEL, which is a path to the pretrained mDeBERTa-V3-Base for example
Check the model parameter after loading, e.g print(model.deberta.encoder.layer[7].output.dense.weight[:5,:4]) after this line
- Expected result (mDeBERTa-v3-base):
  tensor([[-0.0212, 0.0130, 0.0446, 0.0156],
  [ 0.0811, 0.0023, 0.0057, -0.0301],
  [-0.0190, 0.0097, -0.0114, 0.0306],
  [ 0.0049, -0.0174, 0.0064, -0.0275],
  [-0.0152, -0.0411, -0.0166, -0.0447]], dtype=torch.float16)
- Actual result (DeBERTa-v3-base):
  tensor([[ 0.0278, -0.0206, -0.0062, 0.0368],
  [ 0.0262, -0.0676, 0.0477, 0.0249],
  [-0.0364, 0.0453, 0.0912, 0.0590],
  [-0.0638, 0.0402, 0.0272, -0.0013],
  [-0.0352, -0.0579, 0.0320, 0.0003]], grad_fn=)

Additional information/Environment

My system setup is:

PyTorch 1.10.0+cu113
GPU: NVIDIA GeForce GTX 1080 Ti

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` #146

Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` #146

ThuongTNguyen commented Dec 9, 2023

Model is not initialized correctly when path to a pretrained model is provided via pre_trained #146

Model is not initialized correctly when path to a pretrained model is provided via pre_trained #146

Comments

ThuongTNguyen commented Dec 9, 2023

Description

Steps to reproduce

Additional information/Environment

Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` #146

Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` #146