
get embeddings error #4

Open
pillowill opened this issue Jun 29, 2024 · 2 comments

@pillowill

Dear author,
When I use the command "python MLM_SFP.py --pretraining bert_mul_2.pth --data_embedding my_rna.fa --embedding_output rRNABert_emb.csv --batch 40",
I get the following error:
RuntimeError: Error(s) in loading state_dict for BertForMaskedLM:
Missing key(s) in state_dict: "bert.embeddings.word_embeddings.weight", "bert.embeddings.position_embeddings.weight", "bert.embeddings.token_type_embeddings.weight", "bert.embeddings.LayerNorm.gamma", "bert.embeddings.LayerNorm.beta", "bert.encoder.layer.0.attention.selfattn.query.weight", "bert.encoder.layer.0.attention.selfattn.query.bias", "bert.encoder.layer.0.attention.selfa...

@sunyunlee

sunyunlee commented Nov 20, 2024

Hi, I am getting the same error message when trying to extract embeddings from the pre-trained model without fine-tuning it. I assume it has to do with a discrepancy between the parameter names of the initialized model and those in the saved weights. Has this issue been addressed/fixed? Thanks in advance.
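
A quick way to confirm this kind of mismatch (a minimal sketch; the checkpoint path is the one from the original command, and the "module." prefix is only a guess at this point) is to print a few of the parameter names stored in the checkpoint:

import torch

# Load only the saved parameters, not the model itself.
state_dict = torch.load("bert_mul_2.pth", map_location="cpu")

# Print the first few parameter names; an unexpected leading
# component (e.g. "module." left over from torch.nn.DataParallel)
# would explain the missing-key errors above.
for key in list(state_dict.keys())[:5]:
    print(key)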

@sunyunlee

sunyunlee commented Nov 20, 2024

I was able to figure out the issue: the OrderedDict in the pretrained file has different parameter names from the ones the BertForMaskedLM object expects. Each key carries one extra leading component.

import torch
from collections import OrderedDict

file_path = 'bert_mul_2.pth'

# Load the raw state dict from the checkpoint.
state_dict = torch.load(file_path, map_location="cpu")

new_state_dict = OrderedDict()

for key, value in state_dict.items():
    # Drop the first dot-separated component of each parameter name
    # so the keys match what the model expects.
    new_key = ".".join(key.split(".")[1:])
    new_state_dict[new_key] = value.clone()

# Save the renamed weights to a new checkpoint file.
torch.save(new_state_dict, 'bert_mul_2_correction.pth')

# Sanity check: print the corrected parameter names.
for key in new_state_dict.keys():
    print(key)

I ran this first to generate a corrected weight file, then used bert_mul_2_correction.pth as the --pretraining argument.
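
An equivalent in-memory variant (a sketch along the same lines; "model" here stands for whatever MLM_SFP.py constructs, so the last line is only illustrative) that skips writing a second file:

import torch

state_dict = torch.load("bert_mul_2.pth", map_location="cpu")

# Strip the first dot-separated component from every key in one pass.
fixed = {key.split(".", 1)[1]: value for key, value in state_dict.items()}

# model.load_state_dict(fixed)  # then load into the model as usual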
