When I run the following command to fine-tune Quantized BERT on MRPC,
nlp-train transformer_glue \
    --task_name mrpc \
    --model_name_or_path bert-base-uncased \
    --model_type quant_bert \
    --learning_rate 2e-5 \
    --output_dir /tmp/mrpc-8bit \
    --evaluate_during_training \
    --data_dir /path/to/MRPC \
    --do_lower_case
I get the following message:
INFO Weights of QuantizedBertForSequenceClassification not initialized from pretrained model: ['bert.embeddings.word_embeddings._step', 'bert.embeddings.position_embeddings._step', 'bert.embeddings.token_type_embeddings._step', 'bert.encoder.layer.0.attention.self.query._step', 'bert.encoder.layer.0.attention.self.query.input_thresh', 'bert.encoder.layer.0.attention.self.query.output_thresh', 'bert.encoder.layer.0.attention.self.key._step', 'bert.encoder.layer.0.attention.self.key.input_thresh', 'bert.encoder.layer.0.attention.self.key.output_thresh', 'bert.encoder.layer.0.attention.self.value._step', 'bert.encoder.layer.0.attention.self.value.input_thresh', 'bert.encoder.layer.0.attention.output.dense._step', 'bert.encoder.layer.0.attention.output.dense.input_thresh', 'bert.encoder.layer.0.intermediate.dense._step', 'bert.encoder.layer.0.intermediate.dense.input_thresh', 'bert.encoder.layer.0.output.dense._step', 'bert.encoder.layer.0.output.dense.input_thresh', 'bert.encoder.layer.1.attention.self.query._step',
... for all the layers. Can you please help figure out why all the weights are not initialized from the pretrained model? It works when I set model_type to bert instead of quant_bert.
Thanks a lot.
Note that this message says that the input_thresh/output_thresh and _step attributes are not initialized from the pre-trained model, which is expected, since the pre-trained model wasn't trained with quantization in mind. If the actual weights weren't initialized, you would see entries such as bert.encoder.layer.1.attention.self.query.weight and bert.encoder.layer.1.attention.self.query.bias in the list.
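If you want to check this yourself, here is a rough sketch of how to inspect which keys are missing. It assumes the from_pretrained signature of the underlying pytorch-transformers library (output_loading_info=True) also applies to the quantized model class, and the import path shown is an assumption, not necessarily the exact one in nlp-architect:

```python
# Sketch only: inspect which state-dict keys were not found in the checkpoint.
# Import path and from_pretrained kwargs are assumptions based on the
# pytorch-transformers API that nlp-architect builds on.
from nlp_architect.models.transformers.quantized_bert import (
    QuantizedBertForSequenceClassification,
)

model, loading_info = QuantizedBertForSequenceClassification.from_pretrained(
    "bert-base-uncased", output_loading_info=True
)

# Quantization-specific keys (_step, input_thresh, output_thresh) are expected
# to be missing; real trouble would be missing .weight / .bias entries.
suspicious = [
    k for k in loading_info["missing_keys"] if k.endswith((".weight", ".bias"))
]
print(suspicious)  # should be empty (or only the classifier head)
```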
The quantized FC layers used in the quantized BERT model require additional information, such as the input and output thresholds (used to quantize the input and output tensors), which is not available in the pre-trained model.
Meaning everything is working correctly for you, and when you load a model you trained with quantization for inference, you will see that these attributes are loaded from the quantized model.
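To illustrate where those extra keys come from, here is a minimal sketch (not the nlp-architect implementation) of a linear layer that registers quantization buffers. A vanilla BERT checkpoint only provides weight and bias, so these buffers are reported as "not initialized" and get calibrated during fine-tuning:

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a linear layer carrying extra quantization state.
class QuantizedLinearSketch(nn.Linear):
    def __init__(self, in_features, out_features, bias=True):
        super().__init__(in_features, out_features, bias)
        # Training-step counter used to schedule the quantization statistics.
        self.register_buffer("_step", torch.zeros(1))
        # Running thresholds used to derive the 8-bit scale of the
        # input and output activations.
        self.register_buffer("input_thresh", torch.zeros(1))
        self.register_buffer("output_thresh", torch.zeros(1))

layer = QuantizedLinearSketch(768, 768)
print(list(layer.state_dict().keys()))
# ['weight', 'bias', '_step', 'input_thresh', 'output_thresh']
# Only 'weight' and 'bias' can come from the pre-trained BERT checkpoint.
```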