Loading quant_bert pretrained weights for MRPC #150
Description
When I run the following command to fine-tune Quantized BERT on MRPC,
```
nlp-train transformer_glue \
    --task_name mrpc \
    --model_name_or_path bert-base-uncased \
    --model_type quant_bert \
    --learning_rate 2e-5 \
    --output_dir /tmp/mrpc-8bit \
    --evaluate_during_training \
    --data_dir /path/to/MRPC \
    --do_lower_case
```
I get the following message:
```
INFO Weights of QuantizedBertForSequenceClassification not initialized from pretrained model: ['bert.embeddings.word_embeddings._step', 'bert.embeddings.position_embeddings._step', 'bert.embeddings.token_type_embeddings._step', 'bert.encoder.layer.0.attention.self.query._step', 'bert.encoder.layer.0.attention.self.query.input_thresh', 'bert.encoder.layer.0.attention.self.query.output_thresh', 'bert.encoder.layer.0.attention.self.key._step', 'bert.encoder.layer.0.attention.self.key.input_thresh', 'bert.encoder.layer.0.attention.self.key.output_thresh', 'bert.encoder.layer.0.attention.self.value._step', 'bert.encoder.layer.0.attention.self.value.input_thresh', 'bert.encoder.layer.0.attention.output.dense._step', 'bert.encoder.layer.0.attention.output.dense.input_thresh', 'bert.encoder.layer.0.intermediate.dense._step', 'bert.encoder.layer.0.intermediate.dense.input_thresh', 'bert.encoder.layer.0.output.dense._step', 'bert.encoder.layer.0.output.dense.input_thresh', 'bert.encoder.layer.1.attention.self.query._step',
```
...and so on for all the layers. Can you please help me figure out why these weights are not initialized from the pretrained model? Everything works when I set `--model_type` to `bert` instead of `quant_bert`.
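For context, every entry in the warning is a `_step` or `*_thresh` name, i.e. quantization state that has no counterpart in the FP32 `bert-base-uncased` checkpoint. Below is a minimal PyTorch sketch of how such a warning can arise; the `_step` and `input_thresh` buffer names mirror the log above, but the toy classes are purely illustrative, not NLP Architect's actual implementation, and the `strict=False` load is only loosely analogous to what `from_pretrained()` does internally:

```python
import torch
import torch.nn as nn

class PlainLinear(nn.Module):
    """Plain layer, standing in for a layer in the FP32 checkpoint."""
    def __init__(self, in_f, out_f):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f))
        self.bias = nn.Parameter(torch.zeros(out_f))

class QuantizedLinear(PlainLinear):
    """Same layer plus quantization state absent from any FP32 checkpoint."""
    def __init__(self, in_f, out_f):
        super().__init__(in_f, out_f)
        # Hypothetical buffers named after the `_step` / `input_thresh`
        # entries in the log; they are created fresh, never loaded.
        self.register_buffer("_step", torch.zeros(1))
        self.register_buffer("input_thresh", torch.zeros(1))

# Save a "pretrained" checkpoint from the plain layer.
checkpoint = PlainLinear(4, 4).state_dict()

# Load it into the quantized variant; the extra buffers come back
# as missing keys, which is what gets logged as "not initialized".
missing, unexpected = QuantizedLinear(4, 4).load_state_dict(
    checkpoint, strict=False
)
print(missing)     # ['_step', 'input_thresh']
print(unexpected)  # []
```

If that is what is happening here, the warning may be expected: the quantization statistics cannot come from the FP32 checkpoint and would instead be calibrated during fine-tuning, while the actual weights (query, key, value, dense, etc.) load normally.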
Thanks a lot.