Hi,
I'm wondering how to add ELECTRA and GPT2 support to this module.
Neither ELECTRA nor GPT2 has a pooled output, unlike BERT/RoBERTa-based models.
I noticed that in `models.py` the model is implemented as follows:

```python
outputs = self.roberta(
    input_ids,
    attention_mask=attention_mask,
    token_type_ids=token_type_ids,
    position_ids=position_ids,
    head_mask=head_mask,
    output_attentions=output_attentions,
    output_hidden_states=output_hidden_states,
)
pooled_output = outputs[1]  # pooler output from the RoBERTa encoder
seq_output = outputs[0]     # per-token hidden states
logits = self.output2logits(pooled_output, seq_output, input_ids)
return self.calc_loss(logits, outputs, labels)
```
There is no `pooled_output` for the ELECTRA/GPT2 sequence classification models; only `seq_output` is in the `outputs` variable.
How can I get around this limitation and get a working version for ELECTRA/GPT2? Thank you!
I will look at these two models and get back to you when I add them or find a way to add them.
For ELECTRA, you can manually extract the pooled representation of the `[CLS]` token (following HuggingFace's `ElectraForSequenceClassification`, https://huggingface.co/transformers/_modules/transformers/models/electra/modeling_electra.html#ElectraForSequenceClassification) as:

```python
discriminator_hidden_states = self.electra(
    input_ids,
    attention_mask,
    token_type_ids,
    position_ids,
    head_mask,
    inputs_embeds,
    output_attentions,
    output_hidden_states,
    return_dict,
)
seq_output = discriminator_hidden_states[0]  # per-token hidden states
pooled_output = seq_output[:, 0, :]          # hidden state of the [CLS] token
```
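GPT2 is trickier because it has no `[CLS]` token and no pooler. A common workaround, and roughly what HuggingFace's `GPT2ForSequenceClassification` does, is to use the hidden state of the last non-padding token as the pooled representation. A minimal sketch, assuming a hypothetical `self.gpt2` attribute holding a `GPT2Model` and an `attention_mask` that is 1 for real tokens and 0 for padding:

```python
import torch

# Hypothetical attribute name; analogous to self.roberta/self.electra above.
outputs = self.gpt2(input_ids, attention_mask=attention_mask)
seq_output = outputs[0]  # (batch, seq_len, hidden)

# Index of the last non-padding token in each sequence
# (assumes attention_mask is 1 for real tokens, 0 for padding).
last_token_idx = attention_mask.sum(dim=1) - 1  # (batch,)

batch_idx = torch.arange(seq_output.size(0), device=seq_output.device)
pooled_output = seq_output[batch_idx, last_token_idx]  # (batch, hidden)
```

Either way, the resulting `pooled_output` can then be passed to `self.output2logits(pooled_output, seq_output, input_ids)` just as in the RoBERTa path.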