Tools and Data

Approach | LM Perplexity | Classifier F1 |
---|---|---|
BERT | 8.2 | 0.63 |
DistilBERT | 6.5 | 0.63 |
ULMFiT | 21 | 0.61 |
RoBERTa | 7.54 | 0.64 |

Base LM | Dataset | Accuracy | Precision | Recall | F1 | LM Perplexity |
---|---|---|---|---|---|---|
bert-base-multilingual-cased | Test | 0.688 | 0.698 | 0.686 | 0.687 | 8.2 |
bert-base-multilingual-cased | Valid | 0.62 | 0.592 | 0.605 | 0.55 | 8.2 |
distilbert-base-uncased | Test | 0.693 | 0.694 | 0.703 | 0.698 | 6.51 |
distilbert-base-uncased | Valid | 0.607 | 0.614 | 0.600 | 0.592 | 6.51 |
distilbert-base-multilingual-cased | Test | 0.612 | 0.615 | 0.616 | 0.616 | 8.1 |
distilbert-base-multilingual-cased | Valid | 0.55 | 0.531 | 0.537 | 0.495 | 8.1 |
roberta-base | Test | 0.630 | 0.629 | 0.644 | 0.635 | 7.54 |
roberta-base | Valid | 0.60 | 0.617 | 0.607 | 0.595 | 7.54 |
Ensemble | Test | 0.714 | 0.718 | 0.718 | 0.718 | NA |

Model | Accuracy | Precision | Recall | F1 | Config | Link to Model and output files |
---|---|---|---|---|---|---|
BERT | 0.68866 | 0.69821 | 0.68608 | 0.6875 | Batch Size: 16; Attention Dropout: 0.4; Learning Rate: 5e-07; Adam epsilon: 1e-08; Hidden Dropout Probability: 0.3; Epochs: 3 | BERT |
DistilBert | 0.69333 | 0.69496 | 0.70379 | 0.6982 | Batch Size: 16; Attention Dropout: 0.6; Learning Rate: 3e-05; Adam epsilon: 1e-08; Hidden Dropout Probability: 0.6; Epochs: 3 | DistilBert |
EnsembleBert1 | 0.69233 | 0.70236 | 0.69064 | 0.68952 | Batch Size: 4; Attention Dropout: 0.7; Learning Rate: 5.01e-05; Adam epsilon: 4.79e-05; Hidden Dropout Probability: 0.1; Epochs: 3 | EnsembleBert1 |
EnsembleBert2 | 0.691 | 0.7009 | 0.6889 | 0.68872 | Batch Size: 4; Attention Dropout: 0.6; Learning Rate: 5.13e-05; Adam epsilon: 9.72e-05; Hidden Dropout Probability: 0.2; Epochs: 3 | EnsembleBert2 |
EnsembleDistilBert1 | 0.70166 | 0.70377 | 0.70976 | 0.7061 | Batch Size: 16; Attention Dropout: 0.8; Learning Rate: 3.02e-05; Adam epsilon: 9.35e-05; Hidden Dropout Probability: 0.4; Epochs: 3 | EnsembleDistilBert1 |
EnsembleDistilBert2 | 0.689 | 0.691 | 0.69666 | 0.69335 | Batch Size: 4; Attention Dropout: 0.6; Learning Rate: 5.13e-05; Adam epsilon: 9.72e-05; Hidden Dropout Probability: 0.2; Epochs: 3 | EnsembleDistilBert2 |
EnsembleDistilBert3 | 0.69366 | 0.69538 | 0.70557 | 0.69905 | Batch Size: 16; Attention Dropout: 0.4; Learning Rate: 4.74e-05; Adam epsilon: 4.09e-05; Hidden Dropout Probability: 0.6; Epochs: 3 | EnsembleDistilBert3 |
Ensemble | 0.71466 | 0.71867 | 0.71853 | 0.7182 | NA | Ensemble |
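
The tables above do not say how the ensemble combines its member models or how precision, recall, and F1 are averaged across classes. As a rough illustration only, the sketch below assumes hard majority voting over each model's predicted labels and macro averaging of the per-class scores. The function names `majority_vote` and `macro_scores` and the toy labels are hypothetical and are not taken from the project.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote.

    Assumption: ties go to the model that appears first in the input.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # The first vote (in model order) that reaches the top count wins.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels (made up for illustration): three models, six examples.
y_true = [0, 1, 2, 1, 0, 2]
model_a = [0, 1, 2, 0, 0, 2]
model_b = [0, 1, 1, 1, 0, 2]
model_c = [1, 1, 2, 1, 0, 0]

ensemble_pred = majority_vote([model_a, model_b, model_c])
ensemble_scores = macro_scores(y_true, ensemble_pred)
single_scores = macro_scores(y_true, model_a)
```

In this toy case each model makes its errors on different examples, so the vote corrects them and the ensemble scores higher than any single member. That is the same qualitative pattern as the Ensemble row, although the real gain depends on how correlated the members' errors are.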