This is the official repository for the paper HAHA@IberLEF2021: Humor Analysis using Ensembles of Simple Transformers by team Jocoso.
This paper describes the system submitted to the Humor Analysis based on Human Annotation (HAHA) task at IberLEF 2021. The system achieved the highest score in the main binary classification task (Task 1) and is based on an ensemble of pre-trained multilingual BERT (mBERT), pre-trained Spanish BERT (BETO), a variant of BETO fine-tuned for sentiment analysis, RoBERTa, and a Naive Bayes classifier. Our models achieved the winning F1 score of 0.8850 in the Binary Classification task, second-place macro F1 scores of 0.2916 and 0.3578 in the Multi-class Classification and Multi-label Classification tasks respectively, and a third-place RMSE of 0.6295 in the Regression task.
Competition Details and Data: https://www.fing.edu.uy/inco/grupos/pln/haha/
The baseline provided by the organizers for this task uses a Naive Bayes classifier with TF-IDF features for binary classification of tweets; it achieves an F1 score of 0.6619 on the test corpus. For our final solution, we tried a series of ensembles of pre-trained models; the submitted system is an ensemble of 5 models.
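For reference, a baseline of this kind can be reproduced with scikit-learn in a few lines. The snippet below is a minimal sketch under assumed column names (`text`, `is_humor`) and file names; it is not the organizers' exact implementation.

```python
# Minimal TF-IDF + Naive Bayes baseline sketch (assumed column and file names,
# not the organizers' exact implementation).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train = pd.read_csv("haha_2021_train.csv")  # hypothetical file name
test = pd.read_csv("haha_2021_test.csv")    # hypothetical file name

baseline = make_pipeline(TfidfVectorizer(), MultinomialNB())
baseline.fit(train["text"], train["is_humor"])

preds = baseline.predict(test["text"])
print("F1:", f1_score(test["is_humor"], preds))
```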
Ensembles Used | Ensemble ID |
---|---|
sBERT + mBERT + BETO + RoBERTa + NB | Jocoso[1] |
sBERT + mBERT + ALBERT + BETO + NB + RoBERTa | Jocoso[2] |
sBERT + mBERT + BETO + NB | Jocoso[3] |
sBERT + mBERT + ALBERT + BETO + NB | Jocoso[4] |
mBERT + BETO + sBERT + DeBERTa | Jocoso[5] |
mBERT + BETO + ALBERT + sBERT | Jocoso[6] |
Model / Ensemble | F1 | Precision | Recall | Accuracy |
---|---|---|---|---|
Jocoso[1] | 0.8850 | 0.9198 | 0.8526 | 0.8891 |
Jocoso[2] | 0.8826 | 0.9194 | 0.8486 | 0.8871 |
Jocoso[3] | 0.8822 | 0.9157 | 0.8509 | 0.8863 |
Jocoso[4] | 0.8791 | 0.9176 | 0.8436 | 0.8840 |
Jocoso[5] | 0.8777 | 0.9221 | 0.8373 | 0.8833 |
Jocoso[6] | 0.8758 | 0.9215 | 0.8343 | 0.8816 |
Second Place | 0.8716 | |||
Third Place | 0.8700 | |||
BETO | 0.8687 | 0.9044 | 0.8356 | 0.8736 |
mBERT | 0.8561 | 0.9137 | 0.8053 | 0.8646 |
Baseline | 0.6619 | | | |
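As a minimal sketch of how one transformer member of such an ensemble can be fine-tuned with Simple Transformers, and how several members' predictions might be combined by majority vote, see the snippet below. The model name, file names, column names, and the voting scheme are illustrative assumptions rather than the exact configuration used in the paper.

```python
# Sketch: fine-tune BETO for binary humor classification with Simple Transformers
# and combine several fine-tuned models by majority vote (illustrative only).
import numpy as np
import pandas as pd
from simpletransformers.classification import ClassificationModel, ClassificationArgs

train_df = pd.read_csv("haha_2021_train.csv")[["text", "is_humor"]]  # assumed columns
train_df.columns = ["text", "labels"]

args = ClassificationArgs(num_train_epochs=2, overwrite_output_dir=True)
beto = ClassificationModel("bert", "dccuchile/bert-base-spanish-wwm-cased", args=args)
beto.train_model(train_df)

# Suppose `members` holds several fine-tuned models (BETO, mBERT, RoBERTa, ...).
members = [beto]
texts = pd.read_csv("haha_2021_test.csv")["text"].tolist()
votes = np.array([m.predict(texts)[0] for m in members])  # shape: (n_models, n_texts)
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)   # majority vote
```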
As in Task 1, we use the Simple Transformers ClassificationModel for this task. However, unlike Task 1, we use it with a regression head, i.e. we set the parameter regression = True. We used 6 pre-trained models in our final solution: multilingual BERT base cased (mBERT), ALBERT base v2, RoBERTa base, DistilBERT base cased, BETO, and XLNet base cased. All of these models were fine-tuned on the training data for 2 epochs without any preprocessing.
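A minimal sketch of this regression setup with Simple Transformers is shown below; the file and column names are assumptions, and mBERT stands in for any of the six ensemble members.

```python
# Sketch: Simple Transformers with a regression head for the funniness score
# (assumed file/column names; 2 epochs per the paper, the rest is illustrative).
import pandas as pd
from simpletransformers.classification import ClassificationModel, ClassificationArgs

train_df = pd.read_csv("haha_2021_train.csv")[["text", "humor_rating"]].dropna()
train_df.columns = ["text", "labels"]

args = ClassificationArgs(regression=True, num_train_epochs=2, overwrite_output_dir=True)
model = ClassificationModel("bert", "bert-base-multilingual-cased", num_labels=1, args=args)
model.train_model(train_df)

scores, _ = model.predict(["Ejemplo de tweet para puntuar"])
```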
Model / Ensemble | RMSE |
---|---|
First Place Solution | 0.6226 |
Second Place Solution | 0.6246 |
mBERT+ ALBERT + RoBERTa + DistilBERT + BETO + XLNet | 0.6295 |
BETO + mBERT + ALBERT | 0.6378 |
BETO + ALBERT | 0.6391 |
BETO + DistilBERT | 0.6397 |
BETO + XLNet | 0.6400 |
BETO + mBERT | 0.6412 |
Fourth Place Solution | 0.6587 |
Baseline | 0.6704 |
Our model, which achieves a macro F1 score of 0.2916, uses BETO for this multi-class classification problem. We fine-tuned the model on the training corpus for this task, which comprises approximately 4,800 tweets.
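A minimal sketch of the multi-class setup is given below; the column name, label encoding, and training arguments are assumptions for illustration.

```python
# Sketch: BETO fine-tuned for humor-mechanism classification (multi-class).
# Column name, label encoding, and training arguments are assumptions.
import pandas as pd
from simpletransformers.classification import ClassificationModel, ClassificationArgs

df = pd.read_csv("haha_2021_train.csv")[["text", "humor_mechanism"]].dropna()
classes = sorted(df["humor_mechanism"].unique())
df["labels"] = df["humor_mechanism"].map({name: i for i, name in enumerate(classes)})

args = ClassificationArgs(overwrite_output_dir=True)
model = ClassificationModel("bert", "dccuchile/bert-base-spanish-wwm-cased",
                            num_labels=len(classes), args=args)
model.train_model(df[["text", "labels"]])
```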
Model / Ensemble | Macro F1 |
---|---|
First Place Solution | 0.3396 |
BETO - Cased | 0.2916 |
BETO - Cased + BETO - Uncased | 0.2636 |
Third Place Solution | 0.2522 |
Baseline | 0.1001 |
Our system comprises a pre-trained Spanish BETO cased model, fine-tuned for 4 epochs on approximately 2,000 tweets. The models we tried and their results are listed in the table below.
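A minimal sketch of this multi-label setup with Simple Transformers follows; the column name and the way the target labels are binarized are assumptions.

```python
# Sketch: BETO fine-tuned for humor-target classification (multi-label).
# Column name and label binarization are assumptions; 4 epochs per the paper.
import pandas as pd
from simpletransformers.classification import (MultiLabelClassificationModel,
                                               MultiLabelClassificationArgs)

df = pd.read_csv("haha_2021_train.csv")[["text", "humor_target"]].dropna()
all_targets = sorted({t for row in df["humor_target"] for t in str(row).split(";")})
df["labels"] = df["humor_target"].apply(
    lambda row: [int(t in str(row).split(";")) for t in all_targets])

args = MultiLabelClassificationArgs(num_train_epochs=4, overwrite_output_dir=True)
model = MultiLabelClassificationModel("bert", "dccuchile/bert-base-spanish-wwm-cased",
                                      num_labels=len(all_targets), args=args)
model.train_model(df[["text", "labels"]])
```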
Model / Ensemble | Macro F1 |
---|---|
First Place Solution | 0.4228 |
BETO - Cased, Not Preprocessed | 0.3578 |
BETO - Cased, Preprocessed | 0.3569 |
Third Place Solution | 0.3225 |
Baseline | 0.0527 |