
Question regarding oversampling #9

Open
gurunathparasaram opened this issue Jun 7, 2019 · 2 comments

gurunathparasaram commented Jun 7, 2019

  • In the low-resource paper, you mention that NUCLE was oversampled 10 times for domain adaptation to the CoNLL-14 dataset (a rough sketch of what such oversampling typically looks like is included after this list).

  • I tried benchmarking the pre-trained models provided in this repo on the WI+LOCNESS test set.

  • The single model gave an F-score of 34.15, whereas the ensemble of 4 models + reranking gave an F-score of 53.27. The ensemble produces fewer false positives than the single model, leading to higher precision.

  • Metrics of the single model on the WI+LOCNESS test set:
    (screenshot: lrgec_single)

  • Metrics of the ensemble on the WI+LOCNESS test set:
    (screenshot: lrgec_ensemble)

  • Could oversampling the NUCLE data explain the single model's precision dropping from 69-70 on the CoNLL-14 test set to 31.3 on the WI+LOCNESS test set?
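
For context, the oversampling in question usually just means repeating the corpus in the concatenated training data. A minimal sketch, assuming plain parallel text files with hypothetical names (`nucle.src`/`nucle.trg`), not the authors' actual pipeline:

```python
# Sketch of 10x oversampling for domain adaptation: the NUCLE source/target
# pairs are simply repeated ten times in the training data. File names are
# hypothetical placeholders, not the repo's actual layout.
OVERSAMPLE = 10

with open("nucle.src") as f:
    src_lines = f.readlines()
with open("nucle.trg") as f:
    trg_lines = f.readlines()

with open("train.src", "a") as out_src, open("train.trg", "a") as out_trg:
    for _ in range(OVERSAMPLE):  # append NUCLE to the training set 10 times
        out_src.writelines(src_lines)
        out_trg.writelines(trg_lines)
```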

Thanks!

snukky (Contributor) commented Jun 11, 2019

I don't think oversampling would degrade the scores (though I haven't run these models on the BEA datasets yet). Such low precision and high recall may suggest that something is wrong with the pre/post-processing. How did you pre/post-process the data? CoNLL uses NLTK and BEA uses spaCy for tokenization; maybe they differ too much. Did you take a look at the corrections made by the system?
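
For what it's worth, the two tokenizers can be compared directly. A minimal sketch (assuming NLTK's `punkt` data and spaCy's `en_core_web_sm` model are installed), showing the kind of mismatch that can hurt precision:

```python
# Compare NLTK and spaCy tokenization on the same sentence; the two can
# disagree, e.g. around hyphenated words, which matters when a model trained
# on one scheme is evaluated against references tokenized with the other.
import nltk
import spacy

nlp = spacy.load("en_core_web_sm")
sentence = "It is a well-known fact, isn't it?"

nltk_tokens = nltk.word_tokenize(sentence)        # keeps "well-known" whole
spacy_tokens = [t.text for t in nlp(sentence)]    # splits "well - known"

print("NLTK: ", nltk_tokens)
print("spaCy:", spacy_tokens)
```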

Another possibility is that the weight for the LM is too high; it was grid-searched on CoNLL-2013.

We don't use re-ranking in this system. We only ensemble with a language model.

gurunathparasaram (Author) commented Jun 13, 2019

I performed spell correction using Jamspell on the BEA source sentences before feeding them to the models (roughly as in the sketch below). I will take a look at the system outputs soon and also try decreasing the weight for the LM. Sorry for the confusion; it should have been ensemble+LM instead of reranking in my previous comment.
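
A rough sketch of that pre-correction step; `en.bin` and the file names are placeholders for the actual Jamspell model and BEA files, not the exact setup used here:

```python
# Run each BEA source sentence through Jamspell before passing it to the
# GEC models. "en.bin" and the file names are hypothetical placeholders.
import jamspell

corrector = jamspell.TSpellCorrector()
corrector.LoadLangModel("en.bin")  # pretrained Jamspell language model

with open("bea.src") as src, open("bea.spellchecked.src", "w") as out:
    for line in src:
        out.write(corrector.FixFragment(line.rstrip("\n")) + "\n")
```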
