The goal is to experiment with pretrained AllenNLP models by running them on sentences from biaslyAI that are labeled as gender-biased. Patterns in these predictions could help find more gender-biased sentences, or the predictions could serve as a first step in data augmentation.
This script takes sample sentences from biased_sentences.json and uses AllenNLP's pretrained models (Semantic Role Labeling or Co-reference Resolution) to make predictions. It outputs the predictions in biased_sentences_srl.json or biased_sentences_coref.json, respectively.
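The layout of biased_sentences.json is not shown here. As a rough sketch, assuming the input follows the JSON-lines convention used by AllenNLP predictors (one object per line, keyed by `sentence` for SRL or `document` for co-reference, matching the field names used with the server further down), an input could be prepared like this. The helper `to_json_lines` and the two example sentences are made up for illustration and are not part of the script or the biaslyAI data.

```python
import json


def to_json_lines(sentences, field_name="sentence"):
    """Serialize sentences as JSON lines: one {field_name: text} object per line."""
    return "\n".join(json.dumps({field_name: text}) for text in sentences)


# Made-up example sentences (assumption: not taken from the biaslyAI data).
sentences = [
    "The nurse said she was tired.",
    "The engineer fixed his code.",
]

srl_input = to_json_lines(sentences, field_name="sentence")
coref_input = to_json_lines(sentences, field_name="document")

print(srl_input)
print(coref_input)
```

If the real file uses a different layout (for example a single JSON array), only the serialization step would need to change.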
I installed AllenNLP using pip3:
pip3 install allennlp
Other installation options are listed at https://github.com/allenai/allennlp#installation
Semantic Role Labeling (SRL) recovers the latent predicate-argument structure of a sentence, providing representations that answer basic questions about sentence meaning, including “who” did “what” to “whom,” etc. The AllenNLP SRL model is a reimplementation of a deep BiLSTM model (He et al., 2017), which is currently state of the art for PropBank SRL (Newswire sentences). Source
To run the script with the SRL model:
python3 allennlp_models.py \
https://s3-us-west-2.amazonaws.com/allennlp/models/srl-model-2018.02.27.tar.gz \
biased_sentences.json --output-file biased_sentences_srl.json
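To look for patterns in the SRL output, it helps to turn the per-token tags into labeled spans. The sketch below assumes each prediction is a dictionary with a `words` list and a `verbs` list whose entries carry BIO `tags`, which is how the AllenNLP SRL predictor usually reports results. The helper `srl_arguments` is hypothetical, and the prediction is hand-written for illustration rather than real model output.

```python
def srl_arguments(words, tags):
    """Group BIO-tagged tokens into (label, text) spans, skipping 'O' tokens."""
    spans = []
    label, tokens = None, []
    for word, tag in zip(words, tags):
        # An "I-X" tag continues the open span only if that span is also labeled X.
        if tag.startswith("I-") and label == tag[2:]:
            tokens.append(word)
            continue
        # Any other tag closes the span that was open, if there was one.
        if label is not None:
            spans.append((label, " ".join(tokens)))
        if tag == "O":
            label, tokens = None, []
        else:
            # "B-X" (or a stray "I-X") starts a new span labeled X.
            label, tokens = tag[2:], [word]
    if label is not None:
        spans.append((label, " ".join(tokens)))
    return spans


# Hand-written prediction in the assumed output shape (not real model output).
prediction = {
    "words": ["The", "nurse", "said", "she", "was", "tired", "."],
    "verbs": [
        {
            "verb": "said",
            "tags": ["B-ARG0", "I-ARG0", "B-V", "B-ARG1", "I-ARG1", "I-ARG1", "O"],
        }
    ],
}

for verb in prediction["verbs"]:
    print(verb["verb"], srl_arguments(prediction["words"], verb["tags"]))
```

Collecting the ARG0 spans across many sentences would show, for instance, which nouns tend to appear as the agent of which verbs.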
Coreference resolution is the task of finding all expressions that refer to the same entity in a text. End-to-end Neural Coreference Resolution (Lee et al., 2017) is a neural model which considers all possible spans in the document as potential mentions and learns distributions over possible antecedents for each span, using aggressive, learnt pruning strategies to retain computational efficiency. It achieved state-of-the-art accuracies on the OntoNotes 5.0 dataset in early 2017. Source
To run the script with the Co-reference model:
python3 allennlp_models.py \
https://s3-us-west-2.amazonaws.com/allennlp/models/coref-model-2018.02.05.tar.gz \
biased_sentences.json --output-file biased_sentences_coref.json
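The co-reference output is easier to read once the clusters are mapped back to text. This sketch assumes each prediction has a tokenized `document` list and a `clusters` list of `[start, end]` token spans with inclusive end indices, which is the usual shape of AllenNLP's co-reference predictions. The helper `cluster_mentions` is hypothetical and the prediction is hand-written for illustration.

```python
def cluster_mentions(document, clusters):
    """Replace each [start, end] token span (end inclusive) with its text."""
    return [
        [" ".join(document[start:end + 1]) for start, end in cluster]
        for cluster in clusters
    ]


# Hand-written prediction in the assumed output shape (not real model output).
prediction = {
    "document": ["The", "nurse", "said", "she", "was", "tired", "."],
    "clusters": [[[0, 1], [3, 3]]],
}

mentions = cluster_mentions(prediction["document"], prediction["clusters"])
print(mentions)
```

Clusters that pair an occupation with a gendered pronoun, as in this made-up example, are the kind of pattern that could be counted across the labeled sentences.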
To better visualize the output predictions, you can run a Flask server that will serve predictions from a single AllenNLP model. It also includes a very, very bare-bones web front-end for exploring predictions.
To visualize the output for the Semantic Role Labeling model, run the command below and navigate to localhost:8000
python3 -m allennlp.service.server_simple \
--archive-path https://s3-us-west-2.amazonaws.com/allennlp/models/srl-model-2018.02.27.tar.gz \
--predictor semantic-role-labeling \
--title "AllenNLP Semantic Role Labeling on biaslyAI Sentences" \
--field-name sentence
To visualize the output for the Co-reference Resolution model, run the command below and navigate to localhost:8000
python3 -m allennlp.service.server_simple \
--archive-path https://s3-us-west-2.amazonaws.com/allennlp/models/coref-model-2018.02.05.tar.gz \
--predictor coreference-resolution \
--title "AllenNLP Co-reference on biaslyAI Sentences" \
--field-name document
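The server can also be queried from a script instead of the browser. The sketch below assumes the simple server accepts a JSON POST at `/predict` on port 8000, with the input keyed by the same name passed to `--field-name`; check this against your installed AllenNLP version. Both helpers are hypothetical, and only `build_predict_request` runs without a live server.

```python
import json
import urllib.request


def build_predict_request(text, field_name, host="http://localhost:8000"):
    """Build a JSON POST request for the (assumed) /predict endpoint."""
    payload = json.dumps({field_name: text}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, field_name):
    """Send the request and decode the reply; needs one of the servers above running."""
    with urllib.request.urlopen(build_predict_request(text, field_name)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not contact the server.
request = build_predict_request("The nurse said she was tired.", "sentence")
print(request.full_url, request.get_method(), request.data)
```

With the SRL server running, `predict("The nurse said she was tired.", "sentence")` would send the request; for the co-reference server the field name would be `document`.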
For a much better visual front end for these models, check out AllenNLP's demos:
Semantic Role Labeling: http://demo.allennlp.org/semantic-role-labeling
Co-reference Resolution: http://demo.allennlp.org/coreference-resolution