To set up an environment, first clone the repository and install the requirements:
git clone https://github.com/wh629/CNLI-generalization.git
pip install -r jiant/requirements-dev.txt
Then install apex from:
https://github.com/NVIDIA/apex
You can use get_all_exp.sh in run_scripts to get commands for experiments. The commands will appear in the newly created exp_scripts directory as files named submit_exp_<training data>-<validation data>_<time stamp>.sh.
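To check which experiment scripts were generated, the file names can be split back into their parts. The following is a minimal sketch, not part of this repository: the helper names `parse_script_name` and `list_experiment_scripts` are hypothetical, and the pattern assumes that the training data name contains no hyphen and that the time stamp contains no underscore. Adjust the pattern if your names differ.

```python
import re
from pathlib import Path

# Assumed naming scheme: submit_exp_<training data>-<validation data>_<time stamp>.sh
# Assumptions (not confirmed by the repository): no "-" in the training data
# name and no "_" in the time stamp.
SCRIPT_NAME = re.compile(
    r"^submit_exp_(?P<train>[^-]+)-(?P<val>.+)_(?P<stamp>[^_]+)\.sh$"
)


def parse_script_name(name):
    """Return (training data, validation data, time stamp), or None if no match."""
    match = SCRIPT_NAME.match(name)
    if match is None:
        return None
    return match.group("train"), match.group("val"), match.group("stamp")


def list_experiment_scripts(directory="exp_scripts"):
    """List generated scripts in `directory` together with their parsed parts."""
    found = []
    for path in sorted(Path(directory).glob("submit_exp_*.sh")):
        parsed = parse_script_name(path.name)
        if parsed is not None:
            found.append((path, *parsed))
    return found


if __name__ == "__main__":
    for path, train, val, stamp in list_experiment_scripts():
        print(f"{path}: train={train} val={val} stamp={stamp}")
```

Run it from the directory that contains exp_scripts, or pass a different path to `list_experiment_scripts`.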
For general use, you can get Python commands for experiments using:
sh get_all_exp.sh roberta-base none
Experiments are run on NYU's Prince computing cluster, which is managed with Slurm. To generate commands that submit multiple jobs, use:
sh get_all_exp.sh roberta-base <absolute path to .sbatch file>
An example .sbatch file is provided in run_scripts. Update its <env name> and <jiant path> placeholders before using it.
All scripts used to produce figures and tables can be found in the analysis_scripts directory. Please refer to analysis.ipynb for code used to compare run results and lexical-diversity.ipynb for code used for n-gram counts.
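For readers unfamiliar with n-gram counting, here is a minimal sketch of the general idea. It is not the code from lexical-diversity.ipynb: the function name `count_ngrams` is hypothetical, and whitespace tokenization is an assumption made only to keep the example short.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count every contiguous run of n tokens in a token list."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # range() is empty when the list is shorter than n, so the result is empty too.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


if __name__ == "__main__":
    # Whitespace tokenization is an assumption for illustration only.
    sentence = "a man is playing a guitar"
    bigrams = count_ngrams(sentence.split(), 2)
    for ngram, count in bigrams.most_common():
        print(" ".join(ngram), count)
```

Counts from several sentences can be combined by adding the returned `Counter` objects together.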
@inproceedings{huang2020cnligeneralization,
title={Counterfactually-Augmented {SNLI} Training Data Does Not Yield Better Generalization Than Unaugmented Data},
author={William Huang and Haokun Liu and Samuel R. Bowman},
booktitle = {Proceedings of the 2020 EMNLP Workshop on Insights from Negative Results in NLP},
year={2020},
publisher = {The Association for Computational Linguistics}
}
Our code is released under the MIT License.