Code to train models from "Beyond BLEU: Training Neural Machine Translation with Semantic Similarity". Our code is based on the classic_seqlevel branch of Fairseq from Facebook AI Research.
To get started, follow the installation and setup instructions below.
If you use our code for your work, please cite:
title={Beyond BLEU: Training Neural Machine Translation with Semantic Similarity},
author={Wieting, John and Berg-Kirkpatrick, Taylor and Gimpel, Kevin and Neubig, Graham},
booktitle={Proceedings of the Association for Computational Linguistics},
url = {},
Installation and setup instructions:
Install CUDA 8.0
Install Anaconda3 or Miniconda3
Download PyTorch 0.3.1:
Create a new environment and install requirements:
conda create -n sim-mrt python=3.6
source activate sim-mrt
pip install torch-0.3.1-cp36-cp36m-linux_x86_64.whl
conda install tqdm
conda install cffi
conda install nltk
pip install sacremoses
pip install sentencepiece
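Before moving on, it can help to confirm that the new environment actually sees the packages listed above. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names are assumed to match the package names (PyTorch is imported as `torch`).

```python
import importlib.util
import sys

# Import names for the packages installed above. These are assumed to match
# the package names; PyTorch is imported as `torch`.
REQUIRED_MODULES = ["torch", "tqdm", "cffi", "nltk", "sacremoses", "sentencepiece"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated sim-mrt environment; any module it reports as missing needs to be installed before building the code.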
Set environment variables:
export LD_LIBRARY_PATH=path/to/cuda8.0/cuda-8.0/lib64:$LD_LIBRARY_PATH
export CPATH=path/to/cuda8.0/cuda-8.0/include
Install the code:
python build && python develop
Download and unzip data and semantic similarity models from
wget .
unzip
rm
To train baseline MLE models that translate from language xx into English, where xx is one of cs, de, ru, or tr:
python beyond_bleu/data/data-xx -a fconv_iwslt_de_en --lr 0.25 --clip-norm 0.1 --dropout 0.3 --max-tokens 1000 -s xx -t en --label-smoothing 0.1 --force-anneal 200 --save-dir checkpoints_xx --no-epoch-checkpoints
To train baseline minimum risk models with 1-sBLEU as the cost and alpha=0.3:
mkdir checkpoints_xx_0.3_word_0.0
cp beyond_bleu/checkpoints/checkpoints_xx/ checkpoints_xx_0.3_word_0.0/
python beyond_bleu/data/data-xx -a fconv_iwslt_de_en --clip-norm 0.1 --momentum 0.9 --lr 0.25 --label-smoothing 0.1 --dropout 0.3 --max-tokens 500 --seq-max-len-a 1.5 --seq-max-len-b 5 --seq-criterion SequenceRiskCriterion --seq-combined-loss-alpha 0.3 --force-anneal 11 --seq-beam 8 --save-dir checkpoints_xx_0.3_word_0.0 --seq-score-alpha 0 -s xx -t en --reset-epochs
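For intuition about what the minimum risk command optimizes, here is a small sketch of the objective. It is not the Fairseq implementation. It assumes that the risk is the expected cost over the beam candidates (8 in the command above), with model probabilities renormalized over those candidates, and that `--seq-combined-loss-alpha` weights the token-level loss against the risk. The function names and the toy numbers are hypothetical.

```python
import math


def expected_risk(log_probs, costs):
    """Expected cost under model probabilities renormalized over the candidates."""
    # Softmax over candidate log-probabilities, shifted by the max for stability.
    peak = max(log_probs)
    weights = [math.exp(lp - peak) for lp in log_probs]
    total = sum(weights)
    return sum(w / total * c for w, c in zip(weights, costs))


def combined_loss(token_loss, risk, alpha=0.3):
    """Interpolate token-level loss and sequence-level risk (assumed weighting)."""
    return alpha * token_loss + (1.0 - alpha) * risk


# Toy example: three candidates with sequence log-probabilities and
# costs of the form 1 - score (for example 1 - sBLEU).
log_probs = [-1.0, -2.0, -3.0]
costs = [0.2, 0.5, 0.9]
risk = expected_risk(log_probs, costs)
print(round(risk, 4), round(combined_loss(2.0, risk), 4))
```

Lowering the risk moves probability mass toward candidates with low cost, which is why the choice of cost function (sBLEU or SimiLe) changes what the model learns.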
To train baseline minimum risk models with 1-SimiLe as the cost and alpha=0.3:
mkdir checkpoints_xx_0.3_word_1.0
cp beyond_bleu/checkpoints/checkpoints_xx/ checkpoints_xx_0.3_word_1.0/
python beyond_bleu/data/data-xx -a fconv_iwslt_de_en --clip-norm 0.1 --momentum 0.9 --lr 0.25 --label-smoothing 0.1 --dropout 0.3 --max-tokens 500 --seq-max-len-a 1.5 --seq-max-len-b 5 --seq-criterion SequenceRiskCriterion --seq-combined-loss-alpha 0.3 --force-anneal 11 --seq-beam 8 --save-dir checkpoints_xx_0.3_word_1.0 --seq-score-alpha 1 -s xx -t en --sim-model-file beyond_bleu/sim/ --reset-epochs
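As a rough illustration of the SimiLe cost, the sketch below follows the definition in the paper: the semantic similarity SIM between reference and hypothesis, scaled by a length penalty raised to a power. The exponent 0.25 matches the `--length_penalty` value used in the evaluation command. The toy vectors stand in for the output of the trained similarity model, and the function names are hypothetical; this is not the code shipped in the repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def length_penalty(ref_len, hyp_len):
    """exp(1 - max/min): 1 for equal lengths, smaller as lengths diverge.

    Assumes both lengths are positive.
    """
    return math.exp(1.0 - max(ref_len, hyp_len) / min(ref_len, hyp_len))


def simile(sim, ref_len, hyp_len, alpha=0.25):
    """SimiLe = length_penalty ** alpha * SIM."""
    return length_penalty(ref_len, hyp_len) ** alpha * sim


# Toy embeddings standing in for the trained similarity model's sentence vectors.
ref_vec, hyp_vec = [0.2, 0.7, 0.1], [0.25, 0.6, 0.2]
score = simile(cosine(ref_vec, hyp_vec), ref_len=10, hyp_len=8)
cost = 1.0 - score  # the quantity used as the cost in risk training
print(round(score, 4), round(cost, 4))
```

The length penalty discourages hypotheses whose length differs sharply from the reference, which a pure embedding similarity would not penalize on its own.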
To evaluate models in terms of corpus BLEU, SIM, and SimiLe:
python --data beyond_bleu/data/data-xx -s xx -t en --save-dir checkpoints_xx_0.3_word_1.0 --length_penalty 0.25 --sim-model-file beyond_bleu/sim/