Neural Machine Translation implemented in PyTorch

This is a PyTorch implementation of "Effective Approaches to Attention-based Neural Machine Translation" that uses scheduled sampling during training to improve parameter estimation. The predictive language models are trained on tab-delimited bilingual sentence pairs acquired from here.
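To make the idea of scheduled sampling concrete, here is a minimal, framework-free sketch. At each decoding step during training, the decoder is fed either the ground-truth previous token or its own previous prediction, and the probability of using the ground truth decays as training progresses. The function names, the inverse-sigmoid decay, and the constant `k` are assumptions made for illustration; the schedule actually used by train.py is not described in this README.

```python
import math
import random


def teacher_forcing_prob(step, k=1000.0):
    """Inverse-sigmoid decay: near 1 early in training, falling toward 0.

    `k` controls how quickly the decay happens (an assumed value).
    The exponent is capped so very large step counts cannot overflow.
    """
    return k / (k + math.exp(min(step / k, 700.0)))


def choose_decoder_input(gold_token, predicted_token, step, rng=random):
    """Pick the token that is fed to the decoder at the next time step.

    With probability teacher_forcing_prob(step) the ground-truth token is
    used; otherwise the model's own previous prediction is fed back in.
    """
    if rng.random() < teacher_forcing_prob(step):
        return gold_token
    return predicted_token
```

Early in training the decoder almost always sees the reference translation, which stabilizes learning; later it mostly sees its own outputs, which is closer to the conditions it faces at inference time.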

Implementation Architecture

The model is trained end-to-end, with stacked RNNs for sequence encoding and decoding. The decoder is additionally conditioned on a context vector when predicting the next token in the sequence. This vector is computed by an attention mechanism at each time step. Intuitively, the decoder draws on the information gathered by the encoder by deciding how relevant each encoder output is at each step of the decoding process.
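The context vector computation for a single decoding step can be sketched as follows. This is plain Python on lists rather than PyTorch tensors, so that the arithmetic is visible. It assumes a dot-product alignment score, which is one of the scoring functions from the paper; which score attention.py actually implements is not stated here, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    decoder_state: list of floats, the current decoder hidden state.
    encoder_states: list of equally sized lists, one per source token.
    """
    # Dot-product alignment score between the decoder state and each encoding.
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the encoder states.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(len(decoder_state))
    ]
    return weights, context


weights, context = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy call the first encoder state aligns with the decoder state, so it receives the larger weight and dominates the context vector.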

Results

| Input Sequence (English)    | Output Sequence (Spanish)   |
| --------------------------- | --------------------------- |
| how are you doing           | estas haciendo              |
| i am going to the store     | voy a la tienda             |
| she is a scientist          | ella es cientifico          |
| he is an engineer           | el es un ingeniero          |
| i am going out to the city  | voy al la de la ciudad      |
| i am running out of ideas   | me estoy quedando sin ideas |

Prerequisites

Usage

To train a new language model, invoke train.py with the abbreviation of the language you would like to translate English into. For instance, English can be translated to Afrikaans by specifying 'afr' as input, in which case 'afr.txt' in the data directory will be used. Other languages can be acquired from here. (The default input language is English.)

./train.sh
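As a rough illustration of how a language abbreviation could map to a training file, the sketch below reads tab-delimited sentence pairs from `<data_dir>/<lang>.txt`. The function name `load_pairs`, the column order (English first, target language second), and the exit-on-missing-file behavior are assumptions; the actual loading logic lives in etl.py and may differ.

```python
import os
import sys


def load_pairs(lang, data_dir="data"):
    """Read tab-delimited sentence pairs from <data_dir>/<lang>.txt."""
    path = os.path.join(data_dir, lang + ".txt")
    if not os.path.isfile(path):
        # A mistyped prefix such as 'afk' would end up here.
        sys.exit("No data file for language prefix '%s': %s" % (lang, path))
    pairs = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            # Assumed layout: source sentence, then target sentence.
            pairs.append((fields[0], fields[1]))
    return pairs
```

Calling `load_pairs("afr")` would then return a list of `(english, afrikaans)` tuples, provided `data/afr.txt` exists and follows the assumed layout.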

To translate an input sequence in English into another language, invoke eval.py and specify the desired language and sentence. The program will exit if the language model parameters are not found in the data directory or if the language prefix is mistyped.

./eval.sh

Files

  • attention.py

    Attention nn module that is responsible for computing the alignment scores.

  • attention_decoder.py

    Recurrent neural network that makes use of gated recurrent units to translate encoded inputs using attention.

  • encoder.py

    Recurrent neural network that encodes a given input sequence.

  • etl.py

    Helper functions for data extraction, transformation, and loading.

  • eval.py

    Script for evaluating the sequence-to-sequence model.

  • helpers.py

    General helper functions.

  • language.py

    Class that keeps a record of a corpus. Attributes such as vocabulary counts and tokens are stored on instances of this class.

  • train.py

    Script for training a new sequence-to-sequence model.
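To give a feel for the kind of bookkeeping described for language.py, here is a minimal sketch of a class that records vocabulary counts and token indices for a corpus. The attribute names, the `<sos>` and `<eos>` markers, and the whitespace tokenization are assumptions for illustration, not the repository's actual interface.

```python
from collections import Counter


class Language:
    """Hypothetical vocabulary record for one side of a parallel corpus."""

    def __init__(self, name):
        self.name = name
        self.word_counts = Counter()
        # Reserve the first two indices for start and end markers.
        self.index_to_word = ["<sos>", "<eos>"]
        self.word_to_index = {"<sos>": 0, "<eos>": 1}

    def add_sentence(self, sentence):
        """Count every word and assign an index to each new one."""
        for word in sentence.split():
            self.word_counts[word] += 1
            if word not in self.word_to_index:
                self.word_to_index[word] = len(self.index_to_word)
                self.index_to_word.append(word)

    def encode(self, sentence):
        """Map a sentence to token indices, ending with the <eos> index."""
        indices = [self.word_to_index[word] for word in sentence.split()]
        return indices + [self.word_to_index["<eos>"]]


english = Language("eng")
english.add_sentence("i am going to the store")
english.add_sentence("i am running out of ideas")
```

Index sequences produced by `encode` are what an encoder or decoder embedding layer would typically consume; note that this sketch raises `KeyError` for words it has never seen.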