Implementation of the paper "Dynamic Coattention Networks for Question Answering" in PyTorch.
Several deep learning models have been proposed for question answering. However, due to their single-pass nature, they have no way to recover from local maxima corresponding to incorrect answers. To address this problem, we introduce the Dynamic Coattention Network (DCN) for question answering. The DCN first fuses co-dependent representations of the question and the document in order to focus on relevant parts of both. Then a dynamic pointing decoder iterates over potential answer spans. This iterative procedure enables the model to recover from initial local maxima corresponding to incorrect answers. On the Stanford question answering dataset, a single DCN model improves the previous state of the art from 71.0% F1 to 75.9%, while a DCN ensemble obtains 80.4% F1.
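The iterative decoding described above can be illustrated with a small, framework-free sketch. It shows only the control flow: alternate between re-estimating the start and the end of the answer span until the estimate stops changing or an iteration limit is reached. The names iterative_span_decode, toy_score_start and toy_score_end are hypothetical and do not come from this repository; in the actual model the scores are produced by learned networks, whereas here they are hand-written toy functions.

```python
from typing import Callable, List, Tuple

# Hypothetical scorer type: given the current (start, end) estimate, return one
# score per document position. Plain functions stand in for learned networks.
Scorer = Callable[[int, int], List[float]]


def argmax(scores: List[float]) -> int:
    """Index of the largest score (the first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def iterative_span_decode(
    score_start: Scorer,
    score_end: Scorer,
    max_iters: int = 4,  # illustrative default
) -> Tuple[int, int, int]:
    """Alternate start/end updates until the span stops changing.

    Returns (start, end, iterations_used).
    """
    start, end = 0, 0  # arbitrary initial guess
    for step in range(1, max_iters + 1):
        new_start = argmax(score_start(start, end))
        # The end update sees the freshly updated start.
        new_end = argmax(score_end(new_start, end))
        if (new_start, new_end) == (start, end):
            return start, end, step  # converged
        start, end = new_start, new_end
    return start, end, max_iters


def one_hot(index: int, length: int = 6) -> List[float]:
    return [1.0 if i == index else 0.0 for i in range(length)]


def toy_score_start(start: int, end: int) -> List[float]:
    # Toy rule: the start estimate moves to position 3 only once the end
    # estimate has reached position 4 or later; before that it prefers 1.
    return one_hot(3 if end >= 4 else 1)


def toy_score_end(start: int, end: int) -> List[float]:
    # Toy rule: the preferred end depends on the current start.
    return one_hot({1: 4, 3: 5}.get(start, 0))


span = iterative_span_decode(toy_score_start, toy_score_end)
single_pass = iterative_span_decode(toy_score_start, toy_score_end, max_iters=1)
```

With these toy scorers a single pass stops at the span (1, 4), while further iterations move to (3, 5) and then stay there, which mirrors the recovery from an initial local maximum.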
You can set up the repository on your local machine by either downloading it or running the following command in a terminal.
git clone https://github.com/h3lio5/dynamic-coattention-networks-pytorch.git
All dependencies required by this repository can be installed by creating a virtual environment with Python 3.7 and running
python3 -m venv .env
source .env/bin/activate
pip install -r requirements.txt
pip install -e .
All the data required to train and evaluate can be downloaded and preprocessed by running
python dcn/utils/preprocess.py
To train your own model from scratch, run
python train.py
- The parameters for your experiment are all set by default, but you are free to change them in the config.py file.
- The training script will create a checkpoints folder, as specified in your config.py file.
- This folder will contain the model parameters saved after each epoch.
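As a rough illustration of how a configuration object could drive per-epoch checkpoint naming, consider the sketch below. Every name in it (Config, checkpoint_dir, num_epochs, batch_size, checkpoint_path, the epoch_NNN.pt file pattern) is a hypothetical placeholder chosen for the example and is not taken from the actual config.py or train.py.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Config:
    # All field names and defaults are hypothetical placeholders,
    # not the actual contents of config.py.
    checkpoint_dir: str = "checkpoints"
    num_epochs: int = 3
    batch_size: int = 32


def checkpoint_path(config: Config, epoch: int) -> Path:
    """Return the file that a given epoch's parameters would be written to."""
    return Path(config.checkpoint_dir) / f"epoch_{epoch:03d}.pt"


# Override one default, as you would by editing the config file.
config = Config(num_epochs=2)
paths = [checkpoint_path(config, epoch) for epoch in range(1, config.num_epochs + 1)]
```

The sketch only computes paths and writes nothing to disk. A real training loop would save the model parameters to each path at the end of the corresponding epoch.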
This repository contains the following files and folders
- images: Contains media for readme.md.
- dcn/data_loader.py: Contains helper functions that load data.
- generate.py: Used to generate style-transferred text from trained models.
- dcn/model.py: Contains code to build the model.
- requirements.txt: Lists dependencies for easy setup in virtual environments.
- train.py: Contains code to train models from scratch.
- dcn/utils/preprocess.py: Contains code to download and preprocess data.
- dcn/utils/vocab.py: Contains code to generate vocabulary and word embeddings.
- dcn/config.py: Contains information about various file paths and model configurations.
[ ] Model Evaluation
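Model evaluation is still an open item. For orientation, the F1 figures quoted from the paper are token-overlap scores between a predicted answer and a reference answer. The sketch below is an approximation modeled on the commonly used SQuAD-style metric (lowercasing, stripping punctuation and English articles, then comparing token multisets). It is not this repository's evaluation code, and the function names normalize_answer and f1_score are assumptions made for the example.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between one prediction and one reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


score = f1_score(
    "the Dynamic Coattention Network",
    "Dynamic Coattention Network (DCN)",
)
```

In this example the prediction has three tokens after normalization, all of which appear in the four-token reference, so precision is 1.0 and recall is 0.75. A full evaluation would typically take the best score over all reference answers for a question and average over the dataset.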