GitHub - critias/dl4mt_exercises: Lab exercises for the DL4MT winter school at DCU

An Introduction to Deep Learning with Theano

Fundamentals for DL4MT I

Quick review of MLP & Backprop
Autoencoders and Stacked Auto-Encoders

Goal: Prep and setup.
Compare logistic regression, MLP, and stacked auto-encoders on the same data
Challenges: see the bottom of each notebook

Day 1

Please work through the notebooks in the following order:

If you are new to Theano, please start with the excellent tutorials from the Montreal Deep Learning Summer School 2015, and complete the following notebooks first:

intro_theano/intro_theano.ipynb
intro_theano/logistic_regression.ipynb

If you are already familiar with Theano, please work through the notebooks in this repository in the following order:

theano_logistic_regression.ipynb
mlp.ipynb
theano_autoencoder.ipynb
stacked_autoencoder.ipynb

Day 2

Please work through the notebooks in the following order:

accmulating_rnn.ipynb
create_brown_w2v_index.ipynb
recurrent_transition_with_lookup.ipynb

Completing the labs

As you work through the notebooks, you will notice that some of the cells use the %%writefile magic at the top of the cell to write the content to a file. This is done for classes and functions which will be used in later notebooks. In order for the save paths to work correctly, you need to be running ipython notebook inside the directory for that day.

Every time the cell is run, the file is overwritten, so if you want to modify the behavior of a class or function, just edit the cell where it is created and re-run it; the corresponding file will then be updated.
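The overwrite behavior described above can be sketched in plain Python. This is only an illustration of the idea, not how IPython implements %%writefile; the file name `example_module.py`, the function `add_one`, and the helper `write_cell` are hypothetical and do not come from the notebooks.

```python
import os
import runpy
import tempfile

# Hypothetical cell body. In a notebook this text would sit below a
# `%%writefile example_module.py` line at the top of the cell.
cell_source = "def add_one(x):\n    return x + 1\n"


def write_cell(path, source):
    """Write `source` to `path`, replacing any previous contents."""
    # Mode "w" truncates the file, so each run overwrites the old version.
    with open(path, "w") as f:
        f.write(source)


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example_module.py")

# First "run" of the cell: the file is created and its code can be loaded.
write_cell(path, cell_source)
result_first = runpy.run_path(path)["add_one"](1)

# Edit the cell and "run" it again: the file now holds the new definition.
write_cell(path, cell_source.replace("x + 1", "x + 2"))
result_second = runpy.run_path(path)["add_one"](1)
```

A later notebook that imports from the written file only sees the new definition after the cell has been re-run and the module has been loaded again, for example by restarting that notebook's kernel.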

Installation and Setup

The notebooks need to be run inside the day*/ directory, so, for example:

git clone https://github.com/chrishokamp/dl4mt_exercises.git
cd dl4mt_exercises/notebooks/day1
ipython notebook
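Because the save paths depend on the working directory, a quick check in the first cell of a notebook can catch a server started from the wrong place. The sketch below is not part of the labs; `is_day_directory` is a hypothetical helper that only assumes the directories are named `day1`, `day2`, and so on.

```python
import os
import re


def is_day_directory(path):
    """Return True if the last component of `path` looks like day1, day2, ..."""
    name = os.path.basename(os.path.normpath(path))
    return re.fullmatch(r"day\d+", name) is not None


# Check the directory the notebook server was started from.
in_day_dir = is_day_directory(os.getcwd())
if not in_day_dir:
    print("Warning: start the notebook server from inside a day*/ directory")
```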

Please also make sure that you are using the bleeding edge version of Theano from GitHub. Installation instructions are here.

These labs use Fuel to build, load, and iterate over datasets; please install it first.
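Before opening the notebooks, it may help to confirm that both dependencies can be found by the Python interpreter that runs them. The sketch below uses only the standard library; `find_missing` is a hypothetical helper. It checks that the `theano` and `fuel` packages are importable, not that Theano is the bleeding edge version.

```python
import importlib.util


def find_missing(module_names):
    """Return the names in `module_names` that cannot be located for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# `theano` and `fuel` are the import names of the two packages.
missing = find_missing(["theano", "fuel"])
if missing:
    print("Not installed:", ", ".join(missing))
```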

Memory and Disk Space Management

If you are using the Virtual Machine provided for the exercises, you may find the available memory getting low. Once you have worked through a notebook, you can close it to free up the RAM used by that kernel.

Dataset Description

You can see how our toy POS tagging dataset is created (and create your own versions) by looking at these notebooks:

notebooks/datasets/prep_pos_corpus.ipynb
notebooks/datasets/word_window_vectors.ipynb

Resources and Inspirations Used to Create these Tutorials

Most of the Theano code in these tutorials was taken from the excellent tutorials on deeplearning.net, and modified to be easy to use with an example Part-of-Speech tagging task using the Brown Corpus with the Universal Tagset POS tags.
