MNIST-Autoencoder

The goal was to remove noise and irregularities from MNIST digits using TensorFlow (reproducing results originally obtained in 2006). Below are two digits, called A and B.

Original (Digit A): Corrupted (Digit A):

Original (Digit B): Corrupted (Digit B):
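
This README does not spell out how the digits are corrupted, so the following is a minimal sketch assuming additive Gaussian noise; the `corrupt` helper and `noise_std` parameter are names introduced here for illustration, and the modern `tf.keras` API is used rather than the repository's original TensorFlow code.

```python
import tensorflow as tf

# Load MNIST and flatten each 28x28 digit to a 784-dimensional vector in [0, 1].
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

def corrupt(images, noise_std=0.5):
    # Additive Gaussian noise, clipped back into the valid pixel range.
    noise = tf.random.normal(tf.shape(images), stddev=noise_std)
    return tf.clip_by_value(images + noise, 0.0, 1.0)

x_train_noisy = corrupt(x_train)
x_test_noisy = corrupt(x_test)
```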

The network (with model layers of size [784, 400, n, 400, 784], where n is the size of the encoded layer) then attempts to reconstruct the original digit from the corrupted version.
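
A minimal sketch of such a model is given below, written with `tf.keras`; the Adam optimizer, MSE loss, and sigmoid activations are assumptions, and the original code may be structured quite differently. It reuses the `x_train_noisy` / `x_train` arrays from the corruption sketch above.

```python
def build_autoencoder(n_code=20):
    # Fully connected autoencoder with layer sizes [784, 400, n, 400, 784].
    inputs = tf.keras.Input(shape=(784,))
    h = tf.keras.layers.Dense(400, activation="sigmoid")(inputs)
    code = tf.keras.layers.Dense(n_code, activation="sigmoid")(h)  # encoded layer of size n
    h = tf.keras.layers.Dense(400, activation="sigmoid")(code)
    outputs = tf.keras.layers.Dense(784, activation="sigmoid")(h)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")  # assumed optimizer and loss
    return model

# Train to map corrupted digits back to their clean originals.
model = build_autoencoder(n_code=20)
model.fit(x_train_noisy, x_train, epochs=20, batch_size=128,
          validation_data=(x_test_noisy, x_test))
denoised = model.predict(x_test_noisy)  # reconstructed digits
```

Results for several code-layer sizes after 20 training epochs: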

| Code Layer Size (n) | Asymptotic Error (20 epochs) | Reconstructed Digit A | Reconstructed Digit B |
| --- | --- | --- | --- |
| 10 | 2.2e5 | (image) | (image) |
| 20 | 1.5e5 | (image) | (image) |
| 30 | 1.0e5 | (image) | (image) |

It is evident that a coding layer of size 10 is insufficient to reconstruct the original images: the reconstructed '4' resembles a '9' and the '5' a '6'.

A coding layer of size 20 accurately reconstructs the digits and removes irregularities, such as the swish on the tail of the original '4'.

A coding layer of size 30 also accurately reconstructs the digits but starts to retain irrelevant information such as the tail of the original '4'.

Conclusion


At least for this network model of [784, 400, n, 400, 784], the best n found was around 20, where 'best' is defined as a balance between accurately reconstructing the image and not retaining irrelevant features of the original.

Afterthoughts


I have since been informed that using sigmoid activations is somewhat outdated, and that ReLU provides sufficient non-linearity and trains faster.
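
For example, the hidden layers in the sketch above could swap sigmoid for ReLU, keeping sigmoid on the output so the reconstructed pixel values stay in [0, 1]; this is a suggested variant, not what the repository currently does.

```python
h = tf.keras.layers.Dense(400, activation="relu")(inputs)
code = tf.keras.layers.Dense(n_code, activation="relu")(h)
h = tf.keras.layers.Dense(400, activation="relu")(code)
outputs = tf.keras.layers.Dense(784, activation="sigmoid")(h)
```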

Pre-training the weights (G. E. Hinton and R. R. Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks", 2006), rather than mirroring the initial weights of the encoder and decoder, may allow for smaller coding layers when training for 20 epochs.
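
Hinton and Salakhutdinov pre-train with stacked restricted Boltzmann machines; the sketch below substitutes a simpler greedy scheme, pre-training each encoder layer as a shallow autoencoder before fine-tuning the full [784, 400, n, 400, 784] network. The `pretrain_layer` helper is hypothetical and not part of this repository.

```python
def pretrain_layer(data, units, epochs=5):
    # Fit a shallow autoencoder on `data` and return the trained encoder layer
    # plus its codes, which feed the next layer's pre-training.
    inp = tf.keras.Input(shape=(data.shape[1],))
    enc = tf.keras.layers.Dense(units, activation="sigmoid")
    out = tf.keras.layers.Dense(data.shape[1], activation="sigmoid")(enc(inp))
    shallow = tf.keras.Model(inp, out)
    shallow.compile(optimizer="adam", loss="mse")
    shallow.fit(data, data, epochs=epochs, batch_size=128, verbose=0)
    return enc, shallow.predict(data, verbose=0)

# Greedily pre-train the encoder stack 784 -> 400 -> 20, then reuse these
# trained layers (and mirrored decoder layers) when assembling the full
# autoencoder for fine-tuning.
layer1, codes1 = pretrain_layer(x_train, 400)
layer2, codes2 = pretrain_layer(codes1, 20)
```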
