This is the project repo for the USC CS561 HW3 MNIST project: a neural network implemented in pure NumPy for handwritten digit recognition.
Training and testing share a total running-time limit of 30 minutes; GPU acceleration and multiprocessing are not allowed.
The data consists of 10,000 training and 10,000 test images selected from the MNIST dataset. Extract by running
Split the data into training and validation sets at a 9:1 ratio
Augment the training data with scaling, rotation, and translation
Shuffle and create batches
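
The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the function names `train_val_split`, `translate`, and `iterate_batches` are hypothetical, images are assumed to be arrays of shape `(N, 28, 28)`, and only translation is shown because scaling and rotation need interpolation that is longer to write in pure NumPy.

```python
import numpy as np

def train_val_split(images, labels, val_fraction=0.1, seed=0):
    """Shuffle once, then hold out val_fraction of the samples (9:1 by default)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(len(images) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])

def translate(image, dy, dx):
    """Shift a 2-D image by (dy, dx) pixels, filling vacated pixels with zeros."""
    h, w = image.shape
    shifted = np.zeros_like(image)
    # Part of the source image that stays inside the frame after the shift.
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    top, left = max(0, dy), max(0, dx)
    shifted[top:top + src.shape[0], left:left + src.shape[1]] = src
    return shifted

def iterate_batches(images, labels, batch_size=128, seed=0):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield images[idx], labels[idx]
```

Passing a different `seed` to `iterate_batches` on each epoch gives a fresh shuffle per epoch, which is one common choice; the source does not say how often the data is reshuffled.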
- Structure: 4-layer MLP
- Activation: Tanh
- Loss func: Cross entropy loss
- Optimizer: Adam
- lr: 1e-4
- batch size: 128
- epoch: 20 (with early termination)
- beta1, beta2: 0.9, 0.999
- weight decay: None
- cross validation: None
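
A configuration like the one listed above could be wired together as in the sketch below. It is an assumption-laden illustration rather than this repo's implementation: "4-layer" is read here as four weight layers, the hidden sizes passed to `init_params` are placeholders, and the names `init_params`, `forward`, `softmax_cross_entropy`, `backward`, `Adam`, and `train_step` are all hypothetical. Early termination is left out.

```python
import numpy as np

def init_params(sizes, seed=0):
    """One [W, b] pair per layer; weights scaled by 1/sqrt(fan_in)."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(0.0, np.sqrt(1.0 / n_in), size=(n_in, n_out))
        params.append([W, np.zeros(n_out)])
    return params

def forward(params, x):
    """Return the input plus every layer output; tanh on all but the last layer."""
    activations = [x]
    for i, (W, b) in enumerate(params):
        z = activations[-1] @ W + b
        activations.append(z if i == len(params) - 1 else np.tanh(z))
    return activations

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy loss and its gradient with respect to the logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    rows = np.arange(len(labels))
    loss = -log_probs[rows, labels].mean()
    grad = np.exp(log_probs)
    grad[rows, labels] -= 1.0
    return loss, grad / len(labels)

def backward(params, activations, grad_logits):
    """Backpropagate through the layers; returns [dW, db] per layer."""
    grads, delta = [], grad_logits
    for i in reversed(range(len(params))):
        W, _ = params[i]
        grads.append([activations[i].T @ delta, delta.sum(axis=0)])
        if i > 0:
            # tanh'(z) = 1 - tanh(z)^2, and activations[i] already holds tanh(z).
            delta = (delta @ W.T) * (1.0 - activations[i] ** 2)
    return grads[::-1]

class Adam:
    def __init__(self, params, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [[np.zeros_like(p) for p in layer] for layer in params]
        self.v = [[np.zeros_like(p) for p in layer] for layer in params]
        self.t = 0

    def step(self, params, grads):
        """Update params in place using bias-corrected moment estimates."""
        self.t += 1
        for layer, g, m, v in zip(params, grads, self.m, self.v):
            for k in range(2):  # k = 0 is the weight matrix, k = 1 the bias
                m[k] = self.beta1 * m[k] + (1 - self.beta1) * g[k]
                v[k] = self.beta2 * v[k] + (1 - self.beta2) * g[k] ** 2
                m_hat = m[k] / (1 - self.beta1 ** self.t)
                v_hat = v[k] / (1 - self.beta2 ** self.t)
                layer[k] -= self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

def train_step(params, optimizer, x, labels):
    """One forward/backward pass and one Adam update; returns the batch loss."""
    activations = forward(params, x)
    loss, grad_logits = softmax_cross_entropy(activations[-1], labels)
    optimizer.step(params, backward(params, activations, grad_logits))
    return loss
```

For example, `params = init_params([784, 64, 32, 16, 10])` followed by `opt = Adam(params)` sets up a network for flattened 28x28 inputs, and calling `train_step(params, opt, x, labels)` once per batch performs the training loop body.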
Accuracy on the hidden train and test sets on Vocareum: 0.9723