GitHub - paniabhisek/VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition

VGG

This repo is an attempt to implement the paper

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, A. Zisserman

ICLR 2015 (oral)

[arXiv (updated 10 Apr 2015)] [ILSVRC 2014 presentation] [Project page & ILSVRC ConvNet models]

in TensorFlow. The initial data.py, utils.py, and logs.py are taken from AlexNet.

Dataset

Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei. (* = equal contribution) ImageNet Large Scale Visual Recognition Challenge. arXiv:1409.0575, 2014. paper | bibtex

Dataset info:

Link: ILSVRC2010
Training size: 1261406 images
Validation size: 50000 images
Test size: 150000 images
Dataset size: 124 GB

To save time:

I got one corrupted image (n02487347_1956.JPEG). The error read: Can not identify image file '/path/to/image/n02487347_1956.JPEG n02487347_1956.JPEG. This happened when I read the image using PIL. Before using this code, please make sure you can open n02487347_1956.JPEG using PIL. If you cannot, delete the image; you won't lose anything by deleting 1 image out of more than 1 million.
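The check described above can be scripted. The following is a minimal sketch, not code from this repo: the helper names `find_unreadable_images` and `_open_with_pil` are hypothetical, and the file extensions it scans for are an assumption. It walks a directory and reports every image that PIL fails to decode.

```python
import os


def _open_with_pil(path):
    """Fully decode one image with Pillow; raises if the file is unreadable."""
    from PIL import Image  # imported lazily so only this helper needs Pillow

    with Image.open(path) as img:
        img.load()  # force a full decode, not just a header read


def find_unreadable_images(root, opener=_open_with_pil, extensions=(".jpeg", ".jpg")):
    """Return the sorted paths under `root` that `opener` fails to read."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                opener(path)
            except Exception:  # Pillow typically raises an OSError subclass here
                bad.append(path)
    return sorted(bad)
```

Calling `find_unreadable_images("/path/to/image")` once before training should list the corrupted file, if it is present, so it can be deleted up front.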

So I trained on 1261405 images using an 8 GB GPU.

How to Run

To train: python model.py <path-to-training-data> --train true --test false
To test: python model.py <path-to-training-data> --train false --test true
screenlog-train.0: The log file after running python model.py <path-to-training-data> --train true in screen
Model and logs: Google Drive
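The commands above pass `true` and `false` as strings. As an illustration only (this is not the actual argument handling in model.py, and the names `str2bool`, `build_parser`, and `image_path`, as well as the default values, are assumptions), such flags could be parsed with argparse like this:

```python
import argparse


def str2bool(value):
    """Convert 'true'/'false' style strings into a bool for argparse."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "1"):
        return True
    if lowered in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError("expected true or false, got %r" % value)


def build_parser():
    """Build a parser for: <path> --train <bool> --test <bool> (hypothetical)."""
    parser = argparse.ArgumentParser(description="Train or test VGG (illustrative sketch)")
    parser.add_argument("image_path", help="path to the training data")
    # Defaults below are assumptions, not taken from the repo.
    parser.add_argument("--train", type=str2bool, default=True)
    parser.add_argument("--test", type=str2bool, default=False)
    return parser
```

A custom converter is used because `type=bool` would treat the string "false" as `True`: any non-empty string is truthy in Python.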

Preprocessing

The following preprocessing steps are performed:

Rescaling: Isotropically rescale the image such that the smallest side is randomly drawn from [256, 512]. Isotropic means that the aspect ratio (width to height) of the original image is preserved in the new image.
Cropping: Randomly crop the image from the rescaled image to get a size of (224, 224).
Augmentation: Augment the data in two ways
1. Horizontally flip the image with 50 % probability
2. Add PCA-based color noise, as described in the AlexNet paper, to the processed image to shift its colors.
Subtract mean: Finally subtract the mean activity from the processed image.
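The geometric steps above (rescaling, cropping, flipping) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in data.py: the function names are hypothetical, and the smallest side is assumed to be drawn as an integer, uniformly and inclusive of both ends. It only computes the sizes and the crop box; the pixel operations themselves are left to an image library.

```python
import random


def rescaled_size(width, height, rng, s_min=256, s_max=512):
    """Pick a smallest side S in [s_min, s_max] and keep the aspect ratio."""
    smallest_side = rng.randint(s_min, s_max)  # inclusive on both ends
    scale = smallest_side / min(width, height)
    # max() guards against rounding pushing a side below S
    new_width = max(smallest_side, round(width * scale))
    new_height = max(smallest_side, round(height * scale))
    return new_width, new_height


def random_crop_box(width, height, rng, crop=224):
    """Return (left, top, right, bottom) of a random crop x crop window."""
    left = rng.randint(0, width - crop)
    top = rng.randint(0, height - crop)
    return left, top, left + crop, top + crop


def should_flip(rng, probability=0.5):
    """Decide whether to mirror the crop horizontally."""
    return rng.random() < probability


def sample_geometry(width, height, seed=None):
    """Draw one random rescale size, crop box, and flip decision."""
    rng = random.Random(seed)
    new_w, new_h = rescaled_size(width, height, rng)
    box = random_crop_box(new_w, new_h, rng)
    return {"size": (new_w, new_h), "crop_box": box, "flip": should_flip(rng)}
```

With PIL, the result could be applied using `Image.resize(size)`, `Image.crop(crop_box)`, and `ImageOps.mirror`.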

Note: Calculating the eigenvalues and eigenvectors for the ImageNet dataset requires a significant amount of RAM, so the values are taken from Stack Overflow and hardcoded when adding PCA.
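For reference, the AlexNet paper's color augmentation adds the same RGB offset to every pixel: the eigenvector matrix times the eigenvalues, each scaled by a Gaussian draw with standard deviation 0.1. The sketch below assumes that formulation; the function names are hypothetical, and the eigenvalues and eigenvectors are parameters because the hardcoded values used in this repo are not reproduced here.

```python
import random


def pca_color_offset(eigenvalues, eigenvectors, rng, sigma=0.1):
    """Return the RGB offset [p1 p2 p3] @ [a1*l1, a2*l2, a3*l3]^T.

    `eigenvectors` is a 3x3 nested list whose COLUMNS are the eigenvectors p_i.
    """
    alphas = [rng.gauss(0.0, sigma) for _ in range(3)]  # one draw per component
    scaled = [a * lam for a, lam in zip(alphas, eigenvalues)]
    return [
        sum(eigenvectors[row][col] * scaled[col] for col in range(3))
        for row in range(3)
    ]


def shift_pixel(pixel, offset):
    """Add the per-image RGB offset to a single [r, g, b] pixel."""
    return [channel + delta for channel, delta in zip(pixel, offset)]
```

The offset is drawn once per image and then added to all of its pixels, so the whole image shifts in color together.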

Tensorflow Generated Graphs

top1 accuracy:

top5 accuracy:

loss:

Accuracies

Top1 accuracy: 67.1013%

Top5 accuracy: 85.1460%
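For readers unfamiliar with the metric: a prediction counts as a top-k hit when the true class is among the k highest-scoring classes. The following is a minimal sketch of that definition, not the evaluation code in model.py; `top_k_accuracy` is a hypothetical helper.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # class indices sorted from highest to lowest score
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)
```

Top-1 accuracy is the special case k=1, where only the single highest-scoring class is compared against the label.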

