Chih-Yao Ma*, Min-Hung Chen*
(* equal contribution)
We examine and implement several leading techniques for Activity Recognition (video classification), while proposing and investigating a novel convolution on temporally-constructed feature vectors.
CNN as baseline, CNN + RNN (LRCN), Temporal CNN
The YouTube video above shows the top-3 predictions of our LRCN and Temporal CNN models. The text at the top is the ground truth, the three labels below it are the predictions of each method, and the bar next to each prediction shows how confident the model is in it.
We currently use the UCF101 dataset, which contains 13,320 videos from 101 action categories.
In the near future, we plan to move on to the Sports-1M dataset to see how our performance changes.
Our work is currently implemented in Torch, and depends on the following packages: torch/torch7, torch/nn, torch/nngraph, torch/image, cudnn ...
If you are on Ubuntu, please follow the instructions below to install Torch. For a more comprehensive installation guide, please check the Torch installation documentation.
$ git clone https://github.com/torch/distro.git ~/torch --recursive
$ cd ~/torch; bash install-deps;
$ ./install.sh
$ source ~/.bashrc
You will also need to install some of the packages we use through LuaRocks, which should already have been installed along with Torch.
$ luarocks install torch
$ luarocks install pl
$ luarocks install trepl
$ luarocks install image
$ luarocks install nn
$ luarocks install dok
$ luarocks install gnuplot
$ luarocks install qtlua
$ luarocks install sys
$ luarocks install xlua
$ luarocks install optim
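After the installation, it may be worth confirming that the command-line tools used throughout this README (`th`, `luarocks`, and `qlua`) can actually be found on your PATH. The snippet below is only a minimal sketch: the helper name `missing_commands` is our own illustration and is not part of Torch or LuaRocks.

```shell
# Hypothetical helper (not part of Torch or LuaRocks):
# print the name of every given command that is NOT found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# No output means every listed tool was found.
missing_commands th luarocks qlua
```

If a name is printed, re-run `source ~/.bashrc` or open a new terminal before reinstalling anything, since the Torch installer adds its binaries to PATH through your shell profile.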
If you would like to use CUDA on your NVIDIA graphics card, you will need to install the CUDA toolkit and some additional packages.
$ luarocks install cutorch
$ luarocks install cunn
Because we use the pre-trained ResNet model, the CUDNN package must be installed properly. First, download the package from NVIDIA (registration is required).
Then follow these steps:
$ tar -xzvf cudnn-7.0-linux-x64-v4.0-prod.tgz
$ cd cuda
$ sudo cp lib64/lib* /usr/local/cuda/lib64/
$ sudo cp include/cudnn.h /usr/local/cuda/include/
$ luarocks install cudnn
(Note: you may run into problems with CUDNN v5, since Torch can currently only detect CUDNN v4.)
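If you are unsure which CUDNN version ended up on your system, one way to check is to read the version macros from the installed header. This is a sketch under two assumptions: that the header sits at `/usr/local/cuda/include/cudnn.h` (the destination used in the copy step above), and that it defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL`. The function name `cudnn_header_version` is hypothetical.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL as declared in a cudnn.h file.
# Assumes the header contains "#define CUDNN_MAJOR <n>" style lines.
cudnn_header_version() {
    awk '
        /^#define CUDNN_MAJOR/      { major = $3 }
        /^#define CUDNN_MINOR/      { minor = $3 }
        /^#define CUDNN_PATCHLEVEL/ { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# Assumed header location; adjust it if CUDA is installed elsewhere.
header=/usr/local/cuda/include/cudnn.h
if [ -f "$header" ]; then
    cudnn_header_version "$header"
else
    echo "cudnn.h not found at $header"
fi
```

A v4 installation should report a version starting with `4.`; anything else suggests that a different CUDNN release is being picked up.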
We provide three different methods to train the models for activity recognition: CNN, CNN with RNN, and Temporal CNN.
Our models take the feature vectors generated by the first CNN as input for training. You can generate the features using our code under "/CNN_Spatial/", or download the feature vectors we generated (please refer to the Dropbox link below). We followed the first training/testing split from UCF-101. If you would like to compare with our results, please use the same training and testing lists, since the choice of split strongly affects overall performance.
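To double-check that you are working with the same split, it can help to count how many videos each class contributes to a list file. The sketch below assumes the list files shipped with the official UCF-101 distribution (for example `ucfTrainTestlist/trainlist01.txt`), in which each line starts with `ClassName/v_ClassName_gXX_cYY.avi`. Neither the path nor the helper name `count_per_class` comes from this repository, so adjust them to your setup.

```shell
# Hypothetical helper: count the entries per action class in a UCF-101 split list.
# Assumes each line looks like "ClassName/v_ClassName_g08_c01.avi [label]".
count_per_class() {
    awk -F/ 'NF > 1 { n[$1]++ } END { for (c in n) print c, n[c] }' "$1" | sort
}

# Assumed location of the first training list; change it to match your download.
list=ucfTrainTestlist/trainlist01.txt
if [ -f "$list" ]; then
    count_per_class "$list"
fi
```

Running the same function on the testing list and comparing both outputs against your own feature files is a quick way to spot a mismatched split.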
We use the RNN library provided by Element-Research. Simply install it by:
$ luarocks install rnn
After downloading the feature vectors, please modify the code in ./RNN/data.lua so that it points to the directory where you put your feature vector files.
To start the training process, go to ./RNN and simply execute:
$ th RNN_LSTM.lua
The training and testing performance will be plotted, and the results will be saved to log files. The learning rate and the best testing accuracy will be reported after each epoch in which either of them is updated.
To start the training process, go to ./TCNN and simply execute:
$ qlua run.lua -r 15e-5
For more details, please refer to the readme file in the folder ./TCNN/.
You also need to modify the code in ./TCNN/data.lua so that it points to the directory where you put your feature vector files.
The training and testing performance will be plotted, and the results will be saved to log files. The best testing accuracy will be reported after each epoch in which it is updated.
This work was initiated as a class project for the deep learning class at Georgia Tech in Spring 2016. We teamed up with Hao Yan and Casey Battaglino on this project; they have been a great help and provided valuable discussions along the way.
Chih-Yao Ma at [email protected] or [LinkedIn]
Min-Hung Chen at [email protected]
Last updated: 05/05/2016