Final Project for CS1470 Deep Learning Spring 2024 @ Brown University
Anushka Narayanan, Colden Bobowick, Kaitlyn Williams
Link to MS-ASL Dataset used for project: https://www.microsoft.com/en-us/download/details.aspx?id=100121
References:
[1] Joao Carreira and Andrew Zisserman. Quo vadis, action recognition? a new model and the kinetics dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6299–6308, 2017.
[2] Daniel Maturana and Sebastian Scherer. Voxnet: A 3d convolutional neural network for real-time object recognition. In 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS), pages 922–928. IEEE, 2015.
[3] Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, and Manohar Paluri. A closer look at spatiotemporal convolutions for action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6450–6459, 2018.
[4] Hamid Vaezi Joze and Oscar Koller. Ms-asl: A large-scale data set and benchmark for understanding american sign language. In The British Machine Vision Conference (BMVC), September 2019.
Model architecture is adapted from the TensorFlow tutorial for video classification: https://www.tensorflow.org/tutorials/video/video_classification