🌺 Project Website | 📑 Paper
This repository contains the official PyTorch implementation of our ACML 2024 paper: Motion Meets Attention: Video Motion Prompts. We provide PyTorch code for training and testing our Video Motion Prompts (VMPs) layer using the TimeSformer model. Additionally, feel free to explore the real-time demo of our VMPs layer on the project website for a more intuitive understanding.
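Before diving into setup, the sketch below conveys the core idea behind a motion prompt layer: frame differences are turned into a learnable sigmoid-style attention map that modulates the video. This is an illustrative toy reimplementation, not the code shipped in this repo; the `slope`/`shift` parameter names and the exact modulation are our assumptions based on the paper's high-level description.

```python
import torch
import torch.nn as nn

class MotionPromptSketch(nn.Module):
    """Illustrative sketch of a video motion prompt layer (not the official code).

    Given a clip of shape (B, T, C, H, W), it computes frame differences,
    converts them into attention maps via a sigmoid with learnable slope and
    shift (assumed scalars), and uses the maps to modulate the frames.
    """
    def __init__(self):
        super().__init__()
        self.slope = nn.Parameter(torch.tensor(1.0))   # learnable sharpness
        self.shift = nn.Parameter(torch.tensor(0.0))   # learnable threshold

    def forward(self, video):                          # video: (B, T, C, H, W)
        diff = video[:, 1:] - video[:, :-1]            # frame differences: (B, T-1, C, H, W)
        energy = diff.abs().mean(dim=2, keepdim=True)  # per-pixel motion energy
        attn = torch.sigmoid(self.slope * (energy - self.shift))  # attention maps
        return video[:, 1:] * attn                     # motion-prompted frames

x = torch.randn(2, 8, 3, 224, 224)                     # dummy clip: batch 2, 8 frames
prompts = MotionPromptSketch()(x)
print(prompts.shape)                                   # torch.Size([2, 7, 3, 224, 224])
```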
- Clone the repository:
```bash
git clone git@github.com:q1xiangchen/VMPs.git
cd VMPs
```
- Install the required dependencies below (refer to environment.yml or requirements.txt for more details):
- Basic dependencies:
- python >= 3.7
- pytorch >= 1.8.1
- Required packages:
- torchvision: `pip install torchvision` or `conda install torchvision -c pytorch`
- fvcore: `pip install 'git+https://github.com/facebookresearch/fvcore'`
- simplejson: `pip install simplejson`
- einops: `pip install einops`
- timm: `pip install timm`
- PyAV: `conda install av -c conda-forge`
- psutil: `pip install psutil`
- scikit-learn: `pip install scikit-learn`
- OpenCV: `pip install opencv-python`
- tensorboard: `pip install tensorboard`
- wandb: `pip install wandb`
- h5py: `pip install h5py`
- Lastly, build the TimeSformer codebase by running:
```bash
python setup.py build develop
```
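After the build, an optional sanity check (our suggestion, not part of the official instructions) is to confirm that PyTorch and the compiled codebase import cleanly; the package name `timesformer` is assumed from the upstream TimeSformer repository:

```python
# Optional post-install sanity check (a suggestion, not from the official docs).
import torch
import timesformer  # package name assumed from the upstream TimeSformer codebase

print("torch:", torch.__version__)                  # should be >= 1.8.1
print("CUDA available:", torch.cuda.is_available())
```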
Please use the dataset preparation instructions provided in DATASET.md.
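If you are unsure what the prepared annotation files should look like, the upstream TimeSformer convention (an assumption here; DATASET.md is authoritative) is one space-separated `path label` pair per line in `train.csv`, `val.csv`, and `test.csv`:

```
/path/to/video_1.mp4 0
/path/to/video_2.mp4 3
/path/to/video_3.mp4 7
```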
To train and test the VMPs layer with TimeSformer model, please refer to the instructions in TRAIN.md.
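For orientation, a typical launch following the upstream TimeSformer interface looks like the sketch below; the config path and overrides are placeholders, so use the exact configs and flags given in TRAIN.md:

```bash
# Placeholder config path -- substitute the experiment config from TRAIN.md.
python tools/run_net.py \
  --cfg configs/your_experiment.yaml \
  DATA.PATH_TO_DATA_DIR /path/to/dataset \
  NUM_GPUS 1 \
  TRAIN.BATCH_SIZE 8
```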
| Model | Pretrained dataset | Fine-tuned dataset | Link |
|---|---|---|---|
| TimeSformer | Kinetics-600 | - | Download |
| TimeSformer (Baseline) | Kinetics-600 | MPII-Cooking-2 | Download |
| VMPs + TimeSformer | Kinetics-600 | MPII-Cooking-2 | Download |
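A downloaded checkpoint can be inspected with plain PyTorch; the file name below is a placeholder, and the `model_state` key is an assumption carried over from the PySlowFast-style checkpoints used by the upstream TimeSformer code:

```python
import torch

# Placeholder file name; inspect ckpt.keys() if the assumed "model_state" key is absent.
ckpt = torch.load("vmps_timesformer.pyth", map_location="cpu")
state_dict = ckpt.get("model_state", ckpt)
print(sorted(state_dict.keys())[:5])  # peek at the first few parameter names
```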
Our code is released under the MIT license; see LICENSE for more information. Portions of the TimeSformer codebase are released under the CC-BY-NC 4.0 license; see LICENSE_TIMESFORMER for more information.
If you find VMPs useful in your research, please consider 📝 citing our paper or ⭐️ starring our repo:
```bibtex
@inproceedings{chen2024motion,
  title={Motion meets Attention: Video Motion Prompts},
  author={Qixiang Chen and Lei Wang and Piotr Koniusz and Tom Gedeon},
  booktitle={The 16th Asian Conference on Machine Learning (Conference Track)},
  year={2024},
  url={https://openreview.net/forum?id=nIDAT99Vhb}
}
```
Qixiang Chen conducted this research under the supervision of Lei Wang for his final year honors research project at ANU. He is a recipient of research sponsorship from Space Zero Investments Pty Ltd in Perth, Western Australia, including The Active Intelligence Research Challenge Award. This work was also supported by the NCI Adapter Scheme Q4 2023, the NCI National AI Flagship Merit Allocation Scheme, and the National Computational Merit Allocation Scheme 2024 (NCMAS 2024), with computational resources provided by NCI Australia, an NCRIS-enabled capability supported by the Australian Government.
This codebase is built on top of facebookresearch's TimeSformer, and we thank the authors for their work.