🌺 Project Website | 📑 Paper
This repository contains the official PyTorch implementation of our ACML 2024 paper: Motion Meets Attention: Video Motion Prompts. We provide PyTorch code for training and testing our Video Motion Prompts (VMPs) layer using the TimeSformer model. Additionally, feel free to explore the real-time demo of our VMPs layer on the project website for a more intuitive understanding.
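Before diving into setup, the sketch below conveys the core idea behind a motion prompt layer: frame differences are turned into a learnable sigmoid-style attention map that modulates the video. This is an illustrative toy reimplementation, not the code shipped in this repo; the `slope`/`shift` parameter names and the exact modulation are our assumptions based on the paper's high-level description.

```python
import torch
import torch.nn as nn

class MotionPromptSketch(nn.Module):
    """Illustrative sketch of a video motion prompt layer (not the official code).

    Given a clip of shape (B, T, C, H, W), it computes frame differences,
    converts them into attention maps via a sigmoid with learnable slope and
    shift (assumed scalars), and uses the maps to modulate the frames.
    """
    def __init__(self):
        super().__init__()
        self.slope = nn.Parameter(torch.tensor(1.0))   # learnable sharpness
        self.shift = nn.Parameter(torch.tensor(0.0))   # learnable threshold

    def forward(self, video):                          # video: (B, T, C, H, W)
        diff = video[:, 1:] - video[:, :-1]            # frame differences: (B, T-1, C, H, W)
        energy = diff.abs().mean(dim=2, keepdim=True)  # per-pixel motion energy
        attn = torch.sigmoid(self.slope * (energy - self.shift))  # attention maps
        return video[:, 1:] * attn                     # motion-prompted frames

x = torch.randn(2, 8, 3, 224, 224)                     # dummy clip: batch 2, 8 frames
prompts = MotionPromptSketch()(x)
print(prompts.shape)                                   # torch.Size([2, 7, 3, 224, 224])
```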
- Clone the repository:
```bash
git clone git@github.com:q1xiangchen/VMPs.git
cd VMPs
```
- Install the required dependencies below (refer to environment.yml or requirements.txt for more details):
- Basic dependencies:
- python >= 3.7
- pytorch >= 1.8.1
- Required packages:
- torchvision: `pip install torchvision` or `conda install torchvision -c pytorch`
- fvcore: `pip install 'git+https://github.com/facebookresearch/fvcore'`
- simplejson: `pip install simplejson`
- einops: `pip install einops`
- timm: `pip install timm`
- PyAV: `conda install av -c conda-forge`
- psutil: `pip install psutil`
- scikit-learn: `pip install scikit-learn`
- OpenCV: `pip install opencv-python`
- tensorboard: `pip install tensorboard`
- wandb: `pip install wandb`
- h5py: `pip install h5py`
- Lastly, build the TimeSformer codebase by running:
```bash
python setup.py build develop
```
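After the build, an optional sanity check (our suggestion, not part of the official instructions) is to confirm that PyTorch and the compiled codebase import cleanly; the package name `timesformer` is assumed from the upstream TimeSformer repository:

```python
# Optional post-install sanity check (a suggestion, not from the official docs).
import torch
import timesformer  # package name assumed from the upstream TimeSformer codebase

print("torch:", torch.__version__)                  # should be >= 1.8.1
print("CUDA available:", torch.cuda.is_available())
```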
Please use the dataset preparation instructions provided in DATASET.md.
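If you are unsure what the prepared annotation files should look like, the upstream TimeSformer convention (an assumption here; DATASET.md is authoritative) is one space-separated `path label` pair per line in `train.csv`, `val.csv`, and `test.csv`:

```
/path/to/video_1.mp4 0
/path/to/video_2.mp4 3
/path/to/video_3.mp4 7
```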
To train and test the VMPs layer with TimeSformer model, please refer to the instructions in TRAIN.md.
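For orientation, a typical launch following the upstream TimeSformer interface looks like the sketch below; the config path and overrides are placeholders, so use the exact configs and flags given in TRAIN.md:

```bash
# Placeholder config path -- substitute the experiment config from TRAIN.md.
python tools/run_net.py \
  --cfg configs/your_experiment.yaml \
  DATA.PATH_TO_DATA_DIR /path/to/dataset \
  NUM_GPUS 1 \
  TRAIN.BATCH_SIZE 8
```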
| Model | Pretrained dataset | Fine-tuned dataset | Link |
|---|---|---|---|
| TimeSformer | Kinetics-600 | - | Download |
| TimeSformer (Baseline) | Kinetics-600 | MPII-Cooking-2 | Download |
| VMPs + TimeSformer | Kinetics-600 | MPII-Cooking-2 | Download |
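A downloaded checkpoint can be inspected with plain PyTorch; the file name below is a placeholder, and the `model_state` key is an assumption carried over from the PySlowFast-style checkpoints used by the upstream TimeSformer code:

```python
import torch

# Placeholder file name; inspect ckpt.keys() if the assumed "model_state" key is absent.
ckpt = torch.load("vmps_timesformer.pyth", map_location="cpu")
state_dict = ckpt.get("model_state", ckpt)
print(sorted(state_dict.keys())[:5])  # peek at the first few parameter names
```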
Our code is released under the MIT license; see LICENSE for more information. Portions of the TimeSformer codebase are released under the CC-BY-NC 4.0 license; see LICENSE_TIMESFORMER for more information.
If you find VMPs useful in your research, please consider 📝 citing our paper or ⭐️ starring our repo:
```bibtex
@inproceedings{chen2024motion,
  title={Motion meets Attention: Video Motion Prompts},
  author={Qixiang Chen and Lei Wang and Piotr Koniusz and Tom Gedeon},
  booktitle={The 16th Asian Conference on Machine Learning (Conference Track)},
  year={2024},
  url={https://openreview.net/forum?id=nIDAT99Vhb}
}
```
Qixiang Chen conducted this research under the supervision of Lei Wang for his final year honors research project at ANU. He is a recipient of research sponsorship from Space Zero Investments Pty Ltd in Perth, Western Australia, including The Active Intelligence Research Challenge Award. This work was also supported by the NCI Adapter Scheme Q4 2023, the NCI National AI Flagship Merit Allocation Scheme, and the National Computational Merit Allocation Scheme 2024 (NCMAS 2024), with computational resources provided by NCI Australia, an NCRIS-enabled capability supported by the Australian Government.
This codebase is built on top of facebookresearch's TimeSformer, and we thank the authors for their work.