
Shape of Motion: 4D Reconstruction from a Single Video

Project Page | arXiv

Qianqian Wang1,2*, Vickie Ye1*, Hang Gao1*, Weijia Zeng1*, Jake Austin1, Zhengqi Li2, Angjoo Kanazawa1

1UC Berkeley   2Google Research

* Equal Contribution

New

We have preprocessed the Nvidia dataset and the custom dataset, which can be found here. We used MegaSaM to obtain cameras and depths for the custom dataset.

Training

To train on the Nvidia dataset:

python run_training.py \
  --work-dir <OUTPUT_DIR> \
  data:nvidia \
  --data.data-dir </path/to/data>

To train on a custom dataset:

python run_training.py \
  --work-dir <OUTPUT_DIR> \
  data:custom \
  --data.data-dir </path/to/data>

Train with 2D Gaussian Splatting

To get better scene geometry, we use 2D Gaussian Splatting:

python run_training.py \
  --work-dir <OUTPUT_DIR> \
  --use_2dgs \
  data:custom \
  --data.data-dir </path/to/data>
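The three training invocations above differ only in the dataset tag and the optional 2DGS flag, so the command line can be assembled programmatically. A minimal sketch (the `build_train_cmd` helper is illustrative, not part of the repo):

```python
# Illustrative helper (not in the repo): assemble the run_training.py
# command for a given dataset type, mirroring the invocations above.
def build_train_cmd(work_dir, data_dir, dataset="custom", use_2dgs=False):
    cmd = ["python", "run_training.py", "--work-dir", work_dir]
    if use_2dgs:
        cmd.append("--use_2dgs")
    # dataset tag ("nvidia", "custom", "iphone") selects the data config
    cmd += [f"data:{dataset}", "--data.data-dir", data_dir]
    return cmd

print(" ".join(build_train_cmd("out/", "data/seq", dataset="nvidia")))
```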

Installation

git clone --recurse-submodules https://github.com/vye16/shape-of-motion
cd shape-of-motion/
conda create -n som python=3.10
conda activate som

Update requirements.txt with the correct CUDA version for PyTorch and cuML, i.e., replace cu122 and cu12 with your CUDA version.
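The substitution can be done in one pass over the file. A sketch: the cu122/cu12 tags come from requirements.txt, while the cu118/cu11 targets below are examples only, so use the tags matching your local CUDA install.

```python
# Sketch: swap the CUDA wheel tags in a requirements line for a
# different CUDA version. cu118/cu11 are example targets only.
def retag(text, torch_tag="cu118", rapids_tag="cu11"):
    # Replace the longer tag first so "cu122" is not clobbered by "cu12".
    return text.replace("cu122", torch_tag).replace("cu12", rapids_tag)

print(retag("torch==2.1.2+cu122"))
print(retag("cuml-cu12==24.2.0"))
```

Applied to the whole file, e.g. `Path("requirements.txt").write_text(retag(Path("requirements.txt").read_text()))`.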


pip install -r requirements.txt
pip install git+https://github.com/nerfstudio-project/gsplat.git

Usage

Preprocessing

We depend on the third-party libraries in preproc to generate depth maps, object masks, camera estimates, and 2D tracks. Please follow the guide in the preprocessing README.

Evaluation on iPhone Dataset

First, download our processed iPhone dataset from this link. To train on a sequence, e.g., paper-windmill, run:

python run_training.py \
  --work-dir <OUTPUT_DIR> \
  --port <PORT> \
  data:iphone \
  --data.data-dir </path/to/paper-windmill/>

After optimization, the numerical results can be evaluated via:

PYTHONPATH='.' python scripts/evaluate_iphone.py \
  --data_dir </path/to/paper-windmill/> \
  --result_dir <OUTPUT_DIR> \
  --seq_names paper-windmill
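The plural `--seq_names` flag suggests the evaluation script can take several sequences at once; a sketch that builds the command for a batch (only paper-windmill appears above, other names would be placeholders, and the script still needs `PYTHONPATH='.'` in its environment):

```python
# Sketch: build the iPhone evaluation command for one or more sequences.
# Assumes --seq_names accepts a space-separated list, per the plural flag.
def build_eval_cmd(data_dir, result_dir, seq_names):
    return (["python", "scripts/evaluate_iphone.py",
             "--data_dir", data_dir,
             "--result_dir", result_dir,
             "--seq_names"] + list(seq_names))

cmd = build_eval_cmd("data/paper-windmill", "out", ["paper-windmill"])
print(" ".join(cmd))
```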

Citation

@article{som2024,
  title   = {Shape of Motion: 4D Reconstruction from a Single Video},
  author  = {Wang, Qianqian and Ye, Vickie and Gao, Hang and Zeng, Weijia and Austin, Jake and Li, Zhengqi and Kanazawa, Angjoo},
  journal = {arXiv preprint arXiv:2407.13764},
  year    = {2024}
}