Qianqian Wang<sup>1,2*</sup>, Vickie Ye<sup>1*</sup>, Hang Gao<sup>1*</sup>, Weijia Zeng<sup>1*</sup>, Jake Austin<sup>1</sup>, Zhengqi Li<sup>2</sup>, Angjoo Kanazawa<sup>1</sup>

<sup>1</sup>UC Berkeley, <sup>2</sup>Google Research

<sup>*</sup>Equal contribution
We provide preprocessed versions of the Nvidia dataset and our custom dataset, which can be found here. We used MegaSaM to obtain cameras and depth maps for the custom dataset.
To train on the Nvidia dataset:
```bash
python run_training.py \
  --work-dir <OUTPUT_DIR> \
  data:nvidia \
  --data.data-dir </path/to/data>
```
To train on a custom dataset:
```bash
python run_training.py \
  --work-dir <OUTPUT_DIR> \
  data:custom \
  --data.data-dir </path/to/data>
```
To get better scene geometry, we use 2D Gaussian Splatting:
```bash
python run_training.py \
  --work-dir <OUTPUT_DIR> \
  --use_2dgs \
  data:custom \
  --data.data-dir </path/to/data>
```
```bash
git clone --recurse-submodules https://github.com/vye16/shape-of-motion
cd shape-of-motion/
conda create -n som python=3.10
conda activate som
```
Update `requirements.txt` with the correct CUDA version for PyTorch and cuML, i.e., replace `cu122` and `cu12` with the tags matching your CUDA version.
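For example, on a machine with CUDA 11.8 the replacement might look like the following (a minimal sketch; the exact wheel tags depend on your setup):

```bash
# Hypothetical example for CUDA 11.8: swap the CUDA wheel tags in place.
# Replace cu122 (PyTorch) before the broader cu12 pattern (cuML),
# so the second substitution does not clobber the first.
sed -i -e 's/cu122/cu118/g' -e 's/cu12/cu11/g' requirements.txt
```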
```bash
pip install -r requirements.txt
pip install git+https://github.com/nerfstudio-project/gsplat.git
```
We depend on the third-party libraries in `preproc/` to generate depth maps, object masks, camera estimates, and 2D tracks. Please follow the guide in the preprocessing README.
First, download our processed iPhone dataset from this link. To train on a sequence, e.g., `paper-windmill`, run:
```bash
python run_training.py \
  --work-dir <OUTPUT_DIR> \
  --port <PORT> \
  data:iphone \
  --data.data-dir </path/to/paper-windmill/>
```
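For instance, a concrete invocation might look like this (the paths and port number are illustrative, not prescribed):

```bash
# Example run on the paper-windmill sequence; adjust paths to your setup.
python run_training.py \
  --work-dir ./outputs/paper-windmill \
  --port 6007 \
  data:iphone \
  --data.data-dir ./datasets/iphone/paper-windmill/
```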
After optimization, the numerical results can be evaluated via:
```bash
PYTHONPATH='.' python scripts/evaluate_iphone.py \
  --data_dir </path/to/paper-windmill/> \
  --result_dir <OUTPUT_DIR> \
  --seq_names paper-windmill
```
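To evaluate several sequences in one go, you can loop over them with the same script; the sequence names and directory layout below are only examples:

```bash
# Hypothetical batch evaluation; sequence names and paths are illustrative.
for seq in paper-windmill apple block; do
  PYTHONPATH='.' python scripts/evaluate_iphone.py \
    --data_dir "/path/to/iphone/${seq}" \
    --result_dir "outputs/${seq}" \
    --seq_names "${seq}"
done
```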
If you find this work useful, please consider citing:

```bibtex
@article{som2024,
  title   = {Shape of Motion: 4D Reconstruction from a Single Video},
  author  = {Wang, Qianqian and Ye, Vickie and Gao, Hang and Zeng, Weijia and Austin, Jake and Li, Zhengqi and Kanazawa, Angjoo},
  journal = {arXiv preprint arXiv:2407.13764},
  year    = {2024}
}
```