Riku Murai* · Eric Dexheimer* · Andrew J. Davison
(* Equal Contribution)
Paper | Video | Project Page
Create and activate a conda environment:

```bash
conda create -n mast3r-slam python=3.11
conda activate mast3r-slam
```
Check the system's CUDA version with nvcc:

```bash
nvcc --version
```
Install PyTorch with the matching CUDA version:

```bash
# CUDA 11.8
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=11.8 -c pytorch -c nvidia
# CUDA 12.1
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia
# CUDA 12.4
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
```
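To confirm the install worked, you can check that PyTorch sees the GPU and was built against the expected CUDA version (a quick sanity check, not part of this repo):

```python
# Sanity check: PyTorch should report a CUDA build matching nvcc
# and should detect the GPU.
import torch

print("PyTorch:", torch.__version__)              # e.g. 2.5.1
print("Built against CUDA:", torch.version.cuda)  # e.g. 11.8 / 12.1 / 12.4
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```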
Clone the repo and install the dependencies.
```bash
git clone https://github.com/rmurai0610/MASt3R-SLAM.git --recursive
cd MASt3R-SLAM/
# if you've cloned the repo without --recursive, run
# git submodule update --init --recursive

pip install -e thirdparty/mast3r
pip install -e thirdparty/in3d
pip install --no-build-isolation -e .
```
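As a quick check that the editable installs succeeded, you can try importing the packages (the module names below are assumed from the repo layout and may differ in your checkout):

```python
# Hypothetical import check; adjust module names to your checkout.
import importlib

for name in ("torch", "mast3r", "mast3r_slam"):
    try:
        importlib.import_module(name)
        print(f"{name}: OK")
    except ImportError as err:
        print(f"{name}: FAILED ({err})")
```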
Set up the checkpoints for MASt3R and retrieval. The license for the checkpoints and more information on the datasets used can be found here.

```bash
mkdir -p checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth -P checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth -P checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl -P checkpoints/
```
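A small script like the following can verify that all three files landed in checkpoints/ (filenames taken from the commands above):

```python
# Check that the MASt3R and retrieval checkpoints were downloaded.
from pathlib import Path

expected = [
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl",
]
for name in expected:
    path = Path("checkpoints") / name
    print(name, f"{path.stat().st_size / 1e6:.0f} MB" if path.exists() else "MISSING")
```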
Download a TUM sequence and run the system on it:

```bash
bash ./scripts/download_tum.sh
python main.py --dataset datasets/tum/rgbd_dataset_freiburg1_room/ --config config/calib.yaml
```
Connect a RealSense camera to the PC and run:

```bash
python main.py --dataset realsense --config config/base.yaml
```
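If the camera is not picked up, a short snippet using librealsense's Python bindings can confirm the device is visible at all (assumes the pyrealsense2 package is installed; not part of this repo):

```python
# List connected RealSense devices via pyrealsense2.
import pyrealsense2 as rs

devices = rs.context().query_devices()
if len(devices) == 0:
    print("No RealSense device found")
for dev in devices:
    print(dev.get_info(rs.camera_info.name),
          dev.get_info(rs.camera_info.serial_number))
```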
Our system can process either MP4 videos or folders containing RGB images:

```bash
python main.py --dataset <path/to/video>.mp4 --config config/base.yaml
python main.py --dataset <path/to/folder> --config config/base.yaml
```
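If you need to convert between the two input modes, a short OpenCV script can dump an MP4 into a folder of images (a utility sketch with example paths; requires opencv-python):

```python
# Extract every frame of a video into numbered PNGs.
import cv2
from pathlib import Path

video, out_dir = "input.mp4", Path("frames")  # example paths
out_dir.mkdir(exist_ok=True)

cap = cv2.VideoCapture(video)
count = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(str(out_dir / f"{count:06d}.png"), frame)
    count += 1
cap.release()
print(f"Wrote {count} frames to {out_dir}/")
```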
If the calibration parameters are known, you can specify them in intrinsics.yaml:

```bash
python main.py --dataset <path/to/video>.mp4 --config config/base.yaml --calib config/intrinsics.yaml
python main.py --dataset <path/to/folder> --config config/base.yaml --calib config/intrinsics.yaml
```
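If you only know the camera's field of view rather than calibrated focal lengths, pinhole intrinsics can be derived from it (standard pinhole geometry with example values; the exact YAML schema is defined by config/intrinsics.yaml in the repo):

```python
# Pinhole model: fx = (W/2) / tan(hfov/2); principal point at the image centre.
import math

W, H = 640, 480   # image resolution (example)
hfov_deg = 90.0   # horizontal field of view (example)

fx = (W / 2) / math.tan(math.radians(hfov_deg) / 2)
fy = fx           # square-pixel assumption
cx, cy = W / 2, H / 2
print(f"fx={fx:.1f} fy={fy:.1f} cx={cx:.1f} cy={cy:.1f}")
```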
Download the evaluation datasets:

```bash
bash ./scripts/download_tum.sh
bash ./scripts/download_7_scenes.sh
bash ./scripts/download_euroc.sh
bash ./scripts/download_eth3d.sh
```
All evaluation scripts run our system in a single-threaded, headless mode. Evaluations can be run with and without calibration:

```bash
bash ./scripts/eval_tum.sh
bash ./scripts/eval_tum.sh --no-calib
bash ./scripts/eval_7_scenes.sh
bash ./scripts/eval_7_scenes.sh --no-calib
bash ./scripts/eval_euroc.sh
bash ./scripts/eval_euroc.sh --no-calib
bash ./scripts/eval_eth3d.sh
```
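For reference, the standard trajectory metric on these benchmarks is the absolute trajectory error (ATE) RMSE: the estimated trajectory is rigidly aligned to ground truth, then the RMSE of the translational residuals is taken. A minimal numpy sketch of that metric (an illustration, not the repo's evaluation code):

```python
# Minimal ATE-RMSE sketch: Kabsch-align estimated positions to ground truth,
# then take the RMSE of the translational residuals.
import numpy as np

def ate_rmse(est, gt):
    """est, gt: (N, 3) arrays of time-associated camera positions."""
    E, G = est - est.mean(0), gt - gt.mean(0)
    U, _, Vt = np.linalg.svd(E.T @ G)          # SVD of the cross-covariance
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T    # best-fit rotation est -> gt
    residuals = (R @ E.T).T - G
    return np.sqrt((residuals ** 2).sum(axis=1).mean())
```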
After developing this multi-processing version, there may be minor differences between the released code and the results reported in the paper. We ran all our experiments on an RTX 4090, and performance may differ when running on a different GPU.
We sincerely thank the developers and contributors of the many open-source projects that our code is built upon.
If you found this code/work useful in your own research, please consider citing the following:
```bibtex
@article{murai2024_mast3rslam,
  title={{MASt3R-SLAM}: Real-Time Dense {SLAM} with {3D} Reconstruction Priors},
  author={Murai, Riku and Dexheimer, Eric and Davison, Andrew J.},
  journal={arXiv preprint},
  year={2024},
}
```