This is an unofficial C++ implementation of the ECCV 2018 paper: Deep virtual stereo odometry: Leveraging deep depth prediction for monocular direct sparse odometry (DVSO).
This implementation is intended for the use of the virtual stereo optimization in our ICRA 2022 paper: Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World.
If you find this project useful in your research, please consider citing the following papers:
@article{engel2017direct,
title={Direct sparse odometry},
author={Engel, Jakob and Koltun, Vladlen and Cremers, Daniel},
journal={IEEE transactions on pattern analysis and machine intelligence},
volume={40},
number={3},
pages={611--625},
year={2017},
publisher={IEEE}
}
@inproceedings{yang2018deep,
title={Deep virtual stereo odometry: Leveraging deep depth prediction for monocular direct sparse odometry},
author={Yang, Nan and Wang, Rui and Stuckler, Jorg and Cremers, Daniel},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
pages={817--833},
year={2018}
}
@article{zhang2022towards,
title={Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World},
author={Zhang, Sen and Zhang, Jing and Tao, Dacheng},
journal={arXiv preprint arXiv:2203.05712},
year={2022}
}
This project is built upon the DSO codebase. The borrowed code is licensed under the original DSO license. Special thanks to J. Engel, V. Koltun, and D. Cremers for this great work. For more information, see https://vision.in.tum.de/dso.
- Please follow the instructions in DSO to install necessary packages
- Important change for Pangolin installation
- The most recent version of Pangolin is no longer compatible.
- Please use the older version of Pangolin:
git reset --hard 86eb4975fc
- Build
cd dvso
mkdir build
cd build
cmake ..
make -j4
You can run DVSO using the command:
bin/dvso_dataset \
files=XXXXX/seq_XX/images \
calib=XXXXX/seq_XX/camera.txt \
disps_left=XXXXX/seq_XX/disparities_pp_left.npy \
disps_right=XXXXX/seq_XX/disparities_pp_right.npy \
mode=1
Please refer to DSO for the usage of mode, files, and calib.
- disps_left: path to the .npy file of the disparities of the left images. The left disparities are used for depth initialization.
- disps_right: path to the .npy file of the disparities of the right images. The right disparities are used for the virtual stereo optimization.
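Since the left disparities are used for depth initialization, it can help to see how a disparity maps to a metric depth. The following is a minimal sketch of the standard stereo relation depth = f * b / d, not the code used in this repository. It assumes that the disparity is either given in pixels or, as in monodepth's raw network output, as a fraction of the image width. Which convention your .npy files follow is something to verify against the scripts that generated them. The function name and the numbers in the usage line are illustrative only.

```python
def disparity_to_depth(disp, focal_px, baseline_m, image_width=None):
    """Convert one disparity value to a metric depth via depth = f * b / d.

    If image_width is given, disp is treated as a fraction of the image
    width (an assumption about the file contents) and is scaled to pixels
    first. Otherwise disp is taken to be in pixels already.
    """
    disp_px = disp * image_width if image_width is not None else disp
    if disp_px <= 0.0:
        # Zero (or invalid) disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disp_px


if __name__ == "__main__":
    # Illustrative values only: focal length in pixels, baseline in meters.
    print(disparity_to_depth(0.05, focal_px=720.0, baseline_m=0.54, image_width=1000))
```

The baseline here is that of the stereo rig the disparity network was trained on, which is what allows the predicted disparities to carry metric scale.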
- For monodepth, which trains both the left and the right disparity networks, please check scripts/monodepth_dso_kitti.py for getting the disp_left/right.npy files.
- For monodepth2, which only trains the left disparity network, please (1) generate the left disparity .npy file using the script evaluate_depth.py in monodepth2, and then (2) check scripts/warp_right_disp.py for getting the forward-warped right disparity .npy file.
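To illustrate what forward warping a left disparity map into the right view means, here is a minimal pure-Python sketch. It is not the code in scripts/warp_right_disp.py, and the function names are hypothetical. It assumes rectified images and disparities in pixels, so a left pixel at column x_l with disparity d appears in the right image at column x_l - d. Columns that receive no value remain 0, which is one way the warped holes can arise.

```python
def forward_warp_row(disp_left_row):
    """Forward-warp one row of a left disparity map into the right view.

    Disparities are in pixels. Unfilled target columns (holes) stay 0.
    """
    width = len(disp_left_row)
    disp_right_row = [0.0] * width
    for x_l, d in enumerate(disp_left_row):
        if d <= 0.0:
            continue  # skip invalid disparities
        x_r = int(round(x_l - d))  # column of the same point in the right image
        if 0 <= x_r < width and d > disp_right_row[x_r]:
            # On collisions keep the larger disparity, i.e. the closer surface.
            disp_right_row[x_r] = float(d)
    return disp_right_row


def forward_warp(disp_left):
    """Apply the row-wise warp to a 2D disparity map given as a list of rows."""
    return [forward_warp_row(row) for row in disp_left]
```

Keeping the larger disparity on collisions is a simple occlusion rule. A real implementation would typically be vectorized and may handle holes and sub-pixel positions differently.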
We follow the routine of DSO, so the DSO options can still be used here. The options related to the virtual stereo functionality are listed below:
You may need to modify wStereo, scaleEnergyLeftTHR, and scaleWJI2SumTHR on your own dataset:
- wStereo=1 (by default): the weight for the virtual stereo residual.
- scaleEnergyLeftTHR=2 (by default): a threshold used for determining the outlier status. The smaller this value is, the more outliers are allowed.
- scaleWJI2SumTHR=2 (by default): a threshold used for determining the outlier status. The larger this value is, the more outliers are allowed.
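When tuning these options on a new dataset, it can be convenient to generate the command line programmatically. The sketch below only assembles the key=value arguments described in this README. The helper name build_dvso_command, the sequence path, and the swept values are hypothetical and are not recommendations.

```python
def build_dvso_command(seq_dir, binary="bin/dvso_dataset", mode=1, **options):
    """Return the argument list for one DVSO run on a sequence directory.

    Extra keyword arguments are appended as key=value options, e.g.
    wStereo=2 becomes "wStereo=2".
    """
    args = [
        binary,
        f"files={seq_dir}/images",
        f"calib={seq_dir}/camera.txt",
        f"disps_left={seq_dir}/disparities_pp_left.npy",
        f"disps_right={seq_dir}/disparities_pp_right.npy",
        f"mode={mode}",
    ]
    # Sort for a reproducible argument order.
    args += [f"{key}={value}" for key, value in sorted(options.items())]
    return args


if __name__ == "__main__":
    # Print one command per candidate weight (values chosen only as an example).
    for w in (0.5, 1, 2):
        print(" ".join(build_dvso_command("XXXXX/seq_XX", wStereo=w)))
```

The returned list can be passed directly to subprocess.run if you prefer to launch the runs from Python.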
The following options are used primarily for debugging and development. We suggest keeping the default values:
- judgeHW=1 (by default): only take the square root of the Huber energy if its value is smaller than 1. You can disable this functionality by setting judgeHW=0.
- maskWarpGrad=1 (by default): set the gradient of warped holes to 0. You can disable this functionality by setting maskWarpGrad=0.
- checkWarpValid=1 (by default): check whether a point warping is valid before adding the virtual stereo residual. You can disable this functionality by setting checkWarpValid=0.
- useVS=1 (by default): use the virtual stereo residual. You can disable this functionality by setting useVS=0.
- wStereoPosFlag=Before (by default): where to multiply the Huber energy by wStereo, either before or after taking the square root of the Huber energy. You can optionally use wStereoPosFlag=After.
- wCorrectedFlag=Ori (by default): use the original Huber loss implementation in DSO, which differs slightly from the definition of the Huber loss. You can optionally use wCorrectedFlag=Corr to try the implementation based on the definition of the Huber loss.
- wGradFlag=Hit (by default): use the original implementation of pixel weighting in DSO. You can optionally use wGradFlag=Grad to replace the image gradients used in pixel weighting with counterparts that further incorporate the disparity map gradients.
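The difference between the two wStereoPosFlag settings can be shown with a small numeric sketch. This is one reading of the option description above, not the repository's C++ code: with Before the weight is applied inside the square root, and with After it is applied outside. The function name is hypothetical.

```python
import math


def weighted_stereo_term(huber_energy, w_stereo=1.0, pos_flag="Before"):
    """Combine a non-negative Huber energy with the stereo weight.

    "Before": multiply by w_stereo, then take the square root.
    "After":  take the square root, then multiply by w_stereo.
    """
    if huber_energy < 0.0 or w_stereo < 0.0:
        raise ValueError("energy and weight must be non-negative")
    if pos_flag == "Before":
        return math.sqrt(w_stereo * huber_energy)
    if pos_flag == "After":
        return w_stereo * math.sqrt(huber_energy)
    raise ValueError("pos_flag must be 'Before' or 'After'")
```

Under this reading, the squared term scales linearly with wStereo for Before and quadratically for After, and the two settings coincide at the default wStereo=1.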
- format_kitti.py: used for transforming DVSO / DSO results (with ids rather than timestamps) to the KITTI format for evaluation
- You may need to rename the DSO result file to result_seq_suf.txt before running this script, e.g.,
mv result.txt result_$seq.$suf.txt
- You may need to change some absolute paths inside this script to your own paths
- This script will generate reformatted results that are compatible with evo and Zhan's evaluation code. Please follow their instructions on evaluating the results.
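For orientation, the core of such a format conversion can be sketched as follows. This is not the code in format_kitti.py. It assumes each result line has the layout "id tx ty tz qx qy qz qw" (translation followed by a unit quaternion) and produces the 12 values of a row-major 3x4 [R|t] matrix, which is the per-line layout of KITTI pose files. The function name is hypothetical.

```python
import math


def result_line_to_kitti_row(line):
    """Convert 'id tx ty tz qx qy qz qw' to 12 row-major [R|t] values."""
    vals = [float(v) for v in line.split()]
    if len(vals) != 8:
        raise ValueError("expected 8 values: id tx ty tz qx qy qz qw")
    _, tx, ty, tz, qx, qy, qz, qw = vals
    # Normalize the quaternion to guard against rounding in the text file.
    n = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    if n == 0.0:
        raise ValueError("zero-norm quaternion")
    qx, qy, qz, qw = qx / n, qy / n, qz / n, qw / n
    # Standard unit-quaternion to rotation-matrix conversion.
    return [
        1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw), tx,
        2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw), ty,
        2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy), tz,
    ]


if __name__ == "__main__":
    row = result_line_to_kitti_row("0 1.0 2.0 3.0 0 0 0 1")
    print(" ".join(f"{v:.6e}" for v in row))
```

Evaluation tools that use the KITTI format generally expect one pose per line in frame order, so any frames missing from the result file need to be handled before comparing against the ground truth.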
We do not re-implement the depth networks proposed in the original DVSO paper. Instead, we just use the disparities predicted by monodepth, which can already achieve quite satisfactory results: