Authors: Iury Cleveston, Esther L. Colombini
Paper: RAM-VO: Less is more in Visual Odometry
Thesis: RAM-VO: A Recurrent Attentional Model for Visual Odometry
Demo: Train 00, 02, 04, 05, 06, 08 | Test 03, 07, 10
Building vehicles capable of operating without human supervision requires determining the agent's pose. Visual Odometry (VO) algorithms estimate the egomotion using only visual changes from the input images. The most recent VO methods rely extensively on deep-learning techniques based on convolutional neural networks (CNNs), which adds substantial cost when dealing with high-resolution images. Furthermore, in VO tasks, more input data does not necessarily mean a better prediction; on the contrary, the architecture may need to filter out useless information. Therefore, implementing computationally efficient and lightweight architectures is essential. In this work, we propose RAM-VO, an extension of the Recurrent Attention Model (RAM) for visual odometry tasks. RAM-VO improves the visual and temporal representation of information and implements Proximal Policy Optimization (PPO) to learn robust policies. The results indicate that RAM-VO can perform regressions with six degrees of freedom from monocular input images using approximately 3 million parameters. In addition, experiments on the KITTI dataset demonstrate that RAM-VO achieves competitive results using only 5.7% of the available visual information.
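RAM-VO observes only small multi-scale glimpses of each frame instead of the full image, which is how it gets away with a fraction of the visual information. The retina-like sampling idea can be sketched roughly as follows (a hypothetical illustration, not the repo's actual retina submodule; the function name, patch size, and number of scales are assumptions):

```python
import numpy as np

def extract_glimpse(image, center, size=8, n_scales=3):
    """Multi-scale retina-like glimpse (illustrative sketch only):
    crop progressively larger patches around `center` and downsample
    each one to a common `size` x `size` resolution."""
    cy, cx = center
    patches = []
    for s in range(n_scales):
        half = (size * 2**s) // 2
        # pad so crops near the image border remain valid
        padded = np.pad(image, half, mode="edge")
        patch = padded[cy:cy + 2 * half, cx:cx + 2 * half]
        # naive box downsampling back to the base resolution
        factor = 2**s
        patch = patch.reshape(size, factor, size, factor).mean(axis=(1, 3))
        patches.append(patch)
    return np.stack(patches)  # shape: (n_scales, size, size)
```

The coarse outer scales give cheap peripheral context while the innermost patch keeps full resolution, so the network sees only a few hundred pixels per step instead of the whole frame.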
The contributions of this work are:
- A lightweight VO method that selects the important input information via attentional mechanisms;
- The first visual odometry architecture that implements reinforcement learning in part of the pipeline;
- Several experiments on KITTI sequences demonstrating the validity and efficiency of RAM-VO.
RAM-VO requires Python 3.8. CUDA is not required but is recommended.
git clone
pip install -r requirements.txt
To train a new model:
python main.py
This command generates a folder <out_exec_folder> inside out/ containing the model and the training data.
To test a trained model on a specific sequence:
python main.py --test <out_exec_folder> --dataset 'kitti' --test_seq <sequence>
To generate results, such as metrics, trajectories, and plots:
./gen_results.zsh <out_exec_folder>
There is no need to run the scripts below directly; gen_results.zsh already calls them.
To generate the trajectories and metrics (RPE, ATE):
python tools/gen_metrics.py --data_dir <out_exec_folder>
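The ATE reported here is, in essence, the RMSE of the position residuals after rigidly aligning the estimated trajectory to the ground truth. A minimal sketch of that computation (assuming N x 3 position arrays and Umeyama/Kabsch alignment; gen_metrics.py may differ in its exact details):

```python
import numpy as np

def ate_rmse(gt, est):
    """Absolute Trajectory Error (sketch): align `est` to `gt` with a
    best-fit rotation and translation (Kabsch), then return the RMSE
    of the remaining position residuals."""
    gt, est = np.asarray(gt, float), np.asarray(est, float)
    mu_g, mu_e = gt.mean(0), est.mean(0)
    H = (est - mu_e).T @ (gt - mu_g)                # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # correct an improper rotation (reflection) if one appears
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T                              # best-fit rotation
    t = mu_g - R @ mu_e
    aligned = est @ R.T + t
    return np.sqrt(((aligned - gt) ** 2).sum(1).mean())
```

Because of the alignment step, ATE measures global trajectory consistency, while RPE compares relative motions over fixed frame intervals and is more sensitive to local drift.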
To plot the observations (glimpses):
python tools/plot_glimpse.py --dir <out_exec_folder> --epoch test
To plot the loss:
python tools/plot_loss.py --data_dir <out_exec_folder> --minibatch false
To plot the heatmap of observations:
python tools/plot_heatmap.py --dir <out_exec_folder> --glimpse <glimpse_number> --train false
To plot different trajectory predictions in the same figure:
python tools/plot_all_trajectories.py --data_dir <sequence>
To test the retina submodule:
python tools/test_retina.py
To extract the optical flow:
python tools/extract_optical_flow.py --seq <sequence> --method <method:(sparse|dense)>
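As a rough illustration of the principle behind the flow extraction, a single global Lucas-Kanade least-squares step can recover one displacement between two frames (a hypothetical minimal sketch, not the repo's sparse/dense implementations):

```python
import numpy as np

def global_flow_lk(prev, curr):
    """Estimate one global (u, v) pixel displacement between two
    grayscale frames via a single Lucas-Kanade least-squares step.
    Illustrative only; real sparse/dense flow is computed per pixel
    or per feature, usually with pyramids."""
    prev, curr = prev.astype(float), curr.astype(float)
    Ix = np.gradient(prev, axis=1)   # horizontal spatial derivative
    Iy = np.gradient(prev, axis=0)   # vertical spatial derivative
    It = curr - prev                 # temporal derivative
    # normal equations of the brightness-constancy constraint
    A = np.array([[(Ix * Ix).sum(), (Ix * Iy).sum()],
                  [(Ix * Iy).sum(), (Iy * Iy).sum()]])
    b = -np.array([(Ix * It).sum(), (Iy * It).sum()])
    return np.linalg.solve(A, b)     # (u, v) in pixels
```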
If you use this code and data in your research, please cite our arXiv paper:
@article{cleveston2021ram,
  title={RAM-VO: Less is more in Visual Odometry},
  author={Cleveston, Iury and Colombini, Esther L},
  journal={arXiv preprint arXiv:2107.02974},
  year={2021}
}