This repository is an implementation of Stereo Odometry based on careful Feature selection and Tracking (SOFT), developed as part of the course project for Probabilistic Mobile Robotics.
NOTE: Demo Video Run on KITTI Dataset City Sequence 01 is available here.
Pre-requisites: Tested on a PC with MATLAB R2014a installed
- Clone the repository using the following command:

```bash
git clone https://github.com/Mayankm96/Stereo-Odometry-SOFT.git
```
- Import the dataset to the folder `code/data`. In case you wish to use the KITTI Dataset, such as the Residential dataset, the following commands might be useful:
```bash
cd PATH/TO/Stereo-Odometry-SOFT

## For Residential Sequence: 61 (2011_09_26_drive_0061)
# synced+rectified data
wget -c http://kitti.is.tue.mpg.de/kitti/raw_data/2011_09_26_drive_0061/2011_09_26_drive_0061_sync.zip -P code/data
# calib.txt
wget -c http://kitti.is.tue.mpg.de/kitti/raw_data/2011_09_26_calib.zip -P code/data
```
NOTE: The implementation here uses synced and rectified stereo images as inputs.
- Change the corresponding parameters in the configuration directory `code/config`:
  - `configFile1.m`: Set the path to the dataset folders where the images are stored
  - `configFile2.m`: Set the camera parameters from the downloaded calibration file
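For illustration, the dataset paths in `configFile1.m` might look something like the following for the sequence downloaded above (the variable names here are placeholders, not necessarily the ones the config file defines):

```matlab
% Hypothetical contents of configFile1.m -- variable names are illustrative.
% KITTI raw sequences store left/right grayscale images in image_00 and image_01.
data_params.path = 'code/data/2011_09_26/2011_09_26_drive_0061_sync';
data_params.left_dir  = fullfile(data_params.path, 'image_00', 'data');  % left camera images
data_params.right_dir = fullfile(data_params.path, 'image_01', 'data');  % right camera images
```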
- Run the script `code/src/main.m` to get a plot of the estimated odometry.
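From the MATLAB prompt, this amounts to:

```matlab
cd code/src   % move to the source directory
main          % runs the pipeline and plots the estimated trajectory
```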
In this section, we split the keypoint detection and matching pipeline into four separate stages:
- feature detection (extraction) stage: each image is searched for locations that are likely to match well in other images
- feature description stage: each region around detected keypoint locations is converted into a more compact and stable (invariant) descriptor that can be matched against other descriptors
- feature matching stage: efficiently searching for likely matching candidates in other images
- feature tracking stage: alternative to the third stage that only searches a small neighborhood around each detected feature and is therefore more suitable for video processing
This part of the algorithm is concerned with finding the features used for egomotion estimation. It is based on the process described in the paper here.
This process can be broken down into following steps:
- Extraction of corner-like features in the pair of images at instant t: blob and corner masks are applied over the input image
- Non-maximum and non-minimum suppression is performed on the filtered images, producing feature candidates in one of the following classes: blob max, blob min, corner max, and corner min
- Correspondences between corners are found using the Sum of Absolute Differences (SAD) over a sparse set of pixels. That is, given two feature points, we simply compare 11x11 block windows of horizontal and vertical Sobel filter responses to each other using the SAD error metric. To speed up matching, we quantize the Sobel responses to 8 bits and sum the differences over a sparse set of 16 locations instead of summing over the whole block window (a sketch of this score is given after this list)
- The above step is susceptible to outliers, so circular matching is used to reject them
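A minimal MATLAB sketch of the sparse-SAD score described above, assuming features are given as [row, col] positions, 8-bit quantized Sobel responses have been precomputed, and `pattern` holds 16 offsets inside an 11x11 window (the helper name and the exact offset pattern are assumptions, not the repository's code):

```matlab
function score = sparseSAD(sobelX1, sobelY1, sobelX2, sobelY2, p1, p2, pattern)
% sobelX*/sobelY* : uint8-quantized horizontal/vertical Sobel responses
% p1, p2          : [row, col] of the two candidate features
% pattern         : 16x2 sparse offsets within an 11x11 window (assumed)
    idx1 = bsxfun(@plus, pattern, p1);                   % sample positions around p1
    idx2 = bsxfun(@plus, pattern, p2);                   % sample positions around p2
    lin1 = sub2ind(size(sobelX1), idx1(:,1), idx1(:,2)); % linear indices, image 1
    lin2 = sub2ind(size(sobelX2), idx2(:,1), idx2(:,2)); % linear indices, image 2
    % SAD over both Sobel responses, summed only at the sparse locations
    score = sum(abs(double(sobelX1(lin1)) - double(sobelX2(lin2)))) + ...
            sum(abs(double(sobelY1(lin1)) - double(sobelY2(lin2))));
end
```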
In this step, we carefully select only the strongest features in the image by means of bucketing. Each bucket is a 50 x 50 pixel region of the image. Bucketing helps in maintaining a uniform distribution of feature points across the image.
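A sketch of how such bucketing could be done in MATLAB (the helper name and the one-feature-per-bucket policy are illustrative assumptions):

```matlab
function kept = bucketFeatures(pts, scores, imgSize, bucketSize)
% pts     : Nx2 [x, y] feature locations
% scores  : Nx1 feature strengths
% imgSize : [rows, cols] of the image
    if nargin < 4, bucketSize = 50; end                 % 50x50 pixel buckets
    bx = floor(pts(:,1) / bucketSize);                  % bucket column index
    by = floor(pts(:,2) / bucketSize);                  % bucket row index
    nCols = ceil(imgSize(2) / bucketSize);
    bucketId = by * nCols + bx;                         % unique id per bucket
    kept = false(size(scores));
    for id = unique(bucketId)'                          % visit occupied buckets
        inBucket = find(bucketId == id);
        [~, best] = max(scores(inBucket));              % strongest feature wins
        kept(inBucket(best)) = true;
    end
end
```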
In the above image, the crosses are all the features detected using the minimum eigenvalue algorithm. The red and green crosses are the features selected after circular matching, and finally, the green crosses are the features selected through bucketing.
In our implementation, we have used the Kanade-Lucas-Tomasi (KLT) algorithm to track the features in the left camera images at time instant t.
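In MATLAB, KLT tracking is available through `vision.PointTracker` from the Computer Vision System Toolbox; a minimal sketch (variable names are illustrative):

```matlab
% Track features from the previous left frame into the current one using KLT.
tracker = vision.PointTracker('MaxBidirectionalError', 1);  % drop unreliable tracks
initialize(tracker, pointsPrev, frameLeftPrev);             % features at time t-1
[pointsCurr, validity] = step(tracker, frameLeftCurr);      % positions at time t
trackedPrev = pointsPrev(validity, :);                      % keep matched pairs only
trackedCurr = pointsCurr(validity, :);
```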
We have used Nister's Five-Point Algorithm in conjunction with RANSAC to find the best estimate of the rotation matrix. The procedure is similar to that used to estimate structure from motion with monocular vision.
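The RANSAC loop around the five-point solver has roughly the following shape (a sketch under assumptions: `fivePointSolver` and `sampsonDistance` are hypothetical helpers, and `x1`, `x2` are matched points in normalized image coordinates):

```matlab
numIters  = 500;      % assumed iteration budget
threshold = 1e-4;     % assumed inlier threshold on the epipolar error
bestInliers = [];
for iter = 1:numIters
    sample = randperm(size(x1, 1), 5);                    % minimal 5-point sample
    Es = fivePointSolver(x1(sample, :), x2(sample, :));   % hypothetical: candidate E's
    for k = 1:numel(Es)
        d = sampsonDistance(Es{k}, x1, x2);               % hypothetical: per-match error
        inliers = find(d < threshold);
        if numel(inliers) > numel(bestInliers)            % keep the best consensus set
            bestInliers = inliers;
            bestE = Es{k};
        end
    end
end
% bestE is then decomposed to recover the rotation matrix R.
```

Note that newer MATLAB releases bundle this combination in `estimateEssentialMatrix`, which pairs a five-point solver with an M-estimator sample consensus loop.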
We build a 3D point cloud using triangulation from the previous view (i.e., at time t-1) and reproject it onto the current image frame (i.e., at time t). The reprojection error function is then minimized with respect to translation to estimate the translation vector.
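One way to write this minimization in MATLAB uses the derivative-free `fminsearch`; the function below is a sketch under assumptions (the helper names are hypothetical; `K` is the intrinsic matrix, `R` the rotation from the previous step, `X` the 3D points triangulated at t-1, and `x` the matched 2D points at t):

```matlab
function tEst = estimateTranslation(K, R, X, x)
% Minimize the sum of squared reprojection errors over the translation t.
    cost = @(t) sum(reprojError(K, R, t, X, x).^2);
    tEst = fminsearch(cost, zeros(3, 1));            % start from zero translation
end

function r = reprojError(K, R, t, X, x)
% Project the 3D points into the current frame and return per-point pixel error.
    proj = K * (R * X' + repmat(t, 1, size(X, 1)));  % 3xN homogeneous projections
    uv   = proj(1:2, :) ./ repmat(proj(3, :), 2, 1); % perspective divide
    r    = sqrt(sum((uv' - x).^2, 2));               % Euclidean reprojection error
end
```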