This is an implementation of piecewise CRF training for semantic segmentation, based on the work of Lin et al. The implemented model consists of three parts:
- A neural network used for learning unary and binary potentials
- A contextual conditional random field that combines the learnt unary and binary potentials
- A fully connected Gaussian conditional random field used for segmentation postprocessing
The implemented system is evaluated on two publicly available datasets: Cityscapes and KITTI. For more information about the implementation and the results, see the thesis.
This section explains the usage pipeline for semantic segmentation. The pipeline consists of several steps, each described below. All scripts are documented; for details about a specific script and its arguments, look into the comments inside it or the readme files in the appropriate subdirectories of this project.
IMPORTANT: In order to run the piecewisecrf scripts, set the PYTHONPATH environment variable to the project (repository) path.
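For example (the path below is a placeholder for wherever you cloned the repository):

```shell
# Point PYTHONPATH at the repository root so `piecewisecrf` imports resolve.
# Replace the path with your actual checkout location.
export PYTHONPATH=/path/to/piecewise-crf-repo
```

From inside the repository root, `export PYTHONPATH=$(pwd)` works as well.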
The first step is to generate all the necessary files used for training and validation.
- Download the datasets (Cityscapes or KITTI). For the Cityscapes dataset, download the ground truth labels as well as the left images, and extract the downloaded archives. For KITTI, rename the `valid` folder to `val`.
- Run `piecewisecrf/datasets/cityscapes/train_validation_split.py` in order to generate the validation dataset. For KITTI, use `piecewisecrf/datasets/kitti/train_validation_split.py`.
- Configure the `piecewisecrf/config/prefs.py` file. Set the `dataset_dir`, `save_dir`, `img_width`, `img_height` and `img_depth` flags.
- Run `piecewisecrf/datasets/cityscapes/prepare_dataset_files.py` in order to generate the files necessary for TensorFlow records generation as well as evaluation. For KITTI, use `piecewisecrf/datasets/kitti/prepare_dataset_files.py`.
- Generate the TensorFlow records used for training and validation by running `piecewisecrf/datasets/prepare_tfrecords.py`. Use its destination directory to configure the `train_records_dir`, `val_records_dir` and `test_records_dir` flags in `piecewisecrf/config/prefs.py`.
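As an illustration, the flags above might be set like the following sketch. All paths and resolutions are placeholder values, and the real `prefs.py` may declare flags through a framework rather than plain assignments:

```python
# Hypothetical sketch of the flag values in piecewisecrf/config/prefs.py.
# Every path and number below is a placeholder to adapt to your setup.
dataset_dir = '/data/cityscapes'         # extracted dataset root
save_dir = '/data/cityscapes/generated'  # output of the preparation scripts
img_width = 1024                         # network input width in pixels
img_height = 512                         # network input height in pixels
img_depth = 3                            # number of image channels (RGB)

# Set after running prepare_tfrecords.py, pointing at its destination directory.
train_records_dir = save_dir + '/tfrecords/train'
val_records_dir = save_dir + '/tfrecords/val'
test_records_dir = save_dir + '/tfrecords/test'
```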
The next step is training the model.

- Prepare the numpy file with the VGG weights (look at the readme in `caffe-tensorflow`).
- Configure `piecewisecrf/config/prefs.py` (`vgg_init_file`, `train_dir` and all the other training parameters).
- Run `piecewisecrf/train.py`.
To evaluate a trained model:

- Configure `piecewisecrf/config/prefs.py` if not already done.
- Run `piecewisecrf/eval.py`.
To generate predictions with a trained model:

- Configure `piecewisecrf/config/prefs.py` if not already done.
- Run `piecewisecrf/forward_pass.py`. This will generate predictions (in small and original resolution) as well as the unary potentials used by the fully connected CRF.
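The exact binary layout of the generated files is defined by `forward_pass.py` and the dense CRF executable. Purely as an illustration of what a dense unary-potential volume is, an H x W x C float array can be round-tripped through a raw binary file like this:

```python
import os
import tempfile

import numpy as np

# Illustration only: unary potentials form an H x W x C float array,
# where C is the number of classes (Cityscapes uses 19 evaluation classes).
height, width, num_classes = 4, 6, 19
unaries = np.random.rand(height, width, num_classes).astype(np.float32)

path = os.path.join(tempfile.gettempdir(), 'unaries.bin')
unaries.tofile(path)  # raw float32 values, no header; shape must be known

restored = np.fromfile(path, dtype=np.float32).reshape(height, width, num_classes)
assert np.array_equal(unaries, restored)
```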
The parameters of the fully connected CRF are tuned by grid search on the validation dataset.
- Build the dense CRF executable (look at the readme in `densecrf`).
- If necessary, pick a subset of the validation dataset by using `tools/validation_set_picker.py` and `tools/copy_files.py`.
- Configure the `tools/grid_config.py` file (grid search parameters).
- Start the grid search by running `tools/grid_search.py`.
- Evaluate the grid search results by running `tools/evaluate_grid.py`. This will give you the optimal CRF parameters on the validation dataset.
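Conceptually, the grid search enumerates every combination of the CRF parameters defined in `tools/grid_config.py` and keeps the best-scoring one. A minimal sketch — the parameter names follow the Gaussian CRF kernel convention, but the grid values and the scoring function are invented stand-ins:

```python
import itertools

# Hypothetical parameter grid; the real names and ranges live in tools/grid_config.py.
grid = {
    'theta_alpha': [40, 60, 80],  # appearance kernel spatial std dev
    'theta_beta': [3, 5, 10],     # appearance kernel color std dev
    'theta_gamma': [3, 5],        # smoothness kernel spatial std dev
}

def evaluate(params):
    # Stand-in for running the dense CRF on the validation subset and
    # measuring mean IoU; here just a dummy deterministic score.
    return -abs(params['theta_alpha'] - 60) - abs(params['theta_beta'] - 5)

best_params, best_score = None, float('-inf')
for values in itertools.product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    score = evaluate(params)
    if score > best_score:
        best_params, best_score = params, score

# best_params now holds the combination with the highest validation score
```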
- To infer images with the fully connected CRF, run the `tools/run_crf.py` script.
- To evaluate the generated output, use `tools/calculate_accuracy_t.py`.
- Because the output is in binary format, run the `tools/colorize.py` script in order to generate image files.
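What `tools/colorize.py` does conceptually is map each class ID in a label image to an RGB triple. A minimal numpy sketch with a toy two-class palette (the real script uses the full Cityscapes color scheme):

```python
import numpy as np

# Toy palette: class id -> RGB triple.
palette = np.array([
    [128, 64, 128],   # e.g. road
    [244, 35, 232],   # e.g. sidewalk
], dtype=np.uint8)

labels = np.array([[0, 1], [1, 0]])  # H x W image of class ids
color_image = palette[labels]        # fancy indexing yields an H x W x 3 RGB image

assert color_image.shape == (2, 2, 3)
```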
Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation
Guosheng Lin, Chunhua Shen, Anton van den Hengel, Ian Reid
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
Convolutional scale invariance for semantic segmentation
Ivan Krešo, Denis Čaušević, Josip Krapac, Siniša Šegvić
38th German Conference on Pattern Recognition, Hannover, 2016
Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Philipp Krähenbühl and Vladlen Koltun
NIPS 2011
Vision-based offline-online perception paradigm for autonomous driving
Ros, G., Ramos, S., Granados, M., Bakhtiary, A., Vazquez, D., Lopez, A.M.
IEEE Winter Conference on Applications of Computer Vision, Hawaii, 2015
The Cityscapes dataset for semantic urban scene understanding
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, 2016