
Segmenting Street-Level Images with Computer Vision using Tensorflow

Read the full blog post here: https://saadiqm.com/2019/03/06/computer-vision-streets.html

Code adapted from: https://github.com/tensorflow/models/blob/master/samples/outreach/blogs/segmentation_blogpost/image_segmentation.ipynb

For this exercise, I'm working with 100 Google Street View images, split into 80 for training and 20 for testing. So few images will not produce a performant model, but the goal here was mainly to familiarize myself with the general CNN training workflow and with TensorFlow's data pipeline.
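
The sketch below shows the general shape of such an input pipeline: pairing each Street View image with its label mask and streaming the pairs through tf.data. The directory layout, image size, and batch size are illustrative assumptions, not the repository's actual values.

```python
# Minimal tf.data input-pipeline sketch (paths, sizes, and batch size are assumptions).
import tensorflow as tf

IMG_SIZE = 256   # assumed square resize target
BATCH_SIZE = 4   # assumed batch size

def load_pair(image_path, mask_path):
    """Decode an image/mask pair and resize both to a fixed shape."""
    image = tf.image.decode_jpeg(tf.io.read_file(image_path), channels=3)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE]) / 255.0
    mask = tf.image.decode_png(tf.io.read_file(mask_path), channels=1)
    mask = tf.image.resize(mask, [IMG_SIZE, IMG_SIZE], method="nearest")
    return image, mask

# Assumed directory layout: parallel folders of street-view images and label masks.
image_paths = sorted(tf.io.gfile.glob("data/train/images/*.jpg"))
mask_paths = sorted(tf.io.gfile.glob("data/train/masks/*.png"))

train_ds = (
    tf.data.Dataset.from_tensor_slices((image_paths, mask_paths))
    .map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(100)
    .batch(BATCH_SIZE)
    .prefetch(tf.data.AUTOTUNE)
)
```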

This post is divided into the following sections:

  • Image Labelling (Ground Truth)
  • Creating Image Label Masks
  • Input data/image pipeline & creating TFRecords
  • Building the Model (a minimal model sketch follows this list)
  • Training the Model
  • Prediction
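
A minimal encoder-decoder CNN for the model-building step might look like the sketch below. The layer widths and the number of classes are assumptions; the model actually used in the repository is adapted from the TensorFlow segmentation notebook linked above.

```python
# Minimal encoder-decoder segmentation model sketch (sizes and class count are assumptions).
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 12  # assumed number of segmentation classes
IMG_SIZE = 256

inputs = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))

# Encoder: two conv + downsample stages.
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
x = layers.MaxPooling2D()(x)

# Decoder: upsample back to the input resolution.
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)

# Per-pixel class logits.
outputs = layers.Conv2D(NUM_CLASSES, 1, padding="same")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```

Training then reduces to calling model.fit on the dataset built earlier.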

The final loss/mIoU curves look reasonable for a toy example, given only 80 training samples and 20 test samples. Even with some data augmentation, we should not expect strong results from such a small dataset.
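
As a rough sketch of the two ideas mentioned above, the snippet below applies the same random horizontal flip to an image and its mask (a simple form of data augmentation) and computes mean IoU with tf.keras.metrics.MeanIoU. The class count and flip-only augmentation are assumptions.

```python
# Paired augmentation and mean-IoU sketch (class count and augmentation choice are assumptions).
import tensorflow as tf

NUM_CLASSES = 12  # assumed

def augment(image, mask):
    """Apply the same random horizontal flip to an image and its mask."""
    flip = tf.random.uniform(()) > 0.5
    image = tf.cond(flip, lambda: tf.image.flip_left_right(image), lambda: image)
    mask = tf.cond(flip, lambda: tf.image.flip_left_right(mask), lambda: mask)
    return image, mask

# Mean IoU expects integer class ids, so reduce logits to predicted classes first.
miou = tf.keras.metrics.MeanIoU(num_classes=NUM_CLASSES)

def update_miou(y_true, logits):
    y_pred = tf.argmax(logits, axis=-1)
    miou.update_state(y_true, y_pred)
    return miou.result()
```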

[Figure: training loss and mIoU metric curves]

Finally, let's predict output masks for a few sample images. The output looks acceptable for images with only a few classes, but it fails on images with many classes and more complex scenes.
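
For reference, prediction on a single image can be sketched as: preprocess the image the same way as during training, run the model, and take a per-pixel argmax over the class channels. The helper below is a hypothetical example consistent with the earlier sketches, not the repository's exact code.

```python
# Single-image prediction sketch (preprocessing and model are the assumed ones above).
import numpy as np
import tensorflow as tf

def predict_mask(model, image_path, img_size=256):
    image = tf.image.decode_jpeg(tf.io.read_file(image_path), channels=3)
    image = tf.image.resize(image, [img_size, img_size]) / 255.0
    logits = model(tf.expand_dims(image, 0), training=False)
    # Per-pixel argmax over the class dimension gives the predicted class mask.
    return np.argmax(logits[0].numpy(), axis=-1)
```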

[Figures: predicted segmentation masks for four sample images]
