We perform semantic segmentation using FCN8s and SegNet on the Indian Driving Dataset and compare the performance of the two models based on accuracy and IOU score.
The Indian Driving Dataset consists of 6906 high-resolution images in the training set and 979 in the validation set, with a total of 39 unique class labels. We work with a subset of 1000 training and 100 test samples for our project.
For each image, we have a JSON file containing the number of different classes in the image and the polygon vertices for the segmentation map of each class. We simplify the dataset's directory structure as follows:
- img
  - train
  - val
- seg
  - train
  - val
We create segmentation maps as .png files and store them in the seg directory. Refer to Preprocessing.ipynb to understand how the preprocessing is done.
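As a rough illustration of the polygon-to-mask step, the sketch below rasterizes one annotation into a 2D grid of class ids using an even-odd point-in-polygon test. It is not the code from Preprocessing.ipynb: the JSON field names (imgHeight, imgWidth, objects, label, polygon), the helper names, the label-to-id mapping and the ignore id of 255 are all assumptions made for the example. Writing the grid out as a .png is omitted to keep the sketch dependency-free.

```python
import json


def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test for the point (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(annotation, label_to_id, ignore_id=255):
    """Turn polygon annotations into a 2D list of class ids.

    Assumed (hypothetical) annotation layout:
    {"imgHeight": H, "imgWidth": W,
     "objects": [{"label": str, "polygon": [[x, y], ...]}, ...]}
    Later objects overwrite earlier ones where polygons overlap.
    """
    height, width = annotation["imgHeight"], annotation["imgWidth"]
    mask = [[ignore_id] * width for _ in range(height)]
    for obj in annotation["objects"]:
        polygon = obj["polygon"]
        if len(polygon) < 3:
            continue  # not a valid polygon
        class_id = label_to_id.get(obj["label"], ignore_id)
        xs = [p[0] for p in polygon]
        ys = [p[1] for p in polygon]
        # Restrict the scan to the polygon's bounding box, clipped to the image.
        x_min, x_max = max(int(min(xs)), 0), min(int(max(xs)), width - 1)
        y_min, y_max = max(int(min(ys)), 0), min(int(max(ys)), height - 1)
        for row in range(y_min, y_max + 1):
            for col in range(x_min, x_max + 1):
                # Test the pixel centre rather than its corner.
                if point_in_polygon(col + 0.5, row + 0.5, polygon):
                    mask[row][col] = class_id
    return mask


# Tiny made-up annotation: a 2x2 "road" square inside a 4x4 image.
example = json.loads(
    '{"imgHeight": 4, "imgWidth": 4, "objects": '
    '[{"label": "road", "polygon": [[1, 1], [3, 1], [3, 3], [1, 3]]}]}'
)
example_mask = rasterize(example, {"road": 0})
```

A per-pixel Python loop like this is slow on high-resolution images; a library polygon-fill routine would be the practical choice for the full dataset.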
We refer to the implementation by zijundeng for the FCN8s and SegNet architectures. We use a pretrained VGG16 as the feature extractor for FCN8s and a pretrained VGG19 with batch normalization layers as the feature extractor for SegNet.
The weights for the trained models are available here.
The following summarises the accuracy and IOU for the trained networks on the validation set:
Model | Accuracy | IOU |
---|---|---|
FCN8s | 74.46 | 60.23 |
SegNet | 79.19 | 63.44 |
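
For reference, the two metrics in the table can be computed from a confusion matrix, as in the minimal sketch below. It assumes that Accuracy means overall pixel accuracy and that IOU means the mean of per-class intersection over union, which the text above does not state explicitly. The function names and the ignore id of 255 are hypothetical, and classes that appear in neither the ground truth nor the prediction are skipped when averaging.

```python
def confusion_matrix(truth, pred, num_classes, ignore_id=255):
    """Count pixels by (ground-truth class, predicted class)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            if t == ignore_id:
                continue  # unlabeled pixels do not count towards the metrics
            matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of labeled pixels whose predicted class is correct."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average of TP / (TP + FP + FN) over classes that occur at all."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c missed
        fp = sum(matrix[r][c] for r in range(n)) - tp   # wrongly predicted as c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 example with two classes; one class-0 pixel is predicted as class 1.
truth = [[0, 0], [1, 1]]
pred = [[0, 1], [1, 1]]
cm = confusion_matrix(truth, pred, num_classes=2)
accuracy = pixel_accuracy(cm)   # 3 of 4 pixels correct
miou = mean_iou(cm)             # mean of 1/2 and 2/3
```

To evaluate a whole validation set, the confusion matrices of all images would be summed before computing the two scores, so that large and small images are weighted by pixel count.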
The following are visualizations of the FCN8s output (from left to right: input, ground truth, model output):
The following are visualizations of the SegNet output (from left to right: input, ground truth, model output):
We would like to thank Dr. Saket Anand for providing us with the Indian Driving Dataset for this project. We would also like to thank Zijun Deng for making their repository on semantic segmentation publicly available which we have referred to for FCN and SegNet architectures.