Traffic Sign Recognition with artificial neural networks

Writeup by Hannes Bergler


The goals / steps of this project were the following:

  • Load the dataset (see below for links to the project dataset)
  • Explore, summarize and visualize the dataset
  • Design, train and test a model architecture
  • Use the model to make predictions on new images
  • Analyze the softmax probabilities of the new images
  • Summarize the results with a written report

Rubric Points

Here I will consider the rubric points individually and describe how I addressed each point in my implementation.


I. Submission Files

The submission includes a writeup, which you're reading right now!

And here is a link to my project code.

II. Dataset Summary & Exploration

1. Basic summary of the dataset.

Summary statistics of the traffic signs dataset:

  • The size of training set is 34799
  • The size of the validation set is 4410
  • The size of test set is 12630
  • The shape of a traffic sign image is (32, 32, 3)
  • The number of unique classes/labels in the dataset is 43

2. Exploratory visualization of the dataset.

Here is a bar chart showing how the traffic sign classes are distributed in the training dataset. You can see that some classes (e.g. 1 and 2) are much more common than others (e.g. 0 and 19).

[Figure: histogram of the class distribution in the training set]

For visualizing the dataset, I also printed out one image of each traffic sign class in the jupyter notebook.
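The class counts behind such a bar chart can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name `class_distribution` and the toy label list are assumptions, and the real training set contains 34799 labels.

```python
from collections import Counter


def class_distribution(labels):
    """Return a dict mapping class id -> number of examples, sorted by class id."""
    counts = Counter(labels)
    return dict(sorted(counts.items()))


# Hypothetical toy labels, chosen only to illustrate an imbalanced distribution.
y_train_example = [1, 2, 1, 0, 2, 1, 19, 2, 1]

distribution = class_distribution(y_train_example)
for class_id, count in distribution.items():
    # Text-only "bar chart": one '#' per example in the class.
    print(f"class {class_id:2d}: {'#' * count} ({count})")
```

The same dictionary could be passed to a plotting library to draw the bars.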

III. Design and Test a Model Architecture

1. Preprocessing the image data.

For preprocessing the data I used the following steps:

  • shuffle the training data, to get a random order of the images
  • normalization of the image data: [0 .. 255] --> [0.1 .. 0.9]
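The normalization step can be sketched as a linear rescaling. This is an assumed formulation (the writeup only states the input and output ranges), and the function name `normalize` is hypothetical:

```python
def normalize(pixels, low=0.1, high=0.9, max_value=255.0):
    """Linearly rescale pixel values from [0, max_value] to [low, high]."""
    return low + (pixels / max_value) * (high - low)


# Scalar examples: the darkest pixel maps to `low`, the brightest to `high`.
darkest = normalize(0)
brightest = normalize(255)
middle = normalize(127.5)
print(darkest, brightest, middle)
```

Because the function uses only arithmetic operators, it should also apply element-wise to a NumPy image array.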

I did not convert the images to grayscale, in order to keep the color information. With my setup, converting the images to grayscale reduced the validation set accuracy by 0.01.

2. Final model architecture.

I used the LeNet architecture as a starting point, which works very well on 32x32 images. To adapt it to the colored images and the larger number of output classes, I doubled the size of every network layer.

My final model consisted of the following layers:

| Layer | Description |
|---|---|
| Input | 32x32x3 RGB image |
| Convolution 5x5 | 1x1 stride, VALID padding, outputs 28x28x12 |
| RELU | simple activation function |
| Max pooling | 2x2 stride, outputs 14x14x12 |
| Convolution 5x5 | 1x1 stride, VALID padding, outputs 10x10x32 |
| RELU | simple activation function |
| Max pooling | 2x2 stride, outputs 5x5x32 |
| Flatten | change output shape from 5x5x32 to 800 |
| Fully connected with dropout | output: 240 |
| RELU | simple activation function |
| Fully connected with dropout | output: 168 |
| RELU | simple activation function |
| Output layer | output: 43 |
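The output sizes listed above can be checked with the usual VALID-padding formula, floor((input - kernel) / stride) + 1. The sketch below assumes a stride of 1 for both convolutions, which is what the listed 28x28 and 10x10 outputs imply, and a 2x2 pooling window as in standard LeNet. The helper name `valid_output_size` is hypothetical.

```python
def valid_output_size(input_size, kernel_size, stride):
    """Spatial output size of a conv or pooling layer with VALID padding."""
    return (input_size - kernel_size) // stride + 1


# (name, kernel size, stride, output depth); strides and the pooling window
# are assumptions consistent with the output shapes in the table.
layers = [
    ("conv1", 5, 1, 12),
    ("pool1", 2, 2, 12),
    ("conv2", 5, 1, 32),
    ("pool2", 2, 2, 32),
]

size = 32  # input images are 32x32x3
shapes = {}
for name, kernel, stride, depth in layers:
    size = valid_output_size(size, kernel, stride)
    shapes[name] = (size, size, depth)
    print(name, shapes[name])

# Flattening the last feature map gives the input size of the first dense layer.
flattened = shapes["pool2"][0] * shapes["pool2"][1] * shapes["pool2"][2]
print("flattened:", flattened)
```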

3. Training the model.

For training the model, I used the following parameters:

  • optimizer: AdamOptimizer (tensorflow)
  • batch size: 128
  • number of epochs: 20
  • learning rate: 0.0005
  • dropout rate: 0.5
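To give a sense of scale, these parameters translate into the following number of optimizer steps. This is a small arithmetic sketch that assumes the final, smaller batch of each epoch is also used for training; the variable names are illustrative.

```python
import math

n_train = 34799   # training set size from the dataset summary
batch_size = 128
epochs = 20

# Assumption: the last partial batch of each epoch is kept, not dropped.
batches_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (batches_per_epoch - 1) * batch_size
total_steps = batches_per_epoch * epochs

print(batches_per_epoch, last_batch_size, total_steps)
```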

4. The approach for finding a solution and getting the validation set accuracy to be at least 0.93.

My final model results were:

  • validation set accuracy of 0.960
  • test set accuracy of 0.944

I started with the standard LeNet architecture from class, because it classifies 32x32-pixel images quite well by default (as discussed in class). With standard LeNet, I reached a validation set accuracy of about 0.89.

To improve the accuracy, I added dropout to the fully connected layers of the network. I found a dropout rate of 0.5 to work best for this architecture. I also doubled the size of each layer in the network, for two reasons: the German traffic sign dataset has far more output classes (n_classes = 43) than the MNIST dataset (n_classes = 10), and the colored traffic sign images carry more information than the grayscale MNIST images.
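For readers unfamiliar with dropout, here is a minimal pure-Python sketch of "inverted" dropout on a list of activations. It is an illustration under stated assumptions, not the project's TensorFlow code: the function name `dropout` and its parameters are hypothetical. Kept activations are scaled by 1 / keep_prob so that the expected activation stays the same, and dropout is switched off at evaluation time.

```python
import random


def dropout(activations, rate=0.5, training=True, rng=None):
    """Zero each activation with probability `rate`; scale the rest by 1/keep_prob."""
    if not training or rate == 0.0:
        # At validation/test time all units are kept and nothing is rescaled.
        return list(activations)
    if rng is None:
        rng = random.Random()
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


example = [1.0, 2.0, 3.0, 4.0]
print(dropout(example, rate=0.5, rng=random.Random(0)))  # training mode
print(dropout(example, training=False))                   # evaluation mode
```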

The validation set accuracy of 0.96, which is 0.03 above the required minimum of 0.93, shows that the model works well.

IV. Test a Model on New Images

1. Additional German traffic signs found on the web.

Here are not five but ten German traffic signs that I found on the web:

[Images: ten German traffic signs found on the web]

The first image might be difficult to classify because it shows the traffic sign slightly from the side, so the image is a bit distorted. The other nine images should be easier to classify.

2. Predictions on these new traffic signs.

Here are the results of the prediction:

| Image (class and name) | Prediction (class and name) |
|---|---|
| (1) Speed limit (30km/h) | (1) Speed limit (30km/h) |
| (1) Speed limit (30km/h) | (1) Speed limit (30km/h) |
| (1) Speed limit (30km/h) | (1) Speed limit (30km/h) |
| (2) Speed limit (50km/h) | (2) Speed limit (50km/h) |
| (11) Right-of-way at the next intersection | (11) Right-of-way at the next intersection |
| (12) Priority road | (12) Priority road |
| (40) Roundabout mandatory | (40) Roundabout mandatory |
| (9) No passing | (9) No passing |
| (9) No passing | (9) No passing |
| (2) Speed limit (50km/h) | (2) Speed limit (50km/h) |

The model was able to correctly guess 10 of the 10 traffic signs, which gives an accuracy of 100%. This compares favorably to the accuracy on the test set of 94%.
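The accuracy figure follows directly from the class ids in the results listed above. A minimal sketch, with a hypothetical helper named `accuracy`:

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Class ids of the ten web images and the model's predictions, as reported above.
true_classes = [1, 1, 1, 2, 11, 12, 40, 9, 9, 2]
predicted_classes = [1, 1, 1, 2, 11, 12, 40, 9, 9, 2]

web_accuracy = accuracy(true_classes, predicted_classes)
print(f"accuracy on web images: {web_accuracy:.0%}")
```

With only ten images, a single misclassification would change this figure by 10 percentage points, so it is a much coarser estimate than the accuracy on the 12630-image test set.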

3. Softmax probabilities of the predictions.

The code for making predictions on my final model is located in the 15th and 16th cell of the jupyter notebook.

For the first image, the model is very sure that this is a 30 km/h speed limit (probability of 94.3%), and this prediction is correct. The top five softmax probabilities were:

| Probability [%] | Prediction (class and name) |
|---|---|
| 94.3 | (1) Speed limit (30km/h) |
| 3.2 | (2) Speed limit (50km/h) |
| 2.3 | (25) Road work |
| 0.1 | (5) Speed limit (80km/h) |
| 0.1 | (31) Bicycles crossing |

With speed limit (30km/h), speed limit (50km/h) and speed limit (80km/h), three very similar signs appear in the top five. Even for a human, these signs can be difficult to distinguish from a distance.
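How top-five softmax probabilities are obtained from a network's raw outputs can be sketched in plain Python. The logits below are made-up values for a seven-class toy example, not the model's real outputs, and the helper names `softmax` and `top_k` are hypothetical. In the notebook this step would more likely use TensorFlow operations such as `tf.nn.softmax` and `tf.nn.top_k`.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k=5):
    """Return the k (class id, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical logits for a toy problem with 7 classes.
example_logits = [0.5, 6.0, 2.6, 0.1, -1.0, -0.5, 0.0]

probs = softmax(example_logits)
top5 = top_k(probs, k=5)
for class_id, p in top5:
    print(f"class {class_id}: {100 * p:.1f}%")
```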

For the second and all following images, the model is 100% sure about its prediction, as you can see in the report, and all of these predictions are correct.