Goals of the Project
- Use the simulator to collect data of good driving behavior
- Build a convolutional neural network in Keras that predicts steering angles from images
- Train and validate the model with a training and validation set
- Test that the model successfully drives around track one without leaving the road
- Summarize the results with a written report
My project includes the following files:
- model.py containing the script to create and train the model; it uses the Keras v2.0.3 functional API
- generator.py containing the data loading and augmentation generator
- drive.py for driving the car in autonomous mode
- model.h5 containing a trained convolution neural network
- writeup_report.md summarizing the results
Using the Udacity-provided simulator and my drive.py file, the car can be driven autonomously around the first track by executing:

```sh
python drive.py model.h5
```
Note: the car is not able to drive above 20 mph.
I used a model largely based on NVIDIA's network for self-driving cars (https://devblogs.nvidia.com/parallelforall/deep-learning-self-driving-cars/).
The main change to the model is the addition of dropout layers between the fully connected layers.
The following is the summary of the model:
Layer (type) | Output Shape | Param # |
---|---|---|
input_1 (InputLayer) | (None, 160, 320, 3) | 0 |
norm (Lambda) | (None, 160, 320, 3) | 0 |
crop (Cropping2D) (50 from top, 20 from bottom) | (None, 90, 320, 3) | 0 |
conv1 (Conv2D) kernel=(5x5), strides=(2,2) | (None, 43, 158, 24) | 1824 |
conv2 (Conv2D) kernel=(5x5), strides=(2,2) | (None, 20, 77, 36) | 21636 |
conv3 (Conv2D) kernel=(5x5), strides=(2,2) | (None, 8, 37, 48) | 43248 |
conv4 (Conv2D) kernel=(3x3), strides=(1,1) | (None, 6, 35, 64) | 27712 |
conv5 (Conv2D) kernel=(3x3), strides=(1,1) | (None, 4, 33, 64) | 36928 |
flatten (Flatten) | (None, 8448) | 0 |
nn1 (Dense) (ReLU) | (None, 1164) | 9834636 |
dropout_1 (Dropout) (0.5) | (None, 1164) | 0 |
nn2 (Dense) (ReLU) | (None, 100) | 116500 |
dropout_2 (Dropout) (0.5) | (None, 100) | 0 |
nn3 (Dense) (ReLU) | (None, 50) | 5050 |
dropout_3 (Dropout) (0.5) | (None, 50) | 0 |
nn4 (Dense) (ReLU) | (None, 10) | 510 |
output (Dense) | (None, 1) | 11 |
Total params: 10,088,055
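For reference, the architecture above can be sketched with the Keras 2 functional API as follows. This is a reconstruction from the summary table, not a copy of model.py: the exact normalization Lambda and the use of ReLU on the convolutional layers are assumptions (the table only notes ReLU on the dense layers).

```python
from keras.models import Model
from keras.layers import Input, Lambda, Cropping2D, Conv2D, Flatten, Dense, Dropout

inputs = Input(shape=(160, 320, 3))
# Normalize pixels to [-0.5, 0.5]; the exact Lambda in model.py may differ
x = Lambda(lambda img: img / 255.0 - 0.5, name='norm')(inputs)
# Crop 50 rows from the top (sky) and 20 from the bottom (hood)
x = Cropping2D(cropping=((50, 20), (0, 0)), name='crop')(x)
x = Conv2D(24, (5, 5), strides=(2, 2), activation='relu', name='conv1')(x)
x = Conv2D(36, (5, 5), strides=(2, 2), activation='relu', name='conv2')(x)
x = Conv2D(48, (5, 5), strides=(2, 2), activation='relu', name='conv3')(x)
x = Conv2D(64, (3, 3), activation='relu', name='conv4')(x)
x = Conv2D(64, (3, 3), activation='relu', name='conv5')(x)
x = Flatten(name='flatten')(x)
x = Dense(1164, activation='relu', name='nn1')(x)
x = Dropout(0.5, name='dropout_1')(x)
x = Dense(100, activation='relu', name='nn2')(x)
x = Dropout(0.5, name='dropout_2')(x)
x = Dense(50, activation='relu', name='nn3')(x)
x = Dropout(0.5, name='dropout_3')(x)
x = Dense(10, activation='relu', name='nn4')(x)
outputs = Dense(1, name='output')(x)
model = Model(inputs=inputs, outputs=outputs)
```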
More data was gathered to reduce overfitting:
- Simple runs around the track, keeping the car in the center
- Runs in the opposite direction, to ensure the data contained right turns
- Recovery runs with extreme steering angles: keeping the car at the rightmost or leftmost edge and then steering it back to the center
The model contains dropout layers in order to reduce overfitting (model.py lines 47-51).
The model uses an Adam optimizer, so the learning rate was not tuned manually (model.py line 56).
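A compile call consistent with this would look like the sketch below; the mean-squared-error loss is an assumption (the usual choice for steering-angle regression), not a quote from model.py.

```python
# Adam with its default learning rate; MSE regression loss on the steering angle
model.compile(optimizer='adam', loss='mse')
```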
Sample data provided by Udacity was used along with data I generated by driving myself. I drove the car in the opposite direction on the track to make sure right turns were also included in the data.
I then recorded multiple short runs of extreme turns, starting from the edges of the road. I suspect this is also the cause of some over-steering by the car.
The data I generated also needs further checking: at times the car does not hold a steering angle through a turn but instead issues a series of jerky steering commands, each time returning to 0. I would have preferred it to maintain a smaller, steady turning angle throughout the turn.
For details about how I created the training data, see the next section.
The overall strategy for deriving a model architecture was to have a few convolutional layers for feature extraction followed by fully connected layers, with some dropout to reduce overfitting in case it happened.
My first step was to use a convolutional neural network similar to the NVIDIA model, which I thought would be appropriate given NVIDIA's success using it on a real car.
The initial trained model was not steering at all and drove straight into the sand. That is when I realized the data I had collected had been recorded using the keyboard: on the turns I was not able to maintain a steady angle, so bad data led to bad learning. I then used a mouse/joystick to drive around the circuit as close to the center as I could, and on the turns I tried to keep a steady angle.
To make sure the data was somewhat uniform, a histogram was generated to see what kind of data was available:
Steering Angle | Number of Data points | % |
---|---|---|
-1.00 | 4 | 0.050 % |
-0.90 | 1 | 0.012 % |
-0.80 | 2 | 0.025 % |
-0.70 | 6 | 0.075 % |
-0.60 | 11 | 0.137 % |
-0.50 | 62 | 0.772 % |
-0.40 | 78 | 0.971 % |
-0.30 | 300 | 3.733 % |
-0.20 | 473 | 5.886 % |
-0.10 | 838 | 10.428 % |
0.00 | 5085 | 63.278 % |
0.10 | 781 | 9.719 % |
0.20 | 161 | 2.003 % |
0.30 | 172 | 2.140 % |
0.40 | 42 | 0.523 % |
0.50 | 12 | 0.149 % |
0.60 | 5 | 0.062 % |
0.70 | 1 | 0.012 % |
0.80 | 0 | 0.000 % |
0.90 | 2 | 0.025 % |
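For reference, the table can be produced with a small numpy sketch along these lines, assuming `angles` holds the steering angles read from the driving logs (the 0.1 bin width matches the table):

```python
import numpy as np

bins = np.arange(-1.0, 1.1, 0.1)  # 0.1-wide bins from -1.0 to 1.0
counts, edges = np.histogram(angles, bins=bins)
for edge, count in zip(edges, counts):
    print("%.2f | %5d | %.3f %%" % (edge, count, 100.0 * count / len(angles)))
```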
Looking at the table above, it was clear that more non-zero steering-angle data was required. Augmentation techniques such as decreasing brightness, flipping images, and using the left/right camera images were applied.
After augmentation, the distribution was as follows:
Steering Angle | Number of Data points | % |
---|---|---|
-1.00 | 84 | 0.186 % |
-0.90 | 44 | 0.097 % |
-0.80 | 106 | 0.235 % |
-0.70 | 144 | 0.319 % |
-0.60 | 341 | 0.755 % |
-0.50 | 887 | 1.963 % |
-0.40 | 6821 | 15.093 % |
-0.30 | 2049 | 4.534 % |
-0.20 | 3506 | 7.758 % |
-0.10 | 3319 | 7.344 % |
0.00 | 15461 | 34.212 % |
0.10 | 2561 | 5.667 % |
0.20 | 6833 | 15.120 % |
0.30 | 1470 | 3.253 % |
0.40 | 915 | 2.025 % |
0.50 | 223 | 0.493 % |
0.60 | 188 | 0.416 % |
0.70 | 109 | 0.241 % |
0.80 | 15 | 0.033 % |
0.90 | 37 | 0.082 % |
Once the chosen model was able to drive successfully at 10 mph, I tried increasing the speed but was not able to go higher than 20 mph. I also noticed from the feature maps of the conv1 layer that many of the filters carry more or less the same information. I therefore tried cutting out some layers, reducing the filter sizes, and adding MaxPooling2D, but the car's performance remained the same, so I stuck with the NVIDIA model.
The following code was used to produce the conv1 feature maps:
```python
import matplotlib.pyplot as plt
from keras import backend as K
from keras.models import load_model

def outputFeatureMap(ac):
    """Display each filter activation of a conv layer as a grayscale image."""
    featuremaps = ac.shape[3]
    for i in range(featuremaps):
        # Undo the [-0.5, 0.5] normalization so the map displays as 8-bit intensities
        image_to_show = (ac[0, :, :, i] + 0.5) * 255
        plt.imshow(image_to_show, interpolation="nearest", cmap="gray")
        plt.show()

model = load_model("model.h5")
norm_tensor = model.layers[1]   # Lambda normalization layer
crop_tensor = model.layers[2]   # Cropping2D layer
conv1_tensor = model.layers[3]  # first convolutional layer

# Reuse the Keras session so the trained weights are kept (re-initializing
# variables in a fresh session would discard them). `image` is a single
# simulator frame batch of shape (1, 160, 320, 3).
sess = K.get_session()
image_norm = norm_tensor.output.eval(session=sess, feed_dict={norm_tensor.input: image})
image_cropped = crop_tensor.output.eval(session=sess, feed_dict={crop_tensor.input: image_norm})
ac = conv1_tensor.output.eval(session=sess, feed_dict={conv1_tensor.input: image_cropped})
outputFeatureMap(ac)
```
At the end of the process, the vehicle is able to drive autonomously around track 1 without leaving the road.
Here is a visualization of the architecture; it corresponds layer-for-layer to the model summary table shown earlier.
A custom generator (generator.py) combines images and steering angles from different folders, selects a validation set, and then augments the data.
Images are augmented on the fly: each augmentation function (e.g. flipping, darkening) is kept in a list, and only the indices of (image, augmentation) pairs are stored in an array. When a batch is required, the relevant indices are retrieved, the corresponding augmentation function is applied, and the batch of images is returned (generator.py lines 266 to 293).
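The sketch below illustrates this indexing scheme. The function names and the `images`/`angles` arrays are illustrative stand-ins, not the actual code in generator.py:

```python
import numpy as np

def identity(img, angle):
    return img, angle

def flip(img, angle):
    return np.fliplr(img), -angle

# Augmentation functions live in a list; samples store only indices into it,
# so no augmented copies of the images are kept in memory
augmentations = [identity, flip]
samples = [(i, a) for i in range(len(images)) for a in range(len(augmentations))]

def batch_generator(samples, batch_size=32):
    # Shuffling between epochs is handled by a callback (see the sketch below)
    while True:
        for offset in range(0, len(samples), batch_size):
            X, y = [], []
            for img_idx, aug_idx in samples[offset:offset + batch_size]:
                img, angle = augmentations[aug_idx](images[img_idx], angles[img_idx])
                X.append(img)
                y.append(angle)
            yield np.array(X), np.array(y)
```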
At the start of training, a validation set is randomly chosen from the set of images, excluding any of the augmented left/right camera images. This was done to keep the validation loss realistic: since the left/right steering angles are synthesized with a constant correction, it seemed best to leave them out.
The generator uses the Keras callback mechanism to make sure the training data is shuffled on each epoch. The validation set is chosen only once at the beginning of training and is never shuffled or regenerated.
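A minimal sketch of such a shuffling callback, using the same illustrative names as above (the real one lives in generator.py):

```python
import numpy as np
from keras.callbacks import Callback

class ShuffleSamples(Callback):
    """Reshuffle the (image, augmentation) index array between epochs."""
    def __init__(self, samples):
        super(ShuffleSamples, self).__init__()
        self.samples = samples

    def on_epoch_end(self, epoch, logs=None):
        # In-place shuffle so the generator sees a new sample order next epoch
        np.random.shuffle(self.samples)
```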
Note: I am not sure what effect this validation scheme actually had on training; this needs to be looked into further.
- Image flipping: original images found in the folders were flipped and their steering angles negated
- Brightness: new images with decreased brightness, derived from the set of original + flipped images, were added
- Left/right cameras: the left and right camera images were added to the set with a constant steering correction of 0.2 (a sketch of these augmentations follows this list)
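A sketch of what these augmentations might look like; only the 0.2 correction comes from the actual implementation, while the HSV brightness scaling and the factor of 0.6 are assumptions:

```python
import cv2
import numpy as np

CORRECTION = 0.2  # constant steering correction for side-camera images

def flip_image(img, angle):
    # Mirror horizontally and negate the steering angle
    return np.fliplr(img), -angle

def darken_image(img, angle, factor=0.6):
    # Scale the V channel in HSV space to decrease brightness (factor is illustrative)
    hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV).astype(np.float32)
    hsv[:, :, 2] = np.clip(hsv[:, :, 2] * factor, 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2RGB), angle

def side_camera_angles(center_angle):
    # The left camera sees the car further right than it is, so add the
    # correction; the right camera subtracts it
    return center_angle + CORRECTION, center_angle - CORRECTION
```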
To make sure the car would be able to recover from the extreme right or left of the road, I recorded some sessions of exactly this kind of driving. A few examples:
Flipped Image Sample
Changed Brightness Sample
Although I collected a few runs from track 2, I did not use them for this project, since I have seen others train on track 1 data alone and still run successfully on track 2.
However, my model does not run correctly on track 2 at all; I would need to look into the conv1 feature maps to figure out why it performs so badly.
I used ModelCheckpoint to automatically save the model with the best score achieved on the validation set:

```python
from keras.callbacks import ModelCheckpoint

mc = ModelCheckpoint(filepath=FLAGS.savefile, verbose=1, save_best_only=True)
```
- While driving autonomously there are points where the car oscillates left and right; those scenes need to be debugged to figure out why it happens
- The model does not work beyond 20 mph
- The model does not work at all on track 2
- The training data is not uniformly distributed; the histogram makes it evident that there are many more straight (0 steering angle) samples than left and right ones