Behavioral Cloning Project
The goals / steps of this project are the following:
- Use the simulator to collect data of good driving behavior
- Build a convolutional neural network in Keras that predicts steering angles from images
- Train and validate the model with a training and validation set
- Test that the model successfully drives around track one without leaving the road
- Summarize the results with a written report
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
My project includes the following files:
- Model.ipynb: a Google Colab notebook where I created and trained the model, taking advantage of the GPU capabilities offered by Colab.
- model.py: a Python file automatically generated by Colab for easy reading. The model itself was created and trained in Model.ipynb.
- BehavioralClonning.html: an HTML file converted from the notebook for easy visualization.
- drive.py: for driving the car in autonomous mode (a modification of the original).
- model.h5: the trained convolutional neural network.
- modelWithDrop3.h5: the same trained model.
- writeup_report.md: this report summarizing the results.
- video.mp4: the video recording of the vehicle traveling autonomously around the track.
- video30.mp4, video48.mp4, video60.mp4: the same video rendered at different frame rates.
Using the Udacity-provided simulator and my drive.py file, the car can be driven autonomously around the track by executing:
```sh
python drive.py model.h5
```
The model.ipynb file contains the code for training and saving the convolutional neural network. Since it is a notebook, the code is organized into well-explained sections with accompanying text.
The model.py file is the automatically generated plain Python version of the notebook code.
There is also an HTML version of the notebook.
My model consists of a convolutional neural network based on NVIDIA's End-to-End Deep Learning for Self-Driving Cars architecture, with the following structure:
Layer | Output shape | #Param |
---|---|---|
conv2d (Conv2D) | (None, 31, 98, 24) | 1824 |
conv2d_1 (Conv2D) | (None, 14, 47, 36) | 21636 |
conv2d_2 (Conv2D) | (None, 5, 22, 48) | 43248 |
conv2d_3 (Conv2D) | (None, 3, 20, 64) | 27712 |
conv2d_4 (Conv2D) | (None, 1, 18, 64) | 36928 |
dropout (Dropout) | (None, 1, 18, 64) | 0 |
flatten (Flatten) | (None, 1152) | 0 |
dense (Dense) | (None, 100) | 115300 |
dropout_1 (Dropout) | (None, 100) | 0 |
dense_1 (Dense) | (None, 50) | 5050 |
dropout_2 (Dropout) | (None, 50) | 0 |
dense_2 (Dense) | (None, 10) | 510 |
dense_3 (Dense) | (None, 1) | 11 |
Total params | | 252,219 |
Trainable params | | 252,219 |
Non-trainable params | | 0 |
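For reference, the architecture in the table above can be reproduced with a Keras model along these lines. This is only a minimal sketch: the exact dropout rates are an assumption (the table does not show them), and the notebook defines the actual values.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dropout, Flatten, Dense

def build_model():
    # NVIDIA-style network; input is the preprocessed 66x200x3 YUV image.
    model = Sequential([
        Conv2D(24, (5, 5), strides=(2, 2), activation='elu', input_shape=(66, 200, 3)),
        Conv2D(36, (5, 5), strides=(2, 2), activation='elu'),
        Conv2D(48, (5, 5), strides=(2, 2), activation='elu'),
        Conv2D(64, (3, 3), activation='elu'),
        Conv2D(64, (3, 3), activation='elu'),
        Dropout(0.5),          # dropout rate is an assumption
        Flatten(),
        Dense(100, activation='elu'),
        Dropout(0.5),
        Dense(50, activation='elu'),
        Dropout(0.5),
        Dense(10, activation='elu'),
        Dense(1),              # single output: the steering angle
    ])
    return model
```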
As you can see, the normalization is done outside the model, in the preprocessing phase.
The model uses ELU activations. ELU behaves like ReLU for positive inputs (the output equals the input), but for negative inputs it outputs a small negative value instead of 0. This avoids the dead ReLU problem, in which a neuron always feeds 0 to the following neurons and effectively dies; once a ReLU unit is dead, it is unlikely to recover because gradient descent will no longer alter its weights.
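For reference, ELU is defined as follows (with α = 1.0, the Keras default):

$$
\mathrm{ELU}(x) =
\begin{cases}
x, & x > 0 \\
\alpha \left(e^{x} - 1\right), & x \le 0
\end{cases}
$$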
I did not use Keras Lambda layers because my normalization process includes OpenCV operations, which proved difficult to implement inside Lambda layers.
An example of the normalization performed can be seen below.
On the left is a sample image. On the right, the image has first been cropped (removing the top 60 rows and the bottom 25 rows), then converted to YUV, blurred with a Gaussian filter, resized to 200x66, and finally normalized to [0, 1].
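A minimal sketch of this preprocessing, assuming the images are loaded as RGB arrays (the notebook's implementation may differ in details such as the blur kernel size):

```python
import cv2

def preprocess(image):
    image = image[60:-25, :, :]                     # crop the top 60 and bottom 25 rows
    image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)  # convert to the YUV color space
    image = cv2.GaussianBlur(image, (3, 3), 0)      # light Gaussian blur
    image = cv2.resize(image, (200, 66))            # resize to the network input size
    return image / 255.0                            # normalize to [0, 1]
```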
The model contains three dropout layers to reduce overfitting, as you can see in the structure above: one immediately after the convolution layers (before flattening), one after the first dense layer, and one after the second dense layer.
I split the data into training and validation sets, as you can see in the following lines:

```python
from sklearn.model_selection import train_test_split

X_train, X_valid, y_train, y_valid = train_test_split(im, st, test_size=0.2, random_state=6)
print('Train size', len(X_train))
print('Validation size', len(X_valid))
```
The resulting distribution can be seen below (model.ipynb, Cell 16).
The model was trained and validated on different data sets to ensure that the model was not overfitting. The model was tested by running it through the simulator and ensuring that the vehicle could stay on the track.
The model used the Adam optimizer, so the learning rate was not tuned manually; it was left at 0.001 (model.ipynb Cell 39).
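A minimal sketch of the compilation step, where build_model refers to the architecture sketch above and the mean-squared-error loss is an assumption (it is the usual choice for this steering-regression task):

```python
from tensorflow.keras.optimizers import Adam

model = build_model()
# Adam with the 0.001 learning rate mentioned above.
model.compile(optimizer=Adam(learning_rate=1e-3), loss='mse')
```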
Training data was chosen to keep the vehicle driving on the road. By removing some of the excess data around the central steering value (0.0) and applying image augmentation, I obtained data suitable for the training phase.
For details about how I created the training data, see the next section.
As you can see in the BehavioralClonning.ipynb file, I first loaded the data, which is stored in a different repository of my GitHub account. (I have used two different sets of data; the results are quite similar.) The set I am presenting in this report is in the BehavioralCloneData repository and was provided by Udacity. The other set is in the BehavioralCloningTrackData repository.
After that, I used pandas to load the CSV file into a DataFrame. I plotted the data to see the distribution:
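A minimal sketch of this step, assuming the standard driving_log.csv column layout and an illustrative bin count:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Column names assume the usual simulator log format.
columns = ['center', 'left', 'right', 'steering', 'throttle', 'brake', 'speed']
data = pd.read_csv('driving_log.csv', names=columns)

# Plot the steering-angle distribution.
num_bins = 25
data['steering'].hist(bins=num_bins)
plt.xlabel('Steering angle')
plt.ylabel('Number of samples')
plt.show()
```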
As you can see, there is an excessive amount of data in the central bin. If all of this data were used, the car would be biased toward always going straight, which would produce crashes. So I eliminated the excess data (everything beyond 800 samples in a bin), as sketched below.
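A minimal sketch of this capping step, reusing the DataFrame and bin count from the sketch above (the exact boundary handling is illustrative):

```python
import numpy as np
from sklearn.utils import shuffle

samples_per_bin = 800
_, bin_edges = np.histogram(data['steering'], bins=num_bins)
bin_ids = np.digitize(data['steering'], bin_edges[1:-1])   # assign every sample to a bin

keep = []
for b in range(num_bins):
    idx = np.where(bin_ids == b)[0]
    idx = shuffle(idx, random_state=0)   # shuffle so the removed samples are random
    keep.extend(idx[:samples_per_bin])   # keep at most 800 samples per bin

data = data.iloc[sorted(keep)]
```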
Here you can see the distribution of the data after eliminating the excess. You can notice two things: first, there is some tendency toward steering to the left, which is expected since the training track mainly turns left. Second, there is very little data at the extremes, which means there are few examples of sharp steering changes.
To combat this, I later applied image augmentation so that the car could also be trained to turn right, and I used the side cameras to provide examples of steering corrections.
It was time to get the training and validation data. To do this, I first extracted the following data from the data set:
- center : The image path for the center camera
- left: The image path for the left camera
- right: The image path for the right camera
- steering: The steering angle.
Then I went through the data one sample at a time and built two arrays: one with the image paths and one with the corresponding steering angles. For the center image the steering angle was used as recorded, but for the left and right cameras I added/subtracted a correction of 0.25 to the steering angle, as sketched below.
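A minimal sketch of this step, assuming the DataFrame from above; the .strip() on the paths is an assumption, and the resulting im and st arrays are the ones fed to the train_test_split shown earlier:

```python
import numpy as np

correction = 0.25  # steering correction for the side cameras
image_paths, steerings = [], []

for _, row in data.iterrows():
    # center camera: use the recorded steering angle
    image_paths.append(row['center'].strip())
    steerings.append(row['steering'])
    # left camera: steer a bit more to the right
    image_paths.append(row['left'].strip())
    steerings.append(row['steering'] + correction)
    # right camera: steer a bit more to the left
    image_paths.append(row['right'].strip())
    steerings.append(row['steering'] - correction)

im, st = np.asarray(image_paths), np.asarray(steerings)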
With these two arrays in hand, it was time to construct the training and validation sets.
In order to have data to validate the model, I split the data into training and validation sets (I reserved 20% of the data for validation).
You have seen it already, but I repeat it here: the distribution of both sets is quite similar:
The final step was to run the simulator to see how well the car was driving around track one. There were a few spots where the vehicle fell off the track. To improve the driving behavior in these cases, I applied data augmentation.
At the end of the process, the vehicle is able to drive autonomously around the track without leaving the road.
As I mentioned earlier, the final model architecture (model.ipynb Cell 39) consisted of a convolutional neural network based on NVIDIA's End-to-End Deep Learning for Self-Driving Cars, with the layers and layer sizes shown above.
I reproduce the structure of the model again here; the table containing the layer summary was given in the previous section.
I have used the data provided by Udacity and have also used a different set of data. All the data is from track one. I preprocessed the data, first cropping the top 60 rows and the bottom 25 because they carry no information relevant to the task. Then I converted the images to YUV, applied a Gaussian blur, and resized them to fit the model input. Finally, I normalized the data to [0, 1]. The result looks like this:
Then I also used data augmentation. For this I applied:
- Zooming:
I zoomed an image in the range of 0 to 50% as in the image:
- Pan
I translated the image by up to 10% along both the x and y axes.
- Brightness
I changed the brightness of the image by multiplying it by a value sampled from the range (0.2, 1.2); darker images are therefore generated more often than brighter ones.
- Flipping
This augmentation is very important to make sure that the data is not skewed toward one direction (for example, toward the left).
When the image is flipped, the steering angle of course has to be flipped to its negative value.
I combined all these augmentations in a function (random_augment) in which an image can randomly be zoomed, panned, made darker or lighter, or flipped (a sketch is shown below). As a result I have:
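A minimal sketch of random_augment under these assumptions: each transform is applied with 50% probability, the ranges follow the description above, and the input is a uint8 RGB image. The notebook's actual implementation may differ.

```python
import random
import cv2
import numpy as np

def random_augment(image, steering):
    if random.random() < 0.5:                      # zoom (crop-and-resize) up to 50%
        h, w = image.shape[:2]
        scale = random.uniform(1.0, 1.5)
        nh, nw = int(h / scale), int(w / scale)
        y0, x0 = (h - nh) // 2, (w - nw) // 2
        image = cv2.resize(image[y0:y0 + nh, x0:x0 + nw], (w, h))
    if random.random() < 0.5:                      # pan up to 10% in x and y
        h, w = image.shape[:2]
        tx = w * random.uniform(-0.1, 0.1)
        ty = h * random.uniform(-0.1, 0.1)
        M = np.float32([[1, 0, tx], [0, 1, ty]])
        image = cv2.warpAffine(image, M, (w, h))
    if random.random() < 0.5:                      # brightness multiplier in (0.2, 1.2)
        image = np.clip(image * random.uniform(0.2, 1.2), 0, 255).astype(np.uint8)
    if random.random() < 0.5:                      # horizontal flip + negate the steering
        image = cv2.flip(image, 1)
        steering = -steering
    return image, steering
```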
Finally, I am going to describe the batch generation.
The data fed to the model is provided by a batch generator function, batch_generator.
The function distinguishes between training and validation use. For training, it generates a batch of data that is randomly augmented (as described above); for validation, it simply selects random samples from the original data. Each image is preprocessed before being output to the model.
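A minimal sketch of batch_generator, reusing the preprocess and random_augment sketches above; the image reader is an assumption:

```python
import random
import numpy as np
import matplotlib.image as mpimg

def batch_generator(image_paths, steering_angles, batch_size, is_training):
    while True:
        batch_images, batch_steerings = [], []
        for _ in range(batch_size):
            i = random.randint(0, len(image_paths) - 1)
            image = mpimg.imread(image_paths[i])
            steering = steering_angles[i]
            if is_training:
                # training batches are randomly augmented
                image, steering = random_augment(image, steering)
            # every image is preprocessed before being fed to the model
            batch_images.append(preprocess(image))
            batch_steerings.append(steering)
        yield np.asarray(batch_images), np.asarray(batch_steerings)
```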
An example is shown in the following picture: on the left, a batch (of size 1) of images for training; on the right, one for validation:
After training, I could also see the evolution of the loss for the training and validation sets in the following plot:
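For reference, the training run and the loss plot can be produced along these lines. This is only a sketch: the batch size, steps per epoch, and epoch count are illustrative assumptions, not the values used in the notebook.

```python
import matplotlib.pyplot as plt

history = model.fit(
    batch_generator(X_train, y_train, batch_size=100, is_training=True),
    steps_per_epoch=300,
    epochs=10,
    validation_data=batch_generator(X_valid, y_valid, batch_size=100, is_training=False),
    validation_steps=200,
)

# Plot the training and validation loss curves.
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss (MSE)')
plt.legend()
plt.show()
```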