This is the second project of Udacity's Self-Driving Car Nanodegree Program. The goal is to write a software pipeline that identifies the road lane boundaries in a video. The pipeline minimizes the effect of lighting changes, and the lane can be either straight or curved.
Advanced Lane Finding Project
The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply a distortion correction to raw images.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find the lane boundary.
- Determine the curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
You can check /my_code/main.py for the source code, output_image for the test-image output, and output_video for the video output.
Rubric Points
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
The function calibrate_camera computes the calibration and distortion coefficients using all the chessboard images provided in /camera_cal. It uses the "object points" that I specify for the chessboard grid and the "image points" detected by OpenCV. The function cv2.calibrateCamera() then computes the matrix and coefficients we need. These can be used to undistort the image, and a simple example is shown below:
The binary image is generated by the function create_binary_image. This is the most critical step. I started with a combination of x-gradient and S-channel thresholding as demonstrated in the lecture. It works quite well for all the given test images; however, the detection in the video looks quite unstable even with smoothing.
Most of those unstable detections are caused by stains on the road, whose x-gradient magnitude is comparable to that of the lane edges. To overcome this, a combination of the RGB and HLS color spaces is used to make sure that a marked pixel is actually yellow or white. The idea of limiting the gradient direction, which I proposed in project 1, is also introduced to further refine the detection. Therefore, a pixel in the binary image is set to 1 if:
1. It is in the region of interest (defined by polygon mask)
2. It is either white or yellow (the S channel in HLS and the R and G channels in RGB are used)
3. The direction of the gradient is within some angle of the horizontal line
4. The x-gradient is large enough
sobel_mask computes the x-gradient mask, dir_mask computes the gradient-direction mask, and create_binary_image generates the binary image. A sample binary image is given below:
To generate the transform matrix (and its inverse), I first plot the undistorted image and manually pick four source points; matplotlib gives me the pixel coordinates. These, together with the coordinates of the destination points, are provided to the function cv2.getPerspectiveTransform. The function perspective_transform does the calculation for you, and below is a bird's-eye view of the detected lanes.
Two methods are provided to find the lane pixels.
If the previous detection fails or is not available, we use a sliding-window approach. We start from the bottom half of the image and estimate the base positions of the left and right lanes; this is done by finding the maximum element in the last row of the binary image. Then the pixels inside the sliding window are marked as lane pixels, and the window position is updated before moving up.
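The sliding-window search can be sketched as follows; this version locates the lane bases with the common bottom-half histogram variant, and the window count, margin, and recentering threshold are illustrative:

```python
import numpy as np

def find_lane_sliding_window(binary, nwindows=9, margin=100, minpix=50):
    """Sliding-window search over a warped binary image; returns the
    second-order polynomial fits x = a*y**2 + b*y + c for both lanes."""
    h, w = binary.shape
    # lane base positions from a histogram of the bottom half of the image
    histogram = binary[h // 2:, :].sum(axis=0)
    midpoint = w // 2
    leftx = int(np.argmax(histogram[:midpoint]))
    rightx = int(np.argmax(histogram[midpoint:])) + midpoint
    nonzeroy, nonzerox = binary.nonzero()
    win_h = h // nwindows
    left_idx, right_idx = [], []
    for win in range(nwindows):
        y_lo, y_hi = h - (win + 1) * win_h, h - win * win_h
        for base, idx in ((leftx, left_idx), (rightx, right_idx)):
            good = ((nonzeroy >= y_lo) & (nonzeroy < y_hi) &
                    (nonzerox >= base - margin) & (nonzerox < base + margin))
            idx.append(good.nonzero()[0])
        # recenter each window on the mean x of the pixels it captured
        if len(left_idx[-1]) > minpix:
            leftx = int(nonzerox[left_idx[-1]].mean())
        if len(right_idx[-1]) > minpix:
            rightx = int(nonzerox[right_idx[-1]].mean())
    left_idx = np.concatenate(left_idx)
    right_idx = np.concatenate(right_idx)
    left_fit = np.polyfit(nonzeroy[left_idx], nonzerox[left_idx], 2)
    right_fit = np.polyfit(nonzeroy[right_idx], nonzerox[right_idx], 2)
    return left_fit, right_fit
```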
The sliding window is accurate but slow. When the previous detection is available, we can instead simply search around the lanes detected in the previous frame. This has proven to be 30-40% faster on my machine and is less sensitive to lighting changes.
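A minimal sketch of the search-around-prior idea, with an assumed pixel-count threshold for declaring failure:

```python
import numpy as np

def find_lane_from_prior(binary, prev_fit, margin=100, minpix=50):
    """Search only within +/- margin pixels of the previous frame's fit;
    return None to signal that the fast path failed."""
    nonzeroy, nonzerox = binary.nonzero()
    prev_x = np.polyval(prev_fit, nonzeroy)
    keep = np.abs(nonzerox - prev_x) < margin
    if keep.sum() < minpix:
        return None  # caller should fall back to the sliding-window search
    return np.polyfit(nonzeroy[keep], nonzerox[keep], 2)
```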
find_lane_sliding_window and find_lane_from_prior implement these two approaches. After collecting all the pixels, we simply fit a second-order polynomial for each of the left and right lanes. Below is the lane found by this method:
Once the profiles of the left and right lanes are found, it is straightforward to compute the curvature and the car position. The calculation is done in the curvature_and_position function.
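For a fit x = a*y^2 + b*y + c, the radius of curvature is R = (1 + (2ay + b)^2)^(3/2) / |2a|. A sketch of the calculation, assuming typical pixel-to-metre scales rather than the exact constants in main.py:

```python
import numpy as np

# Assumed pixel-to-metre scales (typical US lane geometry)
YM_PER_PIX = 30 / 720   # metres per pixel in the y direction
XM_PER_PIX = 3.7 / 700  # metres per pixel in the x direction

def curvature_and_position(left_fit, right_fit, y_eval, img_width):
    """Return (left radius [m], right radius [m], offset from centre [m])
    given pixel-space fits x = a*y**2 + b*y + c."""
    def radius(fit):
        # rescale the pixel-space coefficients into metres, then apply
        # R = (1 + (2*a*y + b)**2)**1.5 / |2*a|
        a = fit[0] * XM_PER_PIX / YM_PER_PIX ** 2
        b = fit[1] * XM_PER_PIX / YM_PER_PIX
        y = y_eval * YM_PER_PIX
        return (1 + (2 * a * y + b) ** 2) ** 1.5 / abs(2 * a)
    left_x = np.polyval(left_fit, y_eval)
    right_x = np.polyval(right_fit, y_eval)
    # positive offset: the car (image centre) sits right of the lane centre
    offset = (img_width / 2 - (left_x + right_x) / 2) * XM_PER_PIX
    return radius(left_fit), radius(right_fit), offset
```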
The function process_image combines all the previous steps and draws the detected lane on the original image. An example is shown here:
To improve the efficiency of the algorithm, a helper class Line is created to switch between the sliding-window method and the faster approach. Smoothing by averaging was also tried, but it doesn't help in my case, probably because the additional thresholding (in RGB color space and gradient direction) is already good enough to make stable predictions.
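A hypothetical minimal version of such a switching class (the real Line in main.py keeps more state):

```python
class Line:
    """Track detection state to choose between the two search methods:
    retry the full sliding-window search whenever the fast
    prior-based search fails or no prior fit exists yet."""
    def __init__(self):
        self.detected = False
        self.fit = None

    def update(self, binary, sliding_window_fn, from_prior_fn):
        if self.detected and self.fit is not None:
            fit = from_prior_fn(binary, self.fit)  # fast search around last fit
        else:
            fit = sliding_window_fn(binary)        # full sliding-window search
        self.detected = fit is not None
        if self.detected:
            self.fit = fit
        return self.fit
```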
Here's a link to my video result
Two issues were encountered and one has been resolved. Both are discussed below.
When there is any shadow or stain on the road, the detected lane profile is not reasonable. As explained above, this is caused by the fact that the lecture only checks the x-gradient of the S channel. From my testing, that gradient is not always zero (contrary to what is claimed in the lecture) and can affect the detection adversely. The solution is to also check the pixel color in RGB space, and it works quite well.
The algorithm above works okay for the challenge_video but fails in the harder_challenge_video. The root cause is that the sliding-window approach doesn't always work there. Interestingly, it is still able to predict one lane (either left or right). So one improvement I can think of is to use the distance between the left and right lanes, since the lane width is known in most cases. To do this, we would first define a cost function that scores the correctness of the prediction for each lane. We then accept the lane prediction with the higher score, reject the other one, and use the known lane width to do a refined search for the rejected lane. This should improve the detection considerably when only one prediction is reliable. However, I didn't have time to test it, as it requires restructuring all the helper functions.