Advanced Lane Finding Project
The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply a distortion correction to raw images.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find the lane boundary.
- Determine the curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
The first task is to calibrate the camera so that we can later obtain an undistorted view of the road. Calibration uses the sample chessboard images; please refer to the code in cell 2 of the IPython notebook 'Pipeline-Final.ipynb'. To calibrate the camera, first read in the chessboard images and find the corners in each image using cv2.findChessboardCorners(). Then, define 'objpoints' as points on a grid (a 2-D plane). Finally, apply cv2.calibrateCamera() to relate the distorted chessboard corners to the undistorted grid points (objpoints) on a plane. This process gives the camera calibration matrix and distortion coefficients needed to obtain an undistorted view of the road.
The images below show the distorted and undistorted views of a chessboard.
Use the calibration matrix and distortion coefficients obtained in the step above to correct the distortion. The images show the distorted and undistorted views of the road.
One of the main tasks of the pipeline is to obtain a binary image that we can use to identify lanes. To do this, I combined the gradient of the grayscale image with the gradient of the saturation component of the image to detect edges. The function gradient_threshold() (Pipeline-Final.ipynb, cell 4) performs this task. Appropriate thresholds are also applied to the gradients to retain only sharp edges. The figure below shows the thresholded binary image corresponding to the image shown above.
The pipeline uses two matrices for the perspective transform (TransM1 and TransM2). The matrices are obtained using cv2.getPerspectiveTransform() (refer to cell 5). TransM1 is used most of the time by the pipeline. TransM2 is used during the initial frames to get an estimate of the lane width. TransM2 provides a top view of the portion of the road that is very close to the camera, so we can clearly identify the lanes. TransM1, on the other hand, gives a top view of a longer stretch of the road ahead. These two matrices are obtained by manually mapping points on the lane to "estimated" top-view points, using the test images provided. The source and destination corners are shown in the notebook (refer to src_corners and dst_corners).
The images below show the perspective transform (using TransM1) for the road images shown above (the original image as well as the thresholded image).
Now we have the thresholded (and transformed) binary image, which clearly shows the lane lines as white pixels. The next task is to find the lane lines in this image. The pipeline uses two functions, findLanes_slidingwindow() and findLanes_extrapolate(), to fit lines to the transformed thresholded binary image.
- findLanes_slidingwindow() uses a sliding window technique, starting from the bottom of the image, to find non-zero pixels and fits a quadratic function to the left and right white pixels.
- findLanes_extrapolate() extrapolates the polynomial fits found for previous frames and tries to obtain a new fit for the current image.
These two functions are defined in cell 8 of Pipeline-Final.ipynb. The images below show the quadratic fits obtained by sliding window and extrapolation.
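The sliding window idea can be sketched as follows. This is a simplified stand-in for findLanes_slidingwindow(), not the notebook code: the function name, the margin and min_pixels values, and the use of a column histogram to pick the starting positions are assumptions.

```python
import numpy as np


def sliding_window_fit(binary, n_windows=12, margin=50, min_pixels=20):
    """Fit x = a*y**2 + b*y + c to the left and right lane pixels.

    binary is a top-view 0/1 image. Returns {"left": fit, "right": fit},
    where a fit is None if too few pixels were found.
    """
    height, width = binary.shape
    ys, xs = binary.nonzero()
    # Column sums of the lower half give the starting x for each lane.
    histogram = binary[height // 2:, :].sum(axis=0)
    midpoint = width // 2
    starts = {
        "left": int(np.argmax(histogram[:midpoint])),
        "right": int(np.argmax(histogram[midpoint:])) + midpoint,
    }
    window_height = height // n_windows
    fits = {}
    for side, x_current in starts.items():
        selected = []
        for w in range(n_windows):  # walk from the bottom of the image up
            y_high = height - w * window_height
            y_low = y_high - window_height
            inside = np.flatnonzero(
                (ys >= y_low) & (ys < y_high)
                & (xs >= x_current - margin) & (xs < x_current + margin))
            selected.append(inside)
            if inside.size >= min_pixels:  # re-centre the next window
                x_current = int(xs[inside].mean())
        idx = np.concatenate(selected)
        fits[side] = np.polyfit(ys[idx], xs[idx], 2) if idx.size >= 3 else None
    return fits
```

The fit is expressed as x in terms of y because lane lines are close to vertical in the top view, so x is a well-defined function of y but not the other way round.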
The pipeline uses three measures to assess goodness of fit:
- Confidence value - a measure of the number of points used for the fit. For example, for the image shown above on the left, the confidence values for the left and right lanes are 1 and 0.5 respectively. This is because all boxes on the left side are filled with pixels, but only 6 out of 12 boxes on the right side are filled.
- r2 value - this measures the mean squared error of the fit. Sometimes, even if we have fewer points to fit, we can get a very good quadratic fit. These fits will have low error.
- Parallelism - this measures the degree of parallelism between the left and right fits.
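One possible way to compute these three measures is sketched below. Only the confidence definition (filled boxes divided by total boxes) follows directly from the example above; the exact error formula and the parallelism metric (spread of the horizontal gap between the two fits) are assumptions, and all function names are hypothetical.

```python
import numpy as np


def confidence(windows_with_pixels, n_windows=12):
    """Fraction of sliding windows that contained lane pixels."""
    return windows_with_pixels / n_windows


def fit_mse(fit, ys, xs):
    """Mean squared horizontal error between the fit and the pixels."""
    residuals = np.polyval(fit, ys) - xs
    return float(np.mean(residuals ** 2))


def parallelism(left_fit, right_fit, ys):
    """Standard deviation of the gap between the fits, in pixels.

    0 means the two curves are perfectly parallel; larger is worse.
    """
    gap = np.polyval(right_fit, ys) - np.polyval(left_fit, ys)
    return float(np.std(gap))
```

For instance, confidence(6, 12) gives 0.5, matching the right lane in the example above.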
**Algorithm** (refer to lines 264-397):
- Initially, the pipeline uses the sliding window technique (findLanes_slidingwindow) to find the lanes and stores them in the line objects. For these initial frames, the top-view transformation uses the view very close to the camera (TransM2), so that both lanes can be detected accurately. During this process, the lane width is also calculated and stored.
- Later on, the pipeline uses extrapolation (findLanes_extrapolate) to extend the previously fitted lines. If these lines are not good, it tries the sliding window again.
- Once we have polynomial fits for the left and right lanes for a given image, the pipeline selects one or both of them using the quality measures discussed earlier. If both fits are good, it uses both. In some cases, only one of the two lanes may be accurately fitted. In this case, the pipeline uses the good fit as a reference to calculate the fit for the other lane. The estimated lane width is used to get this fit. In other cases, we may not have a good fit for either lane; the pipeline then uses the average fit for both lanes.
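The selection step in the last bullet could look roughly like this. It is a sketch under assumptions: the text does not say how the other lane is derived from the good one, so here the good fit is simply shifted sideways by the lane width (changing only the constant term), and the "average fit" is assumed to be an average over recent frames that the caller passes in. The function names are hypothetical.

```python
import numpy as np


def shift_fit(fit, offset_px):
    """Shift a quadratic x = a*y**2 + b*y + c sideways by offset_px."""
    shifted = np.array(fit, dtype=float)
    shifted[2] += offset_px  # only the constant term changes
    return shifted


def select_fits(left_fit, right_fit, left_ok, right_ok,
                lane_width_px, avg_left, avg_right):
    """Pick the fits to use for the current frame.

    left_ok / right_ok are the outcomes of the quality checks.
    """
    if left_ok and right_ok:
        return left_fit, right_fit
    if left_ok:  # rebuild the right lane from the left one
        return left_fit, shift_fit(left_fit, lane_width_px)
    if right_ok:  # rebuild the left lane from the right one
        return shift_fit(right_fit, -lane_width_px), right_fit
    return avg_left, avg_right  # neither is good: fall back to averages
```

Shifting only the constant term keeps the two curves parallel in the top view, which is a reasonable approximation when the lane width is constant.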
The functions findLanes_slidingwindow and findLanes_extrapolate also calculate the radius of curvature, around lines 100-114 and 220-230 of cell 8.
The following parameters were chosen for scaling in the x and y directions.
ym_per_pix = 30/720 (the top view covers three white lane markings with two spaces between them: 2 * 30 ft + 3 * 10 ft ~ 100 ft ~ 30 m)
xm_per_pix = 3.7/600 (the lane width was 600 pixels after the top-view transformation)
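With these scale factors, the radius of curvature of a fit x = A*y^2 + B*y + C (in meters) at a given y is R = (1 + (2*A*y + B)^2)^(3/2) / |2*A|. The sketch below refits the lane pixels in metric units and evaluates this formula; it also estimates the vehicle offset from the lane center. The function names are hypothetical, and the offset calculation assumes the camera is mounted at the center of the car.

```python
import numpy as np

YM_PER_PIX = 30 / 720   # meters per pixel in y (from the text above)
XM_PER_PIX = 3.7 / 600  # meters per pixel in x (from the text above)


def radius_of_curvature_m(ys_px, xs_px, y_eval_px):
    """Radius of curvature in meters at pixel row y_eval_px."""
    # Refit in metric space so the coefficients are in meters.
    a, b, _ = np.polyfit(ys_px * YM_PER_PIX, xs_px * XM_PER_PIX, 2)
    y_m = y_eval_px * YM_PER_PIX
    return (1 + (2 * a * y_m + b) ** 2) ** 1.5 / abs(2 * a)


def center_offset_m(left_fit_px, right_fit_px, image_width_px, y_eval_px):
    """Offset of the image center from the lane center, in meters.

    Positive means the car is to the right of the lane center.
    """
    lane_center = (np.polyval(left_fit_px, y_eval_px)
                   + np.polyval(right_fit_px, y_eval_px)) / 2
    return (image_width_px / 2 - lane_center) * XM_PER_PIX
```

Both quantities are usually evaluated at the bottom row of the image, which is the part of the road closest to the car.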
The final output from the pipeline looks like the images shown below, where the lane area is identified and shaded in green. The two images below are sample frames from project_video.mp4. We can see that the pipeline accurately detects both lanes in these frames.
The link to the video is below. The pipeline finds a good fit most of the time. The estimated radius of curvature is also shown on the video. The left and right radii of curvature are close to 1000 m during curves. The video also shows the deviation from the center. Most of the time, the car seems to stay to the right of center.
There were a couple of challenges:
- The first challenge was to get a good thresholded binary image that accurately shows the lanes in the project video as well as the challenge video. The gradient_threshold function did a good job on the project video, but not on the challenge video. I tried several things here, such as: 1) adjusting the thresholds, 2) improving image contrast (via histogram equalization), and 3) detecting yellow/white pixels and converting all other pixels in the image to black, so that the lines are more visible. None of them performed well. So, I used the quality measures to drop bad fits, while using the good fits to estimate the bad ones.
- The second challenge was to set suitable thresholds for the quality measures defined in Section 4. I tried various values before choosing them. Even so, the pipeline did not perform very well on the challenge video (especially in the beginning). The main issue in the challenge video seems to be the "cracks" near the lanes, which make it hard to find the lanes. I believe a lot more image processing (such as thresholding and shadow removal) is needed.