How does deep learning become a state-of-the-art solution?

As we can see, robust object recognition is critical to realizing driving autonomy. To avoid accidents and ensure safety, a vehicle must stay attentive to its surrounding environment, traffic signs, and lights. Generally speaking, object recognition in self-driving cars can be broken down into four tasks:

Object detection, covering obstacles, pedestrians, traffic signs, and lights.
Object identification and classification. Examples include labeling traffic lights (red, yellow, green, or off) detected in images captured by the front camera, categorizing traffic participants as bicycles, motorcycles, cars, trucks, or buses, and of course classifying traffic signs (our main talking point in this chapter).
Object localization, which maps ground-level images to aerial imagery.
Movement prediction, for example, estimating the speed of an object or inferring the behavior and intention of a pedestrian from their pose.

Over the past two decades, a variety of machine learning algorithms have been applied to object recognition problems in intelligent vehicles.

For example, in Detecting Pedestrians Using Patterns of Motion and Appearance (Viola et al., published in the International Journal of Computer Vision, 63(2)), an AdaBoost classifier was employed to detect walking pedestrians. AdaBoost, short for Adaptive Boosting, trains weak classifiers sequentially, with each one focusing on the examples its predecessors misclassified.
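To make the sequential error correction concrete, here is a minimal sketch of AdaBoost with one-feature threshold rules (decision stumps) as the weak classifiers. It is not the detector from the paper: the function names, the toy one-dimensional data, and the choice of three boosting rounds are all assumptions made for illustration.

```python
import numpy as np

def stump_predict(X, feature, threshold, sign):
    # A decision stump: predict `sign` on one side of the threshold, `-sign` on the other.
    return np.where(X[:, feature] <= threshold, sign, -sign)

def fit_stump(X, y, weights):
    # Exhaustively search for the stump with the lowest weighted error.
    best = None
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for sign in (1, -1):
                pred = stump_predict(X, feature, threshold, sign)
                err = weights[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost_fit(X, y, n_rounds=3):
    weights = np.full(len(y), 1.0 / len(y))   # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, feature, threshold, sign = fit_stump(X, y, weights)
        err = np.clip(err, 1e-10, 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err) # vote strength of this stump
        pred = stump_predict(X, feature, threshold, sign)
        # Increase the weight of misclassified samples, decrease the rest.
        weights = weights * np.exp(-alpha * y * pred)
        weights /= weights.sum()
        ensemble.append((alpha, feature, threshold, sign))
    return ensemble

def adaboost_predict(X, ensemble):
    # Weighted vote of all stumps.
    score = sum(a * stump_predict(X, f, t, s) for a, f, t, s in ensemble)
    return np.sign(score)

# Toy data (hypothetical): no single threshold separates the two classes.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1, 1, -1, -1, 1, 1])
ensemble = adaboost_fit(X, y, n_rounds=3)
predictions = adaboost_predict(X, ensemble)
```

On this toy data, a single stump cannot classify every point, whereas the weighted vote of three stumps can, because each round reweights the samples that the earlier stumps got wrong.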

In Histograms of Oriented Gradients for Human Detection (Dalal and Triggs, published in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in 2005), efficient features were extracted using the histogram of oriented gradients (HOG) technique and fed into a support vector machine (SVM) classifier for human detection. Since then, more sophisticated variants such as gradient field HOG (GF-HOG) and other more complex feature extraction methods have been developed. Examples include zoning + projection, projection + HOG, and so on.
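The core idea behind HOG can be sketched in a few lines of NumPy: compute image gradients, then build a magnitude-weighted histogram of gradient orientations for each cell. The sketch below is a simplified illustration rather than the method from the paper; it omits the overlapping block normalization used there, and the function name, the 8-pixel cell size, and the 9 bins are assumptions. The resulting vector is what would be passed to a classifier such as an SVM.

```python
import numpy as np

def hog_descriptor(image, cell_size=8, n_bins=9):
    """Simplified HOG: per-cell histograms of unsigned gradient orientation."""
    image = image.astype(float)
    gy, gx = np.gradient(image)                        # gradients along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0     # unsigned orientation in degrees

    rows, cols = image.shape
    histograms = []
    for r in range(0, rows - cell_size + 1, cell_size):
        for c in range(0, cols - cell_size + 1, cell_size):
            cell = (slice(r, r + cell_size), slice(c, c + cell_size))
            hist, _ = np.histogram(
                angle[cell],
                bins=n_bins,
                range=(0.0, 180.0),
                weights=magnitude[cell],               # strong edges count more
            )
            histograms.append(hist)

    descriptor = np.concatenate(histograms)
    # Single global L2 normalization (a simplification of block normalization).
    return descriptor / (np.linalg.norm(descriptor) + 1e-6)

# Toy images (hypothetical): one vertical edge and one horizontal edge.
vertical_edge = np.zeros((8, 8))
vertical_edge[:, 4:] = 1.0
horizontal_edge = vertical_edge.T

d_vertical = hog_descriptor(vertical_edge)
d_horizontal = hog_descriptor(horizontal_edge)
```

The two toy images produce histograms that peak in different orientation bins, which is the kind of hand-crafted signal a downstream classifier relies on.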

Conventional object recognition approaches (explicit feature extraction + machine learning classification) rely heavily on hand-crafted features, such as gradient orientation histograms with HOG, or local keypoints with Speeded-Up Robust Features (SURF) or Scale-Invariant Feature Transform (SIFT). Although they perform well in certain tasks, designing these feature descriptors is difficult and requires a lot of manual tweaking and experimentation.

Recall that in the previous chapter on classifying handwritten digits, we resorted to a CNN. It first derives low-level representations, such as local edges and curves, and then composes higher-level representations, such as overall shape and contour, from a series of these low-level representations. We also concluded that CNNs are well suited to extracting strong and distinctive features.
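As a reminder of what a low-level representation looks like, the following sketch slides a single 3 x 3 filter over a tiny image and applies a ReLU. A fixed Sobel kernel is used here purely as a stand-in: in a real CNN the filter values are learned during training, and the helper name and toy image are assumptions for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding.

    This is cross-correlation (no kernel flip), which is what most
    deep learning libraries compute in their convolutional layers.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hand-picked filter that responds to dark-to-bright transitions from left to right.
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

# Toy 6 x 6 image (hypothetical): dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Convolution followed by ReLU, as in a typical first CNN layer.
feature_map = np.maximum(conv2d_valid(image, sobel_x), 0.0)
```

The feature map is non-zero only in the columns that straddle the edge, so later layers can combine many such responses into shapes and contours.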

In fact, many solutions have demonstrated that CNNs can automate feature extraction efficiently while significantly boosting performance. For example, in Pedestrian Detection with Unsupervised Multi-Stage Feature Learning (Sermanet et al., published in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in 2013), CNNs were applied to the pedestrian detection problem. In Rich feature hierarchies for accurate object detection and semantic segmentation (Girshick et al., published in CVPR in 2014), a region-based CNN variant (R-CNN) was proposed to improve performance. Nowadays, many state-of-the-art object recognition approaches involve deep learning techniques, and CNNs specifically. A good testimony is their prevalence in the top positions on the leaderboard of the KITTI Vision Benchmark for autonomous cars (http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d).

I hope all of these cases have excited you about CNNs and their power to provide better object recognition solutions for intelligent vehicles.

So what are we waiting for? Let's proceed with our project, traffic sign recognition, as it is one of the most important topics in autonomous cars!
