Faces are an identifying feature of human anatomy. Strictly speaking, many animals also have faces, but that is less relevant for most practical applications. Face detection tries to find (rectangular) areas in an image that represent faces. Face detection is a type of object detection, because faces are a type of object.
Most face detection algorithms are good at detecting clean, front-facing faces because most training images fall in that category. Tilted faces, bright lights, or noisy images may cause problems for face detection. It is possible to deduce age, gender, or ethnicity (for instance, the presence of epicanthic folds) from a face, which of course is useful for marketing.
A possible application could be analyzing profile pictures on social media sites. OpenCV uses a system of Haar feature-based cascade classifiers to detect faces. The system is also named the Viola–Jones object detection framework after its inventors, who proposed it in 2001.
The algorithm has the following steps:
When we look at face images, we can create heuristics related to brightness.
For instance, the nose region is brighter than the regions directly to its left and right. Therefore, we can define a white rectangle covering the nose and black rectangles covering the neighboring areas. Of course, the Viola-Jones system doesn't know exactly where the nose is, but by defining windows of varying size and seeking corresponding white and black rectangles, there is a chance of matching a nose. The actual Haar features are defined as the difference between the sum of brightness in a black rectangle and the sum of brightness in a neighboring white rectangle. For a 24 x 24 window, we have more than 160 thousand features (on the order of 24 to the fourth power).
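To make the rectangle idea concrete, here is a minimal pure-Python sketch. It is not the OpenCV implementation. The helper names (rect_sum, three_rect_feature, count_features), the toy patch, and the sign convention (white minus black) are assumptions made for illustration. The feature count assumes the five rectangle templates commonly quoted for this framework (two-rectangle in both orientations, three-rectangle in both orientations, and four-rectangle).

```python
def rect_sum(img, x, y, w, h):
    """Sum of brightness values in the w x h rectangle with top-left (x, y)."""
    return sum(sum(row[x:x + w]) for row in img[y:y + h])


def three_rect_feature(img, x, y, w, h):
    """White centre rectangle minus the two black rectangles beside it.

    The sign convention (white minus black) is an assumption; some
    descriptions use black minus white instead.
    """
    left = rect_sum(img, x, y, w, h)
    centre = rect_sum(img, x + w, y, w, h)
    right = rect_sum(img, x + 2 * w, y, w, h)
    return centre - (left + right)


def count_features(window=24,
                   templates=((2, 1), (1, 2), (3, 1), (1, 3), (2, 2))):
    """Count every position and scale of each template inside the window.

    Each template is (width, height) in unit rectangles and may only be
    scaled by whole multiples of its base size.
    """
    total = 0
    for tw, th in templates:
        for w in range(tw, window + 1, tw):
            for h in range(th, window + 1, th):
                total += (window - w + 1) * (window - h + 1)
    return total


# Toy 4 x 6 patch: a bright "nose" band flanked by darker pixels.
patch = [[10, 10, 200, 200, 10, 10] for _ in range(4)]
nose_response = three_rect_feature(patch, 0, 0, 2, 4)  # large and positive
n_features = count_features()  # 162336 for a 24 x 24 window
```

A strongly positive response suggests a bright band between two darker ones, which is the pattern the nose heuristic looks for. Under the five-template assumption the count comes to 162,336, consistent with the "more than 160 thousand" figure.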
The training set consists of a huge collection of positive (with faces) images and negative (no faces) images. Only about 0.01% of the windows (on the order of 24 by 24 pixels) actually contain faces. The cascade of classifiers progressively filters out negative image areas stage by stage. In each successive stage, the classifiers use more features on fewer image windows. The idea is to spend the most time on image patches that contain faces. The original paper by Viola and Jones had 38 stages with 1, 10, 25, 25, and 50 features in the first five stages. On average, 10 features per image window were evaluated.
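The early-rejection idea can be sketched as follows. This is only an illustration. The run_cascade function, the dictionary-based "window", and the contrast thresholds are hypothetical stand-ins for real stage classifiers, which combine weighted Haar feature responses. Only the per-stage feature counts are taken from the text.

```python
def run_cascade(window, stages):
    """Return (is_face, features_evaluated) for one image window.

    Each stage is a (n_features, passes) pair. The window is rejected as
    soon as one stage's predicate returns False, so later (more
    expensive) stages are never evaluated for it.
    """
    evaluated = 0
    for n_features, passes in stages:
        evaluated += n_features
        if not passes(window):
            return False, evaluated
    return True, evaluated


# Feature counts follow the first five stages quoted in the text; the
# thresholds on a made-up "contrast" score are invented for illustration.
stages = [
    (1, lambda w: w["contrast"] > 0.1),
    (10, lambda w: w["contrast"] > 0.3),
    (25, lambda w: w["contrast"] > 0.5),
    (25, lambda w: w["contrast"] > 0.7),
    (50, lambda w: w["contrast"] > 0.9),
]

background = {"contrast": 0.05}  # rejected by the first stage
face_like = {"contrast": 0.95}   # passes all five stages

background_result = run_cascade(background, stages)
face_result = run_cascade(face_like, stages)
```

In this sketch, the background window costs a single feature evaluation, while the face-like window costs 111. Because the vast majority of windows are background, the average cost per window stays low.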
In OpenCV, you can train a cascade classifier yourself, as described in http://docs.opencv.org/3.0.0/dc/d88/tutorial_traincascade.html (retrieved December 2015). However, OpenCV has pre-trained classifiers for faces, eyes, and other features. The configuration for these classifiers is stored as XML files, which can be found in the folder where you installed OpenCV (on my machine, /usr/local/share/OpenCV/haarcascades/).
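If you are unsure which pre-trained classifiers your installation ships with, a small helper along these lines can list the XML files in that folder. The list_cascades function is a hypothetical convenience, not part of OpenCV. The directory is the example path from the text and will likely differ on your machine.

```python
import glob
import os


def list_cascades(folder):
    """Return the sorted base names of the XML files found in `folder`."""
    pattern = os.path.join(folder, '*.xml')
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


# Example path from the text; adjust it for your own installation.
cascade_dir = '/usr/local/share/OpenCV/haarcascades'
available = list_cascades(cascade_dir)  # empty list if the folder is absent
```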
import cv2
# Note: newer SciPy releases no longer ship lena(); an older SciPy is needed.
from scipy.misc import lena
import matplotlib.pyplot as plt
import numpy as np
import dautil as dl
import os
from IPython.display import HTML

def plot_with_rect(ax, img):
    # Draw a rectangle around every detected face on a copy of the image.
    img2 = img.copy()

    # Positional arguments: scaleFactor=1.3, minNeighbors=5.
    for x, y, w, h in face_cascade.detectMultiScale(img2, 1.3, 5):
        cv2.rectangle(img2, (x, y), (x + w, y + h), (255, 0, 0), 2)

    dl.plotting.img_show(ax, img2, cmap=plt.cm.gray)

# dir = '/usr/local/share/OpenCV/haarcascades/'
base = 'https://raw.githubusercontent.com/Itseez/opencv/master/data/'
url = base + 'haarcascades/haarcascade_frontalface_default.xml'
path = os.path.join(dl.data.get_data_dir(),
                    'haarcascade_frontalface_default.xml')

# Download the pre-trained classifier only if it is not cached yet.
if not dl.conf.file_exists(path):
    dl.data.download(url, path)

face_cascade = cv2.CascadeClassifier(path)

# `context` is not defined in this listing; it comes from the notebook.
sp = dl.plotting.Subplotter(2, 2, context)
img = lena().astype(np.uint8)
plot_with_rect(sp.ax, img)
sp.label()

# Rotate the image by 21 degrees around its center.
rows, cols = img.shape
mat = cv2.getRotationMatrix2D((cols/2, rows/2), 21, 1)
rot = cv2.warpAffine(img, mat, (cols, rows))
plot_with_rect(sp.next_ax(), rot)
sp.label()

# Set roughly 40% of the pixels to zero at random.
np.random.seed(36)
noisy = img * (np.random.random(img.shape) < 0.6)
plot_with_rect(sp.next_ax(), noisy)
sp.label()

# Blur the image with a 9 x 9 box filter.
blur = cv2.blur(img, (9, 9))
plot_with_rect(sp.next_ax(), blur)
sp.label()

HTML(sp.exit())
Refer to the following screenshot for the end result:
The code is in the detecting_faces.ipynb file in this book's code bundle.