Object Detection and Instance Segmentation with CNN

So far in this book, we have mostly used convolutional neural networks (CNNs) for classification. Classification assigns the whole image to a single class: the one whose entity has the highest probability of being present in the image. But what if the image contains not one but several entities of interest, and we want the image associated with all of them? One way to do this is to use tags instead of classes, where the tags are all the classes of the final Softmax classification layer whose probability is above a given threshold. However, the probability of detection varies widely with the size and placement of the entity, which raises the question: how confident is the model that the identified entity is the one it claims? What if we are very confident that there is an entity, say a dog, in the image, but its scale and position are not as prominent as those of its owner, a Person entity? So, multi-class tagging is a valid approach, but not the best one for this purpose.
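The threshold-based tagging idea described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not code from this book: the labels, the raw scores, and the helper names softmax and tags_above_threshold are all hypothetical, chosen to mimic an image in which a person dominates the frame and a dog appears small.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def tags_above_threshold(labels, probs, threshold):
    """Return every (label, probability) pair whose probability reaches the threshold."""
    return [(label, p) for label, p in zip(labels, probs) if p >= threshold]

# Hypothetical labels and scores, invented for illustration only.
labels = ["person", "dog", "car"]
logits = [4.0, 2.0, -1.0]  # the person dominates the frame; the dog is small

probs = softmax(logits)
tags = tags_above_threshold(labels, probs, threshold=0.3)
tag_names = [label for label, _ in tags]
print(tag_names)
```

With these made-up scores, the dog receives a probability of about 0.12 and is dropped at a threshold of 0.3, even though it is clearly present in the imagined scene. Because Softmax forces the classes to compete for a total probability of 1, a prominent entity can push a smaller one below the cut-off, which is the weakness of tagging that the paragraph above points out.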

In this chapter, we will cover the following topics:

The differences between object detection and image classification
Traditional, non-CNN approaches for object detection
Region-based CNN and its features
Fast R-CNN
Faster R-CNN
Mask R-CNN
