Object Detection and Instance Segmentation with CNN

Until now, this book has mostly used convolutional neural networks (CNNs) for classification. Classification assigns the whole image to a single class: the one whose entity has the highest predicted probability in the image. But what if the image contains not one but several entities of interest, and we want the image associated with all of them? One way to do this is to use tags instead of classes, where the tags are all the classes whose probability at the final Softmax classification layer is above a given threshold. However, the probability of detection here varies widely with the size and placement of the entity, which raises the question: how confident is the model that the identified entity is the one that is claimed? What if we are very confident that there is an entity, say a dog, in the image, but its scale and position are not as prominent as those of its owner, a Person entity? So, multi-class tagging is a valid approach, but not the best one for this purpose.
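
To make the tagging idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from this book: the class names, the raw scores, the threshold values, and the helper names softmax and tags_above_threshold are all assumptions made up for this example.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def tags_above_threshold(logits, class_names, threshold=0.2):
    """Return (name, probability) pairs whose probability exceeds threshold."""
    probs = softmax(logits)
    return [(name, p) for name, p in zip(class_names, probs) if p > threshold]

# Hypothetical scores for one image: a prominent person and a small dog.
class_names = ["person", "dog", "cat", "car"]
logits = [3.0, 1.0, -1.0, -2.0]

tags = tags_above_threshold(logits, class_names, threshold=0.2)
loose_tags = tags_above_threshold(logits, class_names, threshold=0.1)

print([name for name, _ in tags])
print([name for name, _ in loose_tags])
```

With these made-up scores, the dog falls below the 0.2 threshold and is only tagged once the threshold is lowered to 0.1. Because Softmax probabilities compete for a total of 1, a less prominent entity can be pushed under the threshold by a dominant one, which is the weakness of tagging described above.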

In this chapter, we will cover the following topics:

  • The differences between object detection and image classification
  • Traditional, non-CNN approaches for object detection
  • Region-based CNN and its features
  • Fast R-CNN
  • Faster R-CNN
  • Mask R-CNN