Understanding the Vision framework

The Vision framework handles a wide range of computer vision tasks. It builds on several deep learning techniques to provide face detection, text recognition, barcode detection, and more. When you use Vision to detect faces, you get much more than the location of each face in an image: the framework can also locate facial landmarks such as the eyes, nose, and mouth. All of this is possible because Apple uses deep learning extensively behind the scenes.

For most tasks, using Vision consists of three stages:

  1. You create a request that specifies what you want. For instance, a VNDetectFaceLandmarksRequest to detect facial features.
  2. You set up a handler for the image you want to analyze and ask it to perform the request.
  3. You read the resulting observations, which contain the information you need.

The following code sample illustrates how you might find facial landmarks in an image:

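import Vision

// `image` is assumed to be a CGImage that you have already loaded.
let handler = VNImageRequestHandler(cgImage: image, options: [:])
let request = VNDetectFaceLandmarksRequest(completionHandler: { request, error in
  // Each VNFaceObservation describes one detected face.
  guard let results = request.results as? [VNFaceObservation]
    else { return }

  for result in results {
    // Skip faces for which no landmarks are available.
    guard let landmarks = result.landmarks else { continue }

    if let faceContour = landmarks.faceContour {
      print(faceContour.normalizedPoints)
    }

    if let leftEye = landmarks.leftEye {
      print(leftEye.normalizedPoints)
    }

    // etc
  }
})

do {
  try handler.perform([request])
} catch {
  // Report the failure instead of silently discarding it.
  print("Vision request failed: \(error)")
}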

For something as complex as detecting the contour of a face or the exact location of an eye, the code is quite simple. You set up a handler and a request, and then ask the handler to perform the request. Because perform(_:) takes an array of requests, you can run several requests on a single image in one call.
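
To make the point about several requests concrete, here is a minimal sketch that runs a face rectangle request and a barcode request against the same image in one call. The ImageFindings type and the analyze(_:) function are hypothetical names introduced for this illustration only; the Vision classes are the framework's own. The sketch assumes that you already have a CGImage and that reading the results directly after perform(_:) returns is sufficient for your needs:

```swift
import Vision
import CoreGraphics

// Hypothetical container for this example; not part of Vision.
struct ImageFindings {
  var faceBoxes: [CGRect] = []
  var barcodePayloads: [String] = []
}

// Hypothetical helper: runs two Vision requests on one image.
func analyze(_ image: CGImage) throws -> ImageFindings {
  let faceRequest = VNDetectFaceRectanglesRequest()
  let barcodeRequest = VNDetectBarcodesRequest()

  // One handler, one image, several requests.
  let handler = VNImageRequestHandler(cgImage: image, options: [:])
  try handler.perform([faceRequest, barcodeRequest])

  // perform(_:) is synchronous, so each request's results are
  // available once it returns.
  var findings = ImageFindings()

  let faces = faceRequest.results as? [VNFaceObservation] ?? []
  // Bounding boxes are normalized to the range 0...1.
  findings.faceBoxes = faces.map { $0.boundingBox }

  let barcodes = barcodeRequest.results as? [VNBarcodeObservation] ?? []
  // Keep only barcodes whose payload can be read as a string.
  findings.barcodePayloads = barcodes.compactMap { $0.payloadStringValue }

  return findings
}
```

Each request keeps its own results, so you can mix unrelated detectors without them interfering with each other. Whether you read the results after perform(_:) or in a completion handler, as in the earlier sample, is mostly a matter of taste.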

In addition to enabling computer vision tasks like this, the Vision framework also integrates tightly with Core ML. Let's see just how tight this integration is by adding an image classifier to the augmented reality gallery app you have been working on!
