Implementing an image classifier

The code bundle for this chapter contains an app called ImageAnalyzer. This app uses an image picker that lets a user select an image from their photo library, and the selected image is used as the input for the image classifier you will implement. Open the project and explore it for a little bit to see what it does and how it works. Use the starter project if you want to follow along with the rest of this section.
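In case you are curious how the selected image typically reaches the classifier, the wiring usually looks something like the following sketch. Note that the delegate method shown here and the selectedImage outlet are assumptions for illustration; the starter project may organize this differently:

// A minimal sketch of how a picked image could be forwarded to the classifier.
// The exact wiring in the starter project may differ.
extension ViewController: UIImagePickerControllerDelegate, UINavigationControllerDelegate {
  func imagePickerController(_ picker: UIImagePickerController,
                             didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey: Any]) {
    picker.dismiss(animated: true)

    // Grab the image the user selected and hand it to the classifier.
    guard let image = info[.originalImage] as? UIImage else { return }
    selectedImage.image = image // hypothetical image view outlet
    analyzeImage(image)
  }
}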

To add an image classifier, you need to have a CoreML model that can classify images. On Apple's machine learning website (https://developer.apple.com/machine-learning/build-run-models/) there are several models available that can do image classification. An excellent lightweight model you can use is the MobileNet model; go ahead and download it from the machine learning page. Once you have downloaded the model, drag the model into Xcode to add it to the ImageAnalyzer project. Make sure to add it to your app target so that Xcode can generate the class interface for the model.

After adding the model to Xcode, you can open it to examine the Model Evaluation Parameters. These parameters tell you which types of inputs the model expects and which outputs it provides. In the case of MobileNet, the input should be an image that is 224 pixels wide and 224 pixels high, as shown in the following screenshot:
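If you want to double-check these parameters without the Xcode model viewer, you can also inspect them in code. The following is a small sketch that assumes the generated class is called MobileNet, matching the file you just added:

// Print the model's input and output feature descriptions.
// This mirrors the information shown in Xcode's model viewer.
let model = MobileNet().model
print(model.modelDescription.inputDescriptionsByName)   // the image input, 224 x 224
print(model.modelDescription.outputDescriptionsByName)  // a class label and a probability dictionary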

Once Xcode has generated the model class, the code needed to use the model is very similar to the code you used earlier to detect facial features with Vision. The most significant difference is the type of request that is used: a special VNCoreMLRequest that takes the CoreML model you want to use, in addition to a completion handler.

When you combine CoreML and Vision, Vision takes care of scaling the image and converting it to a type that is compatible with the CoreML model. You should, however, make sure that the input image has the correct orientation; if the image is rotated unexpectedly, the model might not be able to analyze it correctly.
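If your app can receive images in arbitrary orientations, you can pass the orientation to Vision explicitly instead of relying on the default. The following sketch assumes a UIImage as the input; because UIImage.Orientation and CGImagePropertyOrientation are different types, they have to be mapped by hand:

// A sketch of passing the image orientation to Vision explicitly.
// UIImage.Orientation and CGImagePropertyOrientation use different raw values,
// so the cases are mapped one by one.
func cgOrientation(from orientation: UIImage.Orientation) -> CGImagePropertyOrientation {
  switch orientation {
  case .up: return .up
  case .down: return .down
  case .left: return .left
  case .right: return .right
  case .upMirrored: return .upMirrored
  case .downMirrored: return .downMirrored
  case .leftMirrored: return .leftMirrored
  case .rightMirrored: return .rightMirrored
  @unknown default: return .up
  }
}

// Used when creating the request handler:
// let handler = VNImageRequestHandler(cgImage: cgImage,
//                                     orientation: cgOrientation(from: image.imageOrientation),
//                                     options: [:])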

Add the following implementation for analyzeImage(_:) to the ViewController class in the ImageAnalyzer project:

func analyzeImage(_ image: UIImage) {
  guard let cgImage = image.cgImage,
    let classifier = try? VNCoreMLModel(for: MobileNet().model)
    else { return }

  let request = VNCoreMLRequest(model: classifier, completionHandler: { [weak self] request, error in
    guard let classifications = request.results as? [VNClassificationObservation],
      let prediction = classifications.first
      else { return }

    DispatchQueue.main.async {
      self?.objectDescription.text = "\(prediction.identifier) (\(round(prediction.confidence * 100))% confidence)"
    }
  })

  let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])

  try? handler.perform([request])
}

The previous method takes a UIImage and converts it to a CGImage, and a VNCoreMLModel is created based on the MobileNet model. This model class wraps the CoreML model so that it works seamlessly with Vision. The request itself is very similar to the requests you have seen before. In the completion handler, the results are cast to an array of VNClassificationObservation instances, and the first, most confident prediction is extracted and shown to the user. Every prediction made by the classifier has a label, stored in the identifier property, and a confidence rating with a value between 0 and 1, stored in the confidence property. Note that the value of the description label is set on the main thread to avoid crashes, because the completion handler is not guaranteed to be called on the main thread.
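If a single label is not informative enough, you could show the top few classifications instead of only the first one. The following is a sketch of an alternative body for the completion handler in the listing above, reusing the same objectDescription label (assuming the label is configured to display multiple lines):

// Show the three most confident classifications instead of only the best one.
guard let classifications = request.results as? [VNClassificationObservation] else { return }

let summary = classifications.prefix(3)
  .map { "\($0.identifier): \(Int($0.confidence * 100))%" }
  .joined(separator: "\n")

DispatchQueue.main.async {
  self?.objectDescription.text = summary
}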

You have already implemented two different types of CoreML models that were trained for general purposes. Sometimes, these models won't be specific enough for your purposes. For instance, take a look at the following screenshot, where a machine learning model labels a certain type of car as a sports car with only 30% confidence:

In the next section, you will learn how to train models for purposes that are specific to you and your apps by using CreateML.
