Evaluating the classifiers

Once the model is trained, we need to evaluate its performance. To do that, we will use the following process:

  1. We will divide the labeled dataset into two parts: a training partition and a testing partition. The model is trained on the training partition, and the testing partition is held back to evaluate the trained model.
  2. We will feed the features of the testing partition to the trained model to generate a label for each row. This is our set of predicted labels.
  3. We will compare the set of predicted labels with the actual labels to evaluate the model.

Unless the problem is trivial, the model will misclassify some rows of the testing partition. How we interpret these misclassifications to judge the quality of the model depends on which performance metrics we choose to use.
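
The three steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the dataset is made up, and `train_test_split`, `train_threshold_model`, and `predict` are hypothetical helpers written for this example (the threshold rule stands in for whatever classifier is actually being trained).

```python
import random


def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the labeled dataset and split it into training and testing partitions."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(seq, idx):
        return [seq[i] for i in idx]

    return (pick(rows, train_idx), pick(rows, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))


def train_threshold_model(X_train, y_train):
    """Toy stand-in for a real classifier: the midpoint between the two class means."""
    pos = [x for x, y in zip(X_train, y_train) if y == 1]
    neg = [x for x, y in zip(X_train, y_train) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, X):
    """Label a row 1 if its feature exceeds the learned threshold, else 0."""
    return [1 if x > threshold else 0 for x in X]


# Hypothetical labeled dataset: one numeric feature per row and a binary label.
features = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5]
labels = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

# Step 1: divide the labeled dataset into training and testing partitions.
X_train, X_test, y_train, y_test = train_test_split(features, labels)
model = train_threshold_model(X_train, y_train)

# Step 2: generate predicted labels from the features of the testing partition.
y_pred = predict(model, X_test)

# Step 3: compare the predicted labels with the actual labels.
misclassified = sum(1 for actual, pred in zip(y_test, y_pred) if actual != pred)
print(f"{misclassified} of {len(y_test)} test rows misclassified")
```

The fixed `seed` makes the split reproducible, so repeated evaluations compare the model against the same testing partition.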

Once we have both the actual labels and the predicted labels, several performance metrics can be used to evaluate the model. The best metric for assessing the model depends on the requirements of the business problem that we want to solve, as well as on the characteristics of the training dataset.
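
As a rough illustration of how the choice of metric changes the verdict, the sketch below computes four widely used classification metrics (accuracy, precision, recall, and F1 score) from a pair of label lists. The metrics shown are common examples chosen for this sketch, the function names are hypothetical, and the labels are made up to mimic an imbalanced dataset.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive and a != positive:
            fp += 1
        elif p != positive and a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(actual, predicted, positive=1):
    """Return accuracy, precision, recall and F1, using 0.0 when a ratio is undefined."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: only 2 of the 10 rows belong to the positive class.
actual = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
predicted = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

scores = classification_metrics(actual, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

On these labels the accuracy is 0.90 while the recall is only 0.50, because the model misses one of the two positive rows. If the business problem is about finding the rare positive cases, accuracy alone would overstate how well the model performs.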
