Evaluation metrics

The evaluation metric is used to measure model performance. While it is often similar to the loss function, it is not used to make corrections while training the model; it is only used after the model has been trained to evaluate its performance. The following metrics are available (a minimal keras example follows the list):

  • Accuracy: This metric measures how often the correct class is predicted. For binary classification, 0.5 is used as a threshold by default: if the predicted probability is below 0.5, the predicted class is 0; otherwise, it is 1. Accuracy is the number of cases where the predicted class matches the target class divided by the total number of predictions.
  • Cosine similarity: Compares two vectors by evaluating the angle between them in n-dimensional space. This is often used to evaluate the similarity of text data. For example, imagine a piece of text containing the word cat four times and the word dog once, and another containing the word cat four times and the word dog twice. With two dimensions, we can envision a line from the origin through the point where y = 4 and x = 1 for the first piece of text, and another line through the point where y = 4 and x = 2 for the second. The cosine of the angle between the two lines is the similarity score. If the lines overlap, the documents have a perfect similarity score of 1; the larger the angle between the lines, the less similar the documents and the lower the score.
  • Mean absolute error: The average of all absolute errors. The absolute error is the absolute value of the difference between the predicted value and the target value.
  • Mean squared error: The average of all squared errors. The squared error is the square of the difference between the predicted value and the target value. Squaring the errors applies a larger penalty to big errors relative to mean absolute error.
  • Hinge: To use the hinge evaluation metric, all target variables should be -1 or 1. The metric subtracts the product of the predicted value and the target value from 1 and then uses the greater of that value and 0; that is, max(0, 1 - y_true * y_pred). Results are more correct the closer the metric value is to 0.
  • KL divergence: This metric compares the distribution of the true values with the distribution of the predicted values and measures how much one distribution diverges from the other; a value of 0 indicates identical distributions.
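As a rough sketch of how these metrics are specified (assuming the Python tf.keras API; the model architecture, input shape, and layer sizes below are placeholder assumptions, not values from the text), metrics are passed to compile() by name or as metric objects and are only reported, never used to update weights:

```python
import tensorflow as tf

# Small placeholder binary classifier; the 20-feature input and layer
# sizes are illustrative assumptions.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# The loss drives the weight updates; the metrics are only reported
# during and after training.
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=[
        'accuracy',                            # fraction of correct predictions
        tf.keras.metrics.CosineSimilarity(),   # angle-based similarity
        tf.keras.metrics.MeanAbsoluteError(),  # average |y_true - y_pred|
        tf.keras.metrics.MeanSquaredError(),   # average (y_true - y_pred)^2
    ],
)
# tf.keras.metrics.Hinge() and tf.keras.metrics.KLDivergence() are also
# available; hinge expects -1/1 targets and KL divergence expects
# probability distributions.
```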

We have, so far, used a tree-based classifier and a traditional neural network to classify our image data. We have also reviewed the keras syntax and looked at our options for several functions within the modeling pipeline using this framework. Next, we will add additional layers in front of the neural network we just coded to create a convolutional neural network. For this special type of neural network, we will include a convolution layer and a pooling layer. A dropout layer is often also included; however, we will add this later, as it serves a slightly different purpose than the convolution and pooling layers. When used together, these layers find more complex patterns in our data and also reduce its size, which is especially important when working with large image files.
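As a rough illustration of this structure (the 28x28 grayscale input shape, filter count, and layer sizes are assumptions for a small image task, not values from the text), a convolution layer and a pooling layer can be placed in front of the dense layers like this:

```python
import tensorflow as tf

# Illustrative convolutional network; all sizes are placeholder assumptions.
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    # Convolution layer: slides small filters over the image to detect
    # local patterns such as edges and textures.
    tf.keras.layers.Conv2D(16, kernel_size=(3, 3), activation='relu'),
    # Pooling layer: downsamples the feature maps, shrinking the data
    # while keeping the strongest responses.
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Flatten and feed into dense layers like those in the earlier network.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

cnn.compile(optimizer='adam',
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
```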
