Step 8 &#x2013; Model evaluation

We have managed to finish the training. It is time to evaluate the model. Before, we start evaluating the model, let's implement some auxiliary functions for plotting the example errors and printing the validation accuracy. The plot_example_errors() takes two parameters. The first is cls_pred, which is an array of the predicted class-number for all images in the test set.

The second parameter, correct, is a boolean array to predict whether the predicted class is equal to true class for each image in the test set. At first, it gets the images from the test set that have been incorrectly classified. Then it gets the predicted and the true classes for those images, and finally it plots the first nine images with their classes (that is, predicted versus true labels):

def plot_example_errors(cls_pred, correct): 
    incorrect = (correct == False)     
    images = data.valid.images[incorrect]     
    cls_pred = cls_pred[incorrect] 
    cls_true = data.valid.cls[incorrect]     
    plot_images(images=images[0:9], cls_true=cls_true[0:9], cls_pred=cls_pred[0:9])

The second auxiliary function is called print_validation_accuracy(); it prints the validation accuracy. It allocates an array for the predicted classes, which will be calculated in batches and filled into this array, and then it calculates the predicted classes for the batches:

def print_validation_accuracy(show_example_errors=False, show_confusion_matrix=False): 
    num_test = len(data.valid.images) 
    cls_pred = np.zeros(shape=num_test, dtype=np.int) 
    i = 0 
    while i < num_test: 
        # The ending index for the next batch is denoted j. 
        j = min(i + batch_size, num_test) 
        images = data.valid.images[i:j, :].reshape(batch_size, img_size_flat)    
        labels = data.valid.labels[i:j, :] 
        feed_dict = {x: images, y_true: labels} 
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict) 
        i = j 
 
    cls_true = np.array(data.valid.cls) 
    cls_pred = np.array([classes[x] for x in cls_pred])  
    correct = (cls_true == cls_pred) 
    correct_sum = correct.sum() 
    acc = float(correct_sum) / num_test 
 
    msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})" 
    print(msg.format(acc, correct_sum, num_test)) 
 
    if show_example_errors: 
        print("Example errors:") 
        plot_example_errors(cls_pred=cls_pred, correct=correct)

Now that we have our auxiliary functions, we can start the optimization. At the first place, let's iterate the fine-tuning 10,000 times and see the performance:

 optimize(num_iterations=1000)

After 10,000 iterations, we observe the following result:

Accuracy on Test-Set: 78.8% (3150 / 4000) 
Precision: 0.793378626929 
Recall: 0.7875 
F1-score: 0.786639298213

This means the accuracy on the test set is about 79%. Also, let's see how well our classifier performs on a sample image:

Figure 7: Random prediction on the test set (after 10,000 iterations)

After that, we further iterate the optimization up to 100,000 times and observe better accuracy:

Figure 8: Random prediction on the test set (after 100,000 iterations)

>>> 
Accuracy on Test-Set: 81.1% (3244 / 4000) 
Precision: 0.811057239265 
Recall: 0.811 
F1-score: 0.81098298755

So it did not improve that much but was a 2% increase on the overall accuracy. Now is the time to evaluate our model for a single image. For simplicity, we will take two random images of a dog and a cat and see the prediction power of our model:

Figure 9: Example image for the cat and dog to be classified

At first, we load these two images and prepare the test set accordingly, as we have seen in an earlier step in this example:

test_cat = cv2.imread('Test_image/cat.jpg') 
test_cat = cv2.resize(test_cat, (img_size, img_size), cv2.INTER_LINEAR) / 255 
preview_cat = plt.imshow(test_cat.reshape(img_size, img_size, num_channels)) 
 
test_dog = cv2.imread('Test_image/dog.jpg') 
test_dog = cv2.resize(test_dog, (img_size, img_size), cv2.INTER_LINEAR) / 255 
preview_dog = plt.imshow(test_dog.reshape(img_size, img_size, num_channels))

Then we have the following function for making the prediction:

def sample_prediction(test_im):     
    feed_dict_test = { 
        x: test_im.reshape(1, img_size_flat), 
        y_true: np.array([[1, 0]]) 
    } 
    test_pred = session.run(y_pred_cls, feed_dict=feed_dict_test) 
    return classes[test_pred[0]] 
print("Predicted class for test_cat: {}".format(sample_prediction(test_cat))) 
print("Predicted class for test_dog: {}".format(sample_prediction(test_dog))) 
 
>>>  
Predicted class for test_cat: cats 
Predicted class for test_dog: dogs

Finally, when we're done, we close the TensorFlow session by invoking the close() method:

session.close()

Table of Contents for
Step 8 – Model evaluation

Step 8 – Model evaluation

Table of Contents for Step 8 &#x2013; Model evaluation

Create new playlist

Sign In

Sign Up

Table of Contents for
Step 8 – Model evaluation