With training finished, it is time to evaluate the model. Before we start, let's implement two auxiliary functions: one for plotting example errors and one for printing the validation accuracy. The plot_example_errors() function takes two parameters. The first is cls_pred, an array of the predicted class numbers for all images in the test set.
The second parameter, correct, is a boolean array indicating whether the predicted class equals the true class for each image in the test set. The function first selects the images from the test set that were incorrectly classified. Then it gets the predicted and the true classes for those images, and finally it plots the first nine images with their classes (that is, predicted versus true labels):
def plot_example_errors(cls_pred, correct):
    incorrect = (correct == False)
    images = data.valid.images[incorrect]
    cls_pred = cls_pred[incorrect]
    cls_true = data.valid.cls[incorrect]
    plot_images(images=images[0:9],
                cls_true=cls_true[0:9],
                cls_pred=cls_pred[0:9])
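The selection step in plot_example_errors() relies on NumPy boolean indexing. The following is a minimal sketch of that step on hypothetical toy arrays (the values are made up for illustration and are not from the dataset):

```python
import numpy as np

# Hypothetical toy data: true and predicted class numbers for six images.
cls_true = np.array([0, 1, 1, 0, 1, 0])
cls_pred = np.array([0, 1, 0, 0, 0, 1])

# One boolean entry per image: True where the prediction was right.
correct = (cls_true == cls_pred)
# Element-wise negation; equivalent to (correct == False).
incorrect = ~correct

# Boolean indexing keeps only the entries where the mask is True.
wrong_idx = np.flatnonzero(incorrect)   # positions of the misclassified images
wrong_pred = cls_pred[incorrect]        # what the model predicted for them
wrong_true = cls_true[incorrect]        # what they actually were

print(wrong_idx, wrong_pred, wrong_true)
```

The same mask can index any array whose first dimension matches the number of images, which is why one mask serves for the images, the predictions, and the true classes.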
The second auxiliary function, print_validation_accuracy(), prints the validation accuracy. It allocates an array for the predicted classes, computes the predictions batch by batch, and fills them into this array:
def print_validation_accuracy(show_example_errors=False,
                              show_confusion_matrix=False):
    num_test = len(data.valid.images)
    # Array for the predicted classes, filled in batch by batch.
    # Use the built-in int; the np.int alias was removed from recent NumPy.
    cls_pred = np.zeros(shape=num_test, dtype=int)
    i = 0
    while i < num_test:
        # The ending index for the next batch is denoted j.
        j = min(i + batch_size, num_test)
        # Use -1 so that a smaller final batch is reshaped correctly.
        images = data.valid.images[i:j, :].reshape(-1, img_size_flat)
        labels = data.valid.labels[i:j, :]
        feed_dict = {x: images, y_true: labels}
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)
        i = j
    cls_true = np.array(data.valid.cls)
    # Look up each predicted class number in classes before comparing.
    cls_pred = np.array([classes[x] for x in cls_pred])
    correct = (cls_true == cls_pred)
    correct_sum = correct.sum()
    acc = float(correct_sum) / num_test
    msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})"
    print(msg.format(acc, correct_sum, num_test))
    if show_example_errors:
        print("Example errors:")
        plot_example_errors(cls_pred=cls_pred, correct=correct)
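The batching loop in this function can be looked at in isolation. The sketch below uses a hypothetical helper, batch_ranges(), which is not part of the original code, to show which index ranges such a loop visits when the number of items is not a multiple of the batch size:

```python
def batch_ranges(num_items, batch_size):
    """Yield (start, end) index pairs that cover range(num_items) in order."""
    start = 0
    while start < num_items:
        # Clamp the end index so the final batch never runs past the data.
        end = min(start + batch_size, num_items)
        yield start, end
        start = end


ranges = list(batch_ranges(num_items=10, batch_size=4))
print(ranges)  # [(0, 4), (4, 8), (8, 10)]
```

The last pair covers only two items, so code that processes a batch should not assume that every batch contains exactly batch_size images.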
Now that we have our auxiliary functions, we can start the optimization. To begin, let's run the fine-tuning for 10,000 iterations and see the performance:
optimize(num_iterations=10000)
After 10,000 iterations, we observe the following result:
Accuracy on Test-Set: 78.8% (3150 / 4000)
Precision: 0.793378626929
Recall: 0.7875
F1-score: 0.786639298213
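The listing for print_validation_accuracy() does not show how the precision, recall, and F1-score lines are produced. As a rough sketch only, assuming macro averaging over integer class numbers (the averaging mode actually used for the figures above is not stated), these metrics could be derived from a confusion matrix as follows. The function name and the toy data are hypothetical:

```python
import numpy as np


def macro_precision_recall_f1(cls_true, cls_pred, num_classes):
    """Macro-averaged precision, recall and F1 from integer class numbers."""
    # confusion[t, p] counts the images of true class t predicted as class p.
    confusion = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(cls_true, cls_pred):
        confusion[t, p] += 1

    true_pos = np.diag(confusion).astype(float)
    predicted = confusion.sum(axis=0)   # column sums: times each class was predicted
    actual = confusion.sum(axis=1)      # row sums: times each class truly occurred

    # Guard against division by zero for classes that never occur.
    precision = true_pos / np.maximum(predicted, 1)
    recall = true_pos / np.maximum(actual, 1)
    denom = precision + recall
    f1 = 2 * precision * recall / np.where(denom > 0, denom, 1.0)

    # Macro averaging: unweighted mean over the classes.
    return precision.mean(), recall.mean(), f1.mean()


# Hypothetical toy labels for eight images and two classes.
toy_true = [0, 0, 0, 0, 1, 1, 1, 1]
toy_pred = [0, 0, 0, 1, 1, 1, 0, 0]
precision, recall, f1 = macro_precision_recall_f1(toy_true, toy_pred, num_classes=2)
print(precision, recall, f1)
```

With equally sized classes, as in the toy data, macro-averaged recall coincides with the overall accuracy.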
This means the accuracy on the test set is about 79%. Also, let's see how well our classifier performs on a sample image:
After that, we continue the optimization up to 100,000 iterations and observe better accuracy:
>>>
Accuracy on Test-Set: 81.1% (3244 / 4000)
Precision: 0.811057239265
Recall: 0.811
F1-score: 0.81098298755
The improvement is modest: overall accuracy rises by about 2.3 percentage points, from 78.8% to 81.1%. Now it is time to evaluate our model on single images. For simplicity, we will take two random images, one of a dog and one of a cat, and see the predictive power of our model:
First, we load these two images and prepare them in the same way as the test set, as we have seen in an earlier step in this example:
# Pass interpolation by keyword; the third positional argument of cv2.resize is dst.
test_cat = cv2.imread('Test_image/cat.jpg')
test_cat = cv2.resize(test_cat, (img_size, img_size), interpolation=cv2.INTER_LINEAR) / 255
preview_cat = plt.imshow(test_cat.reshape(img_size, img_size, num_channels))

test_dog = cv2.imread('Test_image/dog.jpg')
test_dog = cv2.resize(test_dog, (img_size, img_size), interpolation=cv2.INTER_LINEAR) / 255
preview_dog = plt.imshow(test_dog.reshape(img_size, img_size, num_channels))
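One detail of this preprocessing is easy to miss: cv2.imread returns pixels in BGR channel order, while plt.imshow interprets the last axis as RGB, so the previews may show swapped colors. The sketch below uses a hypothetical 2x2 array in place of a real image to illustrate the scaling step and a channel flip for display purposes only:

```python
import numpy as np

# Hypothetical stand-in for an image returned by cv2.imread: uint8, BGR order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR, so this image is pure blue

# Scale pixel values from [0, 255] to [0, 1], as the division by 255 does.
scaled = bgr.astype(np.float32) / 255.0

# Reverse the channel axis to get RGB order for display.
rgb_for_display = scaled[..., ::-1]

print(scaled[0, 0], rgb_for_display[0, 0])
```

Whether the array fed to the network should also be flipped depends on how the training images were loaded; if they were read in BGR order as well, the test images should stay in that order.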
Then we have the following function for making the prediction:
def sample_prediction(test_im):
    feed_dict_test = {
        x: test_im.reshape(1, img_size_flat),
        y_true: np.array([[1, 0]])
    }
    test_pred = session.run(y_pred_cls, feed_dict=feed_dict_test)
    return classes[test_pred[0]]

print("Predicted class for test_cat: {}".format(sample_prediction(test_cat)))
print("Predicted class for test_dog: {}".format(sample_prediction(test_dog)))
>>>
Predicted class for test_cat: cats
Predicted class for test_dog: dogs
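The last line of sample_prediction() turns a predicted class number into a class name by indexing into classes. The sketch below illustrates that lookup on made-up values, assuming the class number is the position of the largest score. The score vector and the order of the class names are hypothetical:

```python
import numpy as np

# Hypothetical class names; the real order depends on how classes was defined.
toy_classes = ['dogs', 'cats']

# Hypothetical scores for a single image, one column per class.
toy_scores = np.array([[0.2, 0.8]])

# The predicted class number is the index of the largest score in each row.
toy_pred_cls = np.argmax(toy_scores, axis=1)

# Index 0 selects the only image in this batch of size one.
label = toy_classes[toy_pred_cls[0]]
print(label)
```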
Finally, when we're done, we close the TensorFlow session by invoking the close() method:
session.close()