Accuracy is a metric that measures how well a model performs: the fraction of predictions it gets right. It is the default evaluation metric of scikit-learn classifiers. Unfortunately, accuracy is one-dimensional, and it is misleading when the classes are imbalanced. The rain data we examined in Chapter 9, Ensemble Learning and Dimensionality Reduction, is fairly balanced: the number of rainy days is almost equal to the number of days on which it doesn't rain. In the case of e-mail spam classification, at least for me, the balance is shifted toward spam.
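To see why accuracy misleads on imbalanced data, consider a small sketch with made-up labels: a classifier that always predicts the majority class still scores a high accuracy while never detecting the minority class.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical imbalanced labels: 95% "not spam" (0), 5% "spam" (1).
y_true = np.array([0] * 95 + [1] * 5)

# A trivial classifier that always predicts the majority class.
y_pred = np.zeros(100, dtype=int)

# Accuracy looks great even though this classifier never catches spam.
print(accuracy_score(y_true, y_pred))  # 0.95
```

This is exactly the blind spot that a confusion matrix exposes.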
A confusion matrix is a table that summarizes the results of classification. The two dimensions of the table are the predicted class and the target class. In the context of binary classification, we talk about positive and negative classes. Naming a class negative is arbitrary; it doesn't necessarily mean that it is bad in some way. We can reduce any multi-class problem to a one-class-versus-the-rest problem, so once we can evaluate binary classification, we can extend the framework to multi-class classification. A class can either be correctly predicted or not; we label those instances with the words true and false accordingly.
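The one-versus-the-rest reduction can be sketched with a toy three-class example (made-up labels, not the rain data): pick one class as "positive", treat everything else as "negative", and read the four binary counts off the multi-class confusion matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy three-class problem (labels 0, 1, and 2).
y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred = np.array([0, 2, 2, 2, 1, 0, 1, 1])

cm = confusion_matrix(y_true, y_pred)  # rows = target, columns = predicted

# Treat class 1 as "positive" and the rest as "negative".
tp = cm[1, 1]
fp = cm[:, 1].sum() - tp   # predicted 1, but the target was another class
fn = cm[1, :].sum() - tp   # target was 1, but we predicted another class
tn = cm.sum() - tp - fp - fn
print(tp, fp, fn, tn)      # 2 1 1 4
```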
We have four combinations of true, false, positive, and negative, as described in the following table:
| | Predicted class: rain | Predicted class: no rain |
|---|---|---|
| **Target class: rain** | True positives: It rained and we correctly predicted it. | False negatives: It did rain, but we predicted that it wouldn't. |
| **Target class: no rain** | False positives: We incorrectly predicted that it would rain. | True negatives: It didn't rain, and we correctly predicted it. |
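For a binary problem, the four cells of the table can be read directly from `confusion_matrix()` by flattening the 2x2 result with `ravel()`. Here is a small sketch with made-up rain labels (1 for rain, 0 for no rain), not the actual test data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 means "rain", 0 means "no rain".
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])

# With labels sorted as [0, 1], ravel() yields tn, fp, fn, tp in order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```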
```python
import numpy as np
from sklearn.metrics import confusion_matrix
import seaborn as sns
import dautil as dl
from IPython.display import HTML
import ch10util
```
```python
def plot_cm(y_test, preds, title, cax):
    # confusion_matrix() expects (y_true, y_pred)
    cm = confusion_matrix(y_test, preds)
    normalized_cm = cm/cm.sum().astype(float)
    sns.heatmap(normalized_cm, annot=True, fmt='.2f', vmin=0, vmax=1,
                xticklabels=['Rain', 'No Rain'],
                yticklabels=['Rain', 'No Rain'], ax=cax)
    cax.set_xlabel('Predicted class')
    cax.set_ylabel('Expected class')
    cax.set_title('Confusion Matrix for Rain Forecast | ' + title)

y_test = np.load('rain_y_test.npy')
sp = dl.plotting.Subplotter(2, 2, context)
plot_cm(y_test, np.load('rfc.npy'), 'Random Forest', sp.ax)
plot_cm(y_test, np.load('bagging.npy'), 'Bagging', sp.next_ax())
plot_cm(y_test, np.load('votes.npy'), 'Votes', sp.next_ax())
plot_cm(y_test, np.load('stacking.npy'), 'Stacking', sp.next_ax())
sp.fig.text(0, 1, ch10util.classifiers())
HTML(sp.exit())
```
Refer to the following screenshot for the end result:
The source code is in the conf_matrix.ipynb file in this book's code bundle.
We displayed four confusion matrices for four classifiers, and the four numbers of each matrix are roughly the same from one classifier to the next. The numbers are not exactly equal, of course; you have to allow for some random variation.
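The heatmaps show fractions rather than raw counts because plot_cm() divides by the grand total. A minimal sketch of that normalization, using a made-up 2x2 count matrix, also shows a common alternative: normalizing each row, which puts the per-class recall on the diagonal.

```python
import numpy as np

# Hypothetical counts: rows = target class, columns = predicted class.
cm = np.array([[50, 10],
               [ 7, 33]])

# Grand-total normalization, as plot_cm() does: every cell becomes a
# fraction of all test instances, so the cells sum to 1.
total_norm = cm / cm.sum()

# Row normalization: each diagonal entry is then that class's recall.
row_norm = cm / cm.sum(axis=1, keepdims=True)
print(total_norm)
print(row_norm)
```

Row normalization is often easier to compare across classifiers when the class sizes differ.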
See also the confusion_matrix() function documented at http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html (retrieved November 2015).