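The Matthews correlation coefficient (MCC), also known as the phi coefficient, is an evaluation metric for binary classification introduced by Brian Matthews in 1975. The MCC is a correlation coefficient between the targets and the predictions. It ranges from -1 (complete disagreement) through 0 (no better than chance) to 1 (perfect agreement). The MCC is a good single-number summary of the confusion matrix (refer to the Getting classification straight with the confusion matrix recipe) because it uses all four of its entries. The MCC is given by the following equation:

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

Here TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives in the confusion matrix.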
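As a quick check of the definition, the following minimal sketch computes the MCC by hand from the four confusion-matrix counts and compares the result with scikit-learn's matthews_corrcoef(). The labels are made up for illustration, and mcc_from_counts() is a hypothetical helper, not part of the book's code bundle. Returning 0 when the denominator vanishes is a common convention that is assumed here.

```python
import math

import numpy as np
from sklearn import metrics


def mcc_from_counts(tp, tn, fp, fn):
    """Compute the MCC from the four confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0 when any marginal sum is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up binary labels, for illustration only.
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])

# For binary labels {0, 1}, ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = (int(v) for v in
                  metrics.confusion_matrix(y_true, y_pred).ravel())

manual = mcc_from_counts(tp, tn, fp, fn)
reference = metrics.matthews_corrcoef(y_true, y_pred)
print(manual, reference)
```

With these labels there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so both values come out at (9 - 1) / 16 = 0.5.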
import dautil as dl
from sklearn import metrics
import numpy as np
import ch10util
from IPython.display import HTML
y_test = np.load('rain_y_test.npy')
accuracies = [metrics.accuracy_score(y_test, preds)
              for preds in ch10util.rain_preds()]
precisions = [metrics.precision_score(y_test, preds)
              for preds in ch10util.rain_preds()]
recalls = [metrics.recall_score(y_test, preds)
           for preds in ch10util.rain_preds()]
f1s = [metrics.f1_score(y_test, preds)
       for preds in ch10util.rain_preds()]
mc = [metrics.matthews_corrcoef(y_test, preds)
      for preds in ch10util.rain_preds()]
# Note: context is not defined in this listing; it is assumed to have been
# created earlier in the notebook.
sp = dl.plotting.Subplotter(2, 2, context)
# Plot each metric against the MCC, one subplot per metric.
dl.plotting.plot_text(sp.ax, accuracies, mc, ch10util.rain_labels(),
                      add_scatter=True)
sp.label()
dl.plotting.plot_text(sp.next_ax(), precisions, mc, ch10util.rain_labels(),
                      add_scatter=True)
sp.label()
dl.plotting.plot_text(sp.next_ax(), recalls, mc, ch10util.rain_labels(),
                      add_scatter=True)
sp.label()
dl.plotting.plot_text(sp.next_ax(), f1s, mc, ch10util.rain_labels(),
                      add_scatter=True)
sp.label()
sp.fig.text(0, 1, ch10util.classifiers())
HTML(sp.exit())
Refer to the following screenshot for the end result:
The code is in the matthews_correlation.ipynb file in this book's code bundle.
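The recipe's data file and the ch10util helpers are not reproduced here, so the following is only a rough, self-contained sketch of the same comparison on synthetic data. The labels and the two classifiers are hypothetical stand-ins, not the rain classifiers from the recipe. It assumes a scikit-learn version that supports the zero_division parameter of f1_score().

```python
import numpy as np
from sklearn import metrics

# Synthetic, imbalanced ground truth: 90 negatives followed by 10 positives.
y_true = np.array([0] * 90 + [1] * 10)

# Two hypothetical classifiers (not the rain classifiers from the recipe).
preds = {
    # Predicts the majority class for every sample.
    'always_negative': np.zeros(100, dtype=int),
    # 85 true negatives, 5 false positives, 3 false negatives, 7 true positives.
    'some_skill': np.array([0] * 85 + [1] * 5 + [0] * 3 + [1] * 7),
}

scores = {}
for name, y_pred in preds.items():
    scores[name] = {
        'accuracy': metrics.accuracy_score(y_true, y_pred),
        # zero_division=0 avoids a warning when no positives are predicted.
        'f1': metrics.f1_score(y_true, y_pred, zero_division=0),
        'mcc': metrics.matthews_corrcoef(y_true, y_pred),
    }
    print(name, scores[name])
```

The always-negative classifier reaches 90 percent accuracy on this data yet has an MCC of 0, which illustrates why a metric that uses all four confusion-matrix entries can be more informative than accuracy on imbalanced classes.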
The matthews_corrcoef() function is documented at http://scikit-learn.org/stable/modules/generated/sklearn.metrics.matthews_corrcoef.html (retrieved November 2015)