Detailed classification reports

The model is trained on the full development set, and the scores are computed on the full evaluation set:

              precision    recall  f1-score   support

           0       1.00      0.96      0.98     85296
           1       0.04      0.93      0.08       147

   micro avg       0.96      0.96      0.96     85443
   macro avg       0.52      0.94      0.53     85443
weighted avg       1.00      0.96      0.98     85443
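
To see how the summary rows relate to the per-class rows, the arithmetic can be reproduced by hand. The sketch below is only illustrative: it starts from the rounded figures in the report above, so it can match the printed values only to two decimals, and the helper name `f1` is introduced here for the example.

```python
# Per-class figures copied from the report above (already rounded to 2 decimals).
report = {
    0: {"precision": 1.00, "recall": 0.96, "support": 85296},
    1: {"precision": 0.04, "recall": 0.93, "support": 147},
}

def f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

total = sum(row["support"] for row in report.values())

# Macro average: plain mean over classes, every class counts equally.
macro_precision = sum(row["precision"] for row in report.values()) / len(report)

# Weighted average: mean over classes weighted by support (number of samples).
weighted_precision = sum(
    row["precision"] * row["support"] for row in report.values()
) / total

for label, row in report.items():
    print("class %d f1: %.2f" % (label, f1(row["precision"], row["recall"])))
print("macro precision: %.2f" % macro_precision)
print("weighted precision: %.2f" % weighted_precision)
```

Because class 0 makes up almost all of the 85443 samples, the weighted average is dominated by it, while the macro average exposes the very low precision on class 1.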

We use a cross-validated grid search to find the value of the hyperparameter C that gives the best recall:

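from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

def print_gridsearch_scores(x_train_data, y_train_data):
    # Candidate values for the inverse regularization strength C
    c_param_range = [0.01, 0.1, 1, 10, 100]

    # 5-fold cross-validated grid search, scored on recall
    clf = GridSearchCV(LogisticRegression(), {"C": c_param_range}, cv=5, scoring='recall')
    clf.fit(x_train_data, y_train_data)

    print("Best parameters set found on development set:")
    print()
    print(clf.best_params_)

    # Mean recall across folds, with a spread of two standard deviations
    print("Grid scores on development set:")
    means = clf.cv_results_['mean_test_score']
    stds = clf.cv_results_['std_test_score']
    for mean, std, params in zip(means, stds, clf.cv_results_['params']):
        print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))

    return clf.best_params_["C"]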
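
Each line that the function prints is a mean recall across the cross-validation folds, followed by twice the standard deviation. A minimal sketch of that calculation in plain Python follows. The fold labels and predictions are made up for illustration, the `recall` helper is hypothetical, and the use of the population standard deviation is an assumption of this sketch rather than a statement about scikit-learn's internals.

```python
from statistics import mean, pstdev

def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn)

# Hypothetical (true labels, predicted labels) for three validation folds.
folds = [
    ([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]),  # 3 of 4 positives found
    ([1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 1, 0]),  # 2 of 2 positives found
    ([1, 1, 1, 1, 1, 0], [1, 1, 1, 1, 0, 0]),  # 4 of 5 positives found
]

fold_recalls = [recall(y_true, y_pred) for y_true, y_pred in folds]
mean_recall = mean(fold_recalls)
spread = 2 * pstdev(fold_recalls)

# Same format string as in print_gridsearch_scores
print("%0.3f (+/-%0.03f)" % (mean_recall, spread))
```

Note that false positives (the stray 1 predictions on true 0 labels) do not affect recall at all, which is why a model tuned only for recall can still have very low precision.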

We run the grid search on the undersampled training set and keep the best value of C:

best_c = print_gridsearch_scores(X_train_undersample,y_train_undersample)

The output looks like this:

Best parameters set found on development set:

{'C': 0.01}
Grid scores on development set:
0.916 (+/-0.056) for {'C': 0.01}
0.907 (+/-0.068) for {'C': 0.1}
0.916 (+/-0.089) for {'C': 1}
0.916 (+/-0.089) for {'C': 10}
0.913 (+/-0.095) for {'C': 100}
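
Three values of C share the same mean recall once rounded to three decimals, so it is worth seeing how a winner can be picked from such a list. The sketch below simply takes the first entry with the highest score, which happens to agree with the reported best value of 0.01. It works on the rounded numbers printed above and is not a description of how GridSearchCV ranks or breaks ties internally.

```python
# (params, mean recall) pairs copied from the output above, rounded to 3 decimals.
grid_scores = [
    ({"C": 0.01}, 0.916),
    ({"C": 0.1}, 0.907),
    ({"C": 1}, 0.916),
    ({"C": 10}, 0.916),
    ({"C": 100}, 0.913),
]

# max() returns the first item among equal maxima, so ties go to the
# earliest (here: smallest) value of C.
best_params, best_score = max(grid_scores, key=lambda item: item[1])

print(best_params, best_score)
```

The differences between candidates are much smaller than the reported spreads, so on this data the choice of C has little measurable effect on recall.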

Next, we create a function that plots a confusion matrix. Setting normalize=True normalizes each row, so that the cells show fractions of each true class rather than raw counts:

import itertools

import matplotlib.pyplot as plt
import numpy as np

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    # Normalize each row (true class) to sum to 1 before drawing,
    # so that the image and the cell labels show the same values.
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=0)
    plt.yticks(tick_marks, classes)

    # Label each cell: white text on dark cells, black text on light ones.
    # Integer counts are assumed when normalize is False.
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
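
The row normalization used when normalize=True can be illustrated without NumPy. The sketch below is a plain Python equivalent of dividing each row by its sum. The counts are hypothetical and the helper name `normalize_rows` is introduced only for this example.

```python
def normalize_rows(cm):
    """Divide every cell by its row total, so each true class sums to 1."""
    return [[cell / sum(row) for cell in row] for row in cm]

# Hypothetical counts: rows are true labels, columns are predicted labels.
cm = [
    [90, 10],  # true class 0: 90 predicted as 0, 10 predicted as 1
    [5, 45],   # true class 1: 5 predicted as 0, 45 predicted as 1
]

normalized = normalize_rows(cm)

for row in normalized:
    print(["%.2f" % cell for cell in row])
```

After normalization, the diagonal entries are the per-class recall values, which makes the matrix easier to read when the classes differ greatly in size.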
