Analyzing our results

As is evident, the accuracy achieved by soft voting is 2% lower than that of the best base learner and on par with the second-best. We would like to analyze the results as we did for our hard voting custom implementation, but as soft voting takes the predicted class probabilities into account, we cannot use the same approach. Instead, we will plot, for each instance, the probability of it being classified as positive by each base learner, as well as the ensemble's average probability.
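
To make the soft voting rule concrete, here is a minimal sketch of it (not one of the chapter's listings), assuming the three fitted base learners and x_test from the previous section:

import numpy as np

# Average the class probabilities of the three base learners and
# predict the class with the highest average probability
avg_proba = (learner_1.predict_proba(x_test) +
             learner_2.predict_proba(x_test) +
             learner_3.predict_proba(x_test)) / 3
soft_predictions = np.argmax(avg_proba, axis=1)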

Again, we import matplotlib and set the plotting style:

# --- SECTION 1 ---
# Import the required libraries
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.style.use('seaborn-paper')

We calculate the ensemble's errors with errors = y_test - hard_predictions and get the predicted probabilities of each base learner with the predict_proba(x_test) method. All base learners implement this method, as it is a requirement for utilizing them in a soft voting ensemble:


# --- SECTION 2 ---
# Get the wrongly predicted instances
# and the predicted probabilities for the whole test set
errors = y_test - hard_predictions

probabilities_1 = learner_1.predict_proba(x_test)
probabilities_2 = learner_2.predict_proba(x_test)
probabilities_3 = learner_3.predict_proba(x_test)
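
As a quick sanity check (an illustrative snippet, not one of the chapter's listings), each probabilities_* array has one row per test instance and one column per class, and the probabilities in each row sum to 1:

import numpy as np

# Each array has shape (len(x_test), 2): one row per instance,
# one column per class
print(probabilities_1.shape)

# The class probabilities in every row sum to 1
print(np.allclose(probabilities_1.sum(axis=1), 1.0))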

Following this, for each wrongly classified instance, we store the predicted probability that the instance belongs to class 0. We do this for each base learner, as well as for their average. Each probabilities_* array is a two-dimensional array: each row contains the predicted probabilities that the corresponding instance belongs to class 0 and class 1. As these sum to 1, storing one of the two is sufficient. In the case of a dataset with N classes, we would have to store at least N-1 probabilities in order to get a clear picture:

# --- SECTION 3 ---
# Store the predicted probability for
# each wrongly predicted instance, for each base learner,
# as well as the average predicted probability
x = []
y_1 = []
y_2 = []
y_3 = []
y_avg = []

for i in range(len(errors)):
    if errors[i] != 0:
        x.append(i)
        y_1.append(probabilities_1[i][0])
        y_2.append(probabilities_2[i][0])
        y_3.append(probabilities_3[i][0])
        y_avg.append((probabilities_1[i][0] +
                      probabilities_2[i][0] +
                      probabilities_3[i][0]) / 3)
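
The same lists can be built in vectorized form with NumPy; the following is an equivalent sketch (assuming errors is a NumPy array), not the chapter's listing:

import numpy as np

# Indices of the misclassified instances and the corresponding
# class 0 probabilities for each learner, plus their average
x = np.nonzero(errors)[0]
y_1 = probabilities_1[x, 0]
y_2 = probabilities_2[x, 0]
y_3 = probabilities_3[x, 0]
y_avg = (y_1 + y_2 + y_3) / 3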

Finally, we plot the probabilities as bars of different widths with plt.bar, which ensures that any overlapping bars remain visible; the third plt.bar argument dictates the bar's width. We plot the average probability as a black 'X' with plt.scatter, passing zorder=10 so that it is drawn on top of the bars. Lastly, we draw a threshold line at a probability of 0.5 with plt.plot(y, c='k', linestyle='--'); c='k' and linestyle='--' render it as a black dashed line. If the average probability lies above this line, the sample is classified as positive, as follows:


# --- SECTION 4 ---
# Plot the predicted probability of each base learner as
# a bar and the average probability as an X
plt.bar(x, y_1, 3, label='KNN')
plt.bar(x, y_2, 2, label='NB')
plt.bar(x, y_3, 1, label='SVM')
plt.scatter(x, y_avg, marker='x', c='k', s=150,
            label='Average Positive', zorder=10)

y = [0.5] * len(errors)
plt.plot(y, c='k', linestyle='--')

plt.title('Positive Probability')
plt.xlabel('Test sample')
plt.ylabel('Probability')
plt.legend()
plt.show()

The preceding code outputs the following:

Predicted and average probabilities for the test set

As we can see, only two samples have an extreme average probability (sample 22 with p = 0.98 and sample 67 with p = 0.001). The remaining four are quite close to 50%. For three of these four samples, the SVM assigns a very high probability to the wrong class, thus greatly affecting the average probability. If the SVM did not overestimate these probabilities as much, the ensemble could well outperform each individual learner. For the two extreme cases, nothing can be done, as all three learners agree on the misclassification. We can try swapping the SVM for another k-NN with a significantly higher number of neighbors, learner_3 = neighbors.KNeighborsClassifier(n_neighbors=50). In this case, the ensemble's accuracy is greatly increased.
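
A minimal sketch of this swap (assuming x_train, y_train, and the scikit-learn soft voting setup used earlier in the chapter; the estimator names are illustrative):

from sklearn import neighbors
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

# Replace the SVM with a second k-NN that uses many more neighbors,
# then re-fit and re-evaluate the soft voting ensemble
learner_3 = neighbors.KNeighborsClassifier(n_neighbors=50)
voting = VotingClassifier([('KNN', learner_1),
                           ('NB', learner_2),
                           ('50NN', learner_3)],
                          voting='soft')
voting.fit(x_train, y_train)
print('Soft Voting:', accuracy_score(y_test, voting.predict(x_test)))

The ensemble's accuracies are then as follows: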

L1: 0.94
L2: 0.96
L3: 0.95
------------------------------
Soft Voting: 0.97

Take a look at the following screenshot:

Predicted and average probabilities for the test set with two k-NNs