Hard voting implementation

As in our custom implementation, we import the required libraries, split the data into train and test sets, and instantiate our base learners. In addition, we import scikit-learn's VotingClassifier from the sklearn.ensemble package, as follows:

# --- SECTION 1 ---
# Import the required libraries
from sklearn import datasets, linear_model, svm, neighbors
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score
# Load the dataset
breast_cancer = datasets.load_breast_cancer()
x, y = breast_cancer.data, breast_cancer.target

# Split the train and test samples
test_samples = 100
x_train, y_train = x[:-test_samples], y[:-test_samples]
x_test, y_test = x[-test_samples:], y[-test_samples:]

# --- SECTION 2 ---
# Instantiate the learners (classifiers)
learner_1 = neighbors.KNeighborsClassifier(n_neighbors=5)
learner_2 = linear_model.Perceptron(tol=1e-2, random_state=0)
learner_3 = svm.SVC(gamma=0.001)

Next, we instantiate the VotingClassifier class, passing it a list of (name, estimator) tuples that holds the names and objects of our base classifiers. Note that passing the tuples outside of a list will result in an error:

# --- SECTION 3 ---
# Instantiate the voting classifier
voting = VotingClassifier([('KNN', learner_1),
                           ('Prc', learner_2),
                           ('SVM', learner_3)])
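
As a minimal sketch of the same call with its keyword arguments spelled out, the following assumes a reasonably recent scikit-learn release. The names voting_explicit and knn_params are introduced here for illustration only and are not part of the original listing. Hard voting is the default, so voting='hard' only makes the choice visible:

```python
from sklearn import linear_model, neighbors, svm
from sklearn.ensemble import VotingClassifier

# Same base learners as in SECTION 2
learner_1 = neighbors.KNeighborsClassifier(n_neighbors=5)
learner_2 = linear_model.Perceptron(tol=1e-2, random_state=0)
learner_3 = svm.SVC(gamma=0.001)

# Equivalent instantiation with explicit keyword arguments.
# voting='hard' is the default value, stated here for clarity.
voting_explicit = VotingClassifier(
    estimators=[('KNN', learner_1),
                ('Prc', learner_2),
                ('SVM', learner_3)],
    voting='hard')

# The name in each tuple becomes a prefix for that learner's parameters
knn_params = {name: value
              for name, value in voting_explicit.get_params().items()
              if name.startswith('KNN__')}
print(knn_params['KNN__n_neighbors'])
```

The names given in the tuples are therefore more than labels: they identify each base learner's parameters inside the ensemble, which is useful when tuning the ensemble as a whole.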

Having instantiated the classifier, we can use it in the same way as any other classifier, without having to tend to each base learner individually. The following two sections fit all the base learners, generate their predictions, and compute the most voted class for each test instance:

# --- SECTION 4 ---
# Fit classifier with the training data
voting.fit(x_train, y_train)

# --- SECTION 5 ---
# Predict the most voted class
hard_predictions = voting.predict(x_test)
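
To illustrate what the fitting and prediction steps do, the following sketch repeats the vote by hand and compares it with the ensemble's output. It is an illustration under assumptions, not scikit-learn's internal code: the names learners, base_predictions, and manual_predictions are hypothetical, and the majority vote relies on the class labels being the non-negative integers 0 and 1, as they are in this dataset:

```python
import numpy as np
from sklearn import datasets, linear_model, neighbors, svm
from sklearn.ensemble import VotingClassifier

# Same data and split as in SECTION 1
breast_cancer = datasets.load_breast_cancer()
x, y = breast_cancer.data, breast_cancer.target
test_samples = 100
x_train, y_train = x[:-test_samples], y[:-test_samples]
x_test, y_test = x[-test_samples:], y[-test_samples:]

# Same base learners as in SECTION 2
learners = [('KNN', neighbors.KNeighborsClassifier(n_neighbors=5)),
            ('Prc', linear_model.Perceptron(tol=1e-2, random_state=0)),
            ('SVM', svm.SVC(gamma=0.001))]

# Ensemble predictions, as in SECTIONS 3 to 5
voting = VotingClassifier(learners)
voting.fit(x_train, y_train)
hard_predictions = voting.predict(x_test)

# Manual equivalent: fit each original learner and collect its predictions.
# Resulting shape: (number of learners, number of test instances)
base_predictions = np.array([learner.fit(x_train, y_train).predict(x_test)
                             for _, learner in learners])

# For each test instance (column), pick the class with the most votes
manual_predictions = np.apply_along_axis(
    lambda votes: np.bincount(votes).argmax(),
    axis=0, arr=base_predictions)

# Check whether the manual vote agrees with the ensemble
print(np.array_equal(manual_predictions, hard_predictions))
```

With three learners and two classes, a tie is impossible, so no tie-breaking rule is needed in this sketch.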

Finally, we can print the accuracy of the ensemble:

# --- SECTION 6 ---
# Accuracy of hard voting
print('-'*30)
print('Hard Voting:', accuracy_score(y_test, hard_predictions))

The result is the same as that of our custom implementation:

------------------------------
Hard Voting: 0.95

Note that VotingClassifier does not fit the objects that you pass as parameters; instead, it clones them and fits the clones. Thus, if you try to print the accuracy of each individual base learner on the test set using the original objects, you will get a NotFittedError, as the objects that you have access to are, in fact, not fitted. This is the main drawback of using scikit-learn's implementation over a custom one; the fitted clones are, however, available through the ensemble's estimators_ attribute.
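
The following sketch shows this behavior and one way to work around it, using the named_estimators_ attribute that a fitted VotingClassifier exposes alongside estimators_. The names original_is_fitted and clone_accuracies are hypothetical and used only for illustration. The clones are trained on label-encoded targets, which coincide with the original 0 and 1 labels in this dataset, so their predictions can be compared with y_test directly:

```python
from sklearn import datasets, linear_model, neighbors, svm
from sklearn.ensemble import VotingClassifier
from sklearn.exceptions import NotFittedError
from sklearn.metrics import accuracy_score

# Same data and split as in SECTION 1
breast_cancer = datasets.load_breast_cancer()
x, y = breast_cancer.data, breast_cancer.target
test_samples = 100
x_train, y_train = x[:-test_samples], y[:-test_samples]
x_test, y_test = x[-test_samples:], y[-test_samples:]

# Same learners and ensemble as in SECTIONS 2 to 4
learner_1 = neighbors.KNeighborsClassifier(n_neighbors=5)
learner_2 = linear_model.Perceptron(tol=1e-2, random_state=0)
learner_3 = svm.SVC(gamma=0.001)
voting = VotingClassifier([('KNN', learner_1),
                           ('Prc', learner_2),
                           ('SVM', learner_3)])
voting.fit(x_train, y_train)

# The original object was cloned, not fitted, so predicting with it fails
try:
    learner_1.predict(x_test)
    original_is_fitted = True
except NotFittedError:
    original_is_fitted = False
print('Original KNN is fitted:', original_is_fitted)

# The fitted clones are stored on the ensemble, keyed by the given names
clone_accuracies = {}
for name, fitted_clone in voting.named_estimators_.items():
    clone_accuracies[name] = accuracy_score(y_test,
                                            fitted_clone.predict(x_test))
    print(name, clone_accuracies[name])
```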