Voting

We will try to combine three basic regression algorithms through voting, in order to improve on the MSE of simple linear regression. To combine the algorithms, we average their predictions. Thus, we code a simple class that creates a dictionary of base learners and handles their training and prediction averaging. The main logic is the same as the custom voting classifier we implemented in Chapter 3, Voting:

import numpy as np
from copy import deepcopy

class VotingRegressor():

    # Accepts a list of (name, regressor) tuples
    def __init__(self, base_learners):
        self.base_learners = {}
        for name, learner in base_learners:
            self.base_learners[name] = deepcopy(learner)

    # Fits each individual base learner
    def fit(self, x_data, y_data):
        for name in self.base_learners:
            learner = self.base_learners[name]
            learner.fit(x_data, y_data)

The predictions are stored in a NumPy matrix, in which each row corresponds to a single instance and each column corresponds to a single base learner. The row-averaged values are the ensemble's output, as shown here:

    # Generates the predictions
    def predict(self, x_data):

        # Create the predictions matrix
        predictions = np.zeros((len(x_data), len(self.base_learners)))

        names = list(self.base_learners.keys())

        # For each base learner
        for i in range(len(self.base_learners)):
            name = names[i]
            learner = self.base_learners[name]

            # Store the predictions in a column
            preds = learner.predict(x_data)
            predictions[:, i] = preds

        # Take the row-average
        predictions = np.mean(predictions, axis=1)
        return predictions
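The row-averaging logic can be checked in isolation with a quick sketch. Note that the synthetic data and the two-learner setup below are illustrative assumptions, not part of the chapter's experiment:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# Toy data: y = 3x plus noise (assumed example)
rng = np.random.RandomState(0)
x = rng.rand(50, 1)
y = 3 * x.ravel() + rng.normal(scale=0.1, size=50)

learners = [LinearRegression(), KNeighborsRegressor()]
for learner in learners:
    learner.fit(x, y)

# Column-stack each learner's predictions, then average across columns,
# exactly as the predict method does
preds = np.column_stack([learner.predict(x) for learner in learners])
ensemble = preds.mean(axis=1)

# The ensemble output is the plain average of the base learners' outputs
assert np.allclose(ensemble,
                   (learners[0].predict(x) + learners[1].predict(x)) / 2)
```

Because the class stores deep copies of the base learners, fitting the ensemble never mutates the estimators you passed in.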

We chose a support vector machine, a k-nearest neighbors regressor, and a linear regression as base learners, as they provide diverse learning paradigms. To utilize the ensemble, we first import the required modules:

import numpy as np
import pandas as pd

from simulator import simulate
from sklearn import metrics
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from voting_regressor import VotingRegressor

Next, in the code we presented earlier, we replace the lr = LinearRegression() line with the following:

base_learners = [('SVR', SVR()),
                 ('LR', LinearRegression()),
                 ('KNN', KNeighborsRegressor())]

lr = VotingRegressor(base_learners)

By adding the two additional regressors, we are able to reduce the MSE to 16.22 and produce a Sharpe value of 0.22.
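The exact figures depend on the stock data produced by the simulator, which is not reproduced here. As a stand-in, the following sketch evaluates the same three base learners, averaged by hand, on synthetic data using sklearn's mean_squared_error; the dataset and train/test split are assumptions for illustration only:

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Synthetic regression task (assumed data, not the chapter's time series)
rng = np.random.RandomState(1)
x = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(x).ravel() + rng.normal(scale=0.2, size=200)
x_train, x_test = x[:150], x[150:]
y_train, y_test = y[:150], y[150:]

# Fit each base learner and collect its test-set predictions
base_learners = [SVR(), LinearRegression(), KNeighborsRegressor()]
all_preds = []
for learner in base_learners:
    learner.fit(x_train, y_train)
    all_preds.append(learner.predict(x_test))

# Voting ensemble: average the base learners' predictions
ensemble_preds = np.mean(all_preds, axis=0)
print('Ensemble MSE:', mean_squared_error(y_test, ensemble_preds))
```

On this kind of nonlinear target, averaging typically dampens the individual models' errors, which is the same effect the chapter exploits to lower the MSE of the single linear regression.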
