Creating a stacking regressor class for scikit-learn

We can utilize the preceding code in order to create a reusable class that orchestrates the ensemble's training and prediction. All scikit-learn estimators expose the standard fit(x, y) and predict(x) methods, in order to train and predict respectively. First, we import the required libraries and declare the class and its constructor. The constructor's argument is a list of lists of scikit-learn regressors. Each sub-list contains the level's learners, so it is easy to construct a multi-level stacking ensemble. For example, a three-level ensemble can be constructed with StackingRegressor([[l11, l12, l13], [l21, l22], [l31]]). We create a list containing each stacking level's size (the number of learners) and also create deep copies of the base learners. The learner in the last sub-list is considered to be the meta-learner:

All of the following code, up to (but not including) Section 5 (see the comment labels), is part of the StackingRegressor class and should remain indented as shown if it is copied to a Python editor.
# --- SECTION 1 ---
# Libraries
import numpy as np

from sklearn.model_selection import KFold
from copy import deepcopy

class StackingRegressor():

    # --- SECTION 2 ---
    # The constructor
    def __init__(self, learners):
        # Create a list of sizes for each stacking level
        # and a list of deep-copied learners
        self.level_sizes = []
        self.learners = []
        for learning_level in learners:
            self.level_sizes.append(len(learning_level))
            level_learners = []
            for learner in learning_level:
                level_learners.append(deepcopy(learner))
            self.learners.append(level_learners)
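
As a quick sanity check of the constructor's behavior, the following minimal sketch (using hypothetical learner variables, not part of the original example) shows that a three-level specification yields the expected level sizes and that the stored learners are independent deep copies of the originals:

from sklearn.linear_model import LinearRegression, Ridge
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical learners, for illustration only
l11, l12, l13 = KNeighborsRegressor(), Ridge(), LinearRegression()
l21, l22 = Ridge(), LinearRegression()
l31 = LinearRegression()

sr = StackingRegressor([[l11, l12, l13], [l21, l22], [l31]])
print(sr.level_sizes)            # [3, 2, 1]
print(sr.learners[0][0] is l11)  # False, since the learners are deep copies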

Following the constructor definition, we define the fit function. The only difference from the simple stacking script presented in the preceding section is that, instead of creating metadata only for the meta-learner, we create a list of metadata arrays, one for each stacking level. We save the metadata and targets in the meta_data and meta_targets lists, and use data_z and target_z as the corresponding variables for each level. Furthermore, we train each level's learners on the metadata of the previous level. We initialize the metadata lists with the original training set and targets:

    # --- SECTION 3 ---
    # The fit function. Creates training metadata for every level
    # and trains each level on the previous level's metadata
    def fit(self, x, y):
        # Create a list of training metadata, one for each stacking level,
        # and another one for the targets. For the first level,
        # the actual data is used.
        meta_data = [x]
        meta_targets = [y]
        for i in range(len(self.learners)):
            level_size = self.level_sizes[i]

            # Create the metadata and target variables for this level
            data_z = np.zeros((level_size, len(x)))
            target_z = np.zeros(len(x))

            train_x = meta_data[i]
            train_y = meta_targets[i]

            # Create the cross-validation folds
            KF = KFold(n_splits=5)
            meta_index = 0
            for train_indices, test_indices in KF.split(x):
                # Train each learner on the K-1 folds and create
                # metadata for the K-th fold
                for j in range(len(self.learners[i])):
                    learner = self.learners[i][j]
                    learner.fit(train_x[train_indices],
                                train_y[train_indices])
                    predictions = learner.predict(train_x[test_indices])

                    data_z[j][meta_index:meta_index+len(test_indices)] = \
                        predictions
                    target_z[meta_index:meta_index+len(test_indices)] = \
                        train_y[test_indices]

                meta_index += len(test_indices)

            # Add the data and targets to the metadata lists
            data_z = data_z.transpose()
            meta_data.append(data_z)
            meta_targets.append(target_z)

            # Train each learner on the whole of the previous level's metadata
            for learner in self.learners[i]:
                learner.fit(train_x, train_y)
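
The per-fold metadata generation inside fit is, for the default non-shuffled KFold, equivalent in spirit to scikit-learn's cross_val_predict, which also returns out-of-fold predictions in the original sample order. The following minimal sketch illustrates this for a single learner on hypothetical random data (the variables here are for illustration only and are not part of the class):

import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.linear_model import Ridge

# Hypothetical data, for illustration only
x = np.random.rand(100, 5)
y = np.random.rand(100)

# Out-of-fold predictions, analogous to a single column of data_z in fit
oof = cross_val_predict(Ridge(), x, y, cv=KFold(n_splits=5))
print(oof.shape)  # (100,), one out-of-fold prediction per sample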

Finally, we define the predict function, which creates metadata for each level for the provided test set, following the same logic as fit (each level's metadata is stored and fed to the next level). The function returns the metadata of every level, as they are also each level's predictions. The ensemble's output can be accessed with meta_data[-1]:


    # --- SECTION 4 ---
    # The predict function. Creates metadata for the test data and returns
    # all of them. The actual predictions can be accessed with
    # meta_data[-1]
    def predict(self, x):

        # Create a list of metadata, one for each stacking level
        meta_data = [x]
        for i in range(len(self.learners)):
            level_size = self.level_sizes[i]

            data_z = np.zeros((level_size, len(x)))

            test_x = meta_data[i]

            # Create this level's metadata: each (already trained) learner
            # predicts on the previous level's metadata
            for j in range(len(self.learners[i])):
                learner = self.learners[i][j]
                predictions = learner.predict(test_x)
                data_z[j] = predictions

            # Add the data to the metadata list
            data_z = data_z.transpose()
            meta_data.append(data_z)

        # Return the metadata; the final level's predictions can be
        # accessed with meta_data[-1]
        return meta_data
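
Because predict returns the full list of metadata arrays rather than a flat prediction vector, a small helper such as the following (a hypothetical convenience function, not part of the class) can be used when only the ensemble's final predictions are needed as a one-dimensional array:

def final_predictions(stacking_regressor, x):
    # Return only the last level's predictions, flattened to a 1-D array
    return stacking_regressor.predict(x)[-1].ravel()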

If we instantiate StackingRegressor with the same meta-learner and base learners as our regression example, we can see that it performs exactly the same! In order to access a level's intermediate predictions, we must index meta_data with the level's index plus one, as meta_data[0] holds the original test data:

# --- SECTION 5 ---
# Use the ensemble
from sklearn.datasets import load_diabetes
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn import metrics
diabetes = load_diabetes()

train_x, train_y = diabetes.data[:400], diabetes.target[:400]
test_x, test_y = diabetes.data[400:], diabetes.target[400:]

base_learners = []

knn = KNeighborsRegressor(n_neighbors=5)
base_learners.append(knn)

dtr = DecisionTreeRegressor(max_depth=4, random_state=123456)
base_learners.append(dtr)

ridge = Ridge()
base_learners.append(ridge)

meta_learner = LinearRegression()

# Instantiate the stacking regressor
sc = StackingRegressor([[knn,dtr,ridge],[meta_learner]])

# Fit and predict
sc.fit(train_x, train_y)
meta_data = sc.predict(test_x)

# Evaluate base learners and meta-learner
base_errors = []
base_r2 = []
for i in range(len(base_learners)):
    learner = base_learners[i]
    predictions = meta_data[1][:, i]
    err = metrics.mean_squared_error(test_y, predictions)
    r2 = metrics.r2_score(test_y, predictions)
    base_errors.append(err)
    base_r2.append(r2)

err = metrics.mean_squared_error(test_y, meta_data[-1])
r2 = metrics.r2_score(test_y, meta_data[-1])

# Print the results
print('ERROR R2 Name')
print('-'*20)
for i in range(len(base_learners)):
    learner = base_learners[i]
    print(f'{base_errors[i]:.1f} {base_r2[i]:.2f} {learner.__class__.__name__}')
print(f'{err:.1f} {r2:.2f} Ensemble')

The results match our previous example's results:

ERROR R2 Name
--------------------
2697.8 0.51 KNeighborsRegressor
3142.5 0.43 DecisionTreeRegressor
2564.8 0.54 Ridge
2066.6 0.63 Ensemble
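
For reference, scikit-learn versions 0.22 and later ship a built-in stacking implementation in sklearn.ensemble. The following sketch shows how a comparable two-level ensemble could be expressed with it (assuming the imports and train/test split from Section 5); its scores should be in the same ballpark as ours, although they are not guaranteed to be identical, as the built-in class manages its cross-validation internally:

from sklearn.ensemble import StackingRegressor as SKStackingRegressor

stack = SKStackingRegressor(
    estimators=[('knn', KNeighborsRegressor(n_neighbors=5)),
                ('dtr', DecisionTreeRegressor(max_depth=4, random_state=123456)),
                ('ridge', Ridge())],
    final_estimator=LinearRegression())
stack.fit(train_x, train_y)
print(metrics.r2_score(test_y, stack.predict(test_x)))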

In order to further clarify the relationship between the meta_data and self.learners lists, we depict their interactions graphically in the following figure. We initialize meta_data[0] with the original input data for the sake of code simplicity. While it can be misleading to store the actual input data in the meta_data list, it avoids the need to handle the first level of base learners differently from the rest:

The relationships between each level of meta_data and self.learners
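
To make these relationships concrete, we can inspect the shapes of the metadata produced for the diabetes test set; the sketch below assumes the sc object and the meta_data list produced in Section 5:

# meta_data[0] is the original test data; each following entry holds
# a level's predictions (one column per learner in that level)
for i, z in enumerate(meta_data):
    print(i, z.shape)
# 0 (42, 10)  <- original test features
# 1 (42, 3)   <- base learners' predictions
# 2 (42, 1)   <- meta-learner's (ensemble's) predictions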