Implementing a boosted regressor

Implementing a boosted regressor follows the same syntax as the boosted classifier:

In [15]: from sklearn.ensemble import GradientBoostingRegressor
...      boost_reg = GradientBoostingRegressor(n_estimators=10,
...                                            random_state=3)

We saw earlier that a single decision tree can achieve a test score (the R² coefficient) of 79.3% on the Boston dataset, and that a bagged ensemble of 10 individual regression trees pushed that to 82.7%. But how does a boosted regressor compare?

Let's reload the Boston dataset and split it into training and test sets. We want to make sure we use the same value for random_state so that we end up training and testing on the same subsets of the data:

In [16]: from sklearn.datasets import load_boston
...      from sklearn.model_selection import train_test_split
...      dataset = load_boston()  # note: load_boston was removed in scikit-learn 1.2
...      X = dataset.data
...      y = dataset.target
In [17]: X_train, X_test, y_train, y_test = train_test_split(
...          X, y, random_state=3
...      )

As it turns out, the boosted decision tree ensemble actually performs worse than both of the earlier models:

In [18]: boost_reg.fit(X_train, y_train)
...      boost_reg.score(X_test, y_test)
Out[18]: 0.71991199075668488

This result might be confusing at first. After all, we used 10 times as many trees as the single decision tree model. Why would our score get worse?

This is a good example of a single expert being smarter than a group of weak learners. One possible solution is to make the ensemble larger; in fact, it is customary to use on the order of 100 weak learners in a boosted ensemble:

In [19]: boost_reg = GradientBoostingRegressor(n_estimators=100,
...                                            random_state=3)

Then, when we retrain the ensemble on the Boston dataset, we get a test score of 89.8%:

In [20]: boost_reg.fit(X_train, y_train)
...      boost_reg.score(X_test, y_test)
Out[20]: 0.89984081091774459

What happens when you increase the number to n_estimators=500? There's a lot more we could do by playing with the optional parameters.
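One convenient way to answer that question without retraining from scratch is scikit-learn's staged_predict method, which yields the ensemble's predictions after every boosting stage. Here is a minimal sketch (the variable names big_reg and scores are illustrative, not from the text) that grows the ensemble to 500 trees and traces the test score along the way:

from sklearn.metrics import r2_score

big_reg = GradientBoostingRegressor(n_estimators=500, random_state=3)
big_reg.fit(X_train, y_train)

# R^2 on the test set after each boosting stage (1, 2, ..., 500 trees)
scores = [r2_score(y_test, y_pred)
          for y_pred in big_reg.staged_predict(X_test)]

# check a few stages to see where the gains flatten out
for n in [10, 50, 100, 250, 500]:
    print(n, round(scores[n - 1], 3))

In most runs, the curve climbs steeply over the first hundred or so trees and then levels off, and it can even dip slightly once the ensemble starts to overfit, so beyond some point extra estimators mostly add training time.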

As you can see, boosting is a powerful procedure that lets you achieve substantial performance improvements by combining a large number of relatively simple learners.
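To make this concrete, here is a small sketch (again with illustrative variable names) that pits one fully grown decision tree against 300 boosted decision stumps, trees with a single split each, which are about the simplest learners imaginable:

from sklearn.tree import DecisionTreeRegressor

# one expressive tree, grown until its leaves are pure
single_tree = DecisionTreeRegressor(random_state=3)
single_tree.fit(X_train, y_train)
print('single tree:', single_tree.score(X_test, y_test))

# 300 stumps (max_depth=1), combined by gradient boosting
stump_boost = GradientBoostingRegressor(n_estimators=300, max_depth=1,
                                        random_state=3)
stump_boost.fit(X_train, y_train)
print('boosted stumps:', stump_boost.score(X_test, y_test))

On most splits, the boosted stumps comfortably beat the single tree, even though no individual stump could model the data on its own.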

A specific implementation of boosted decision trees is the AdaBoost algorithm, which we will talk about later in this chapter.