Summary

In this chapter, we discussed Random Forests, an ensemble method that uses decision trees as its base learners. We presented two basic methods of constructing the trees: the conventional Random Forests approach, in which a random subset of features is considered at each split, and Extra Trees, in which the split points are chosen almost at random. We discussed the basic characteristics of the ensemble method, and we presented regression and classification examples using the scikit-learn implementations of Random Forests and Extra Trees. The key points of this chapter are summarized below.

Random Forests use bagging in order to create training sets for their base learners. At each node, each tree considers only a subset of the available features and computes the optimal feature/split-point combination. The number of features to consider at each split is a hyper-parameter that must be tuned. Good starting points are as follows:

  • The square root of the total number of features for classification problems
  • One-third of the total number of features for regression problems
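These starting points can be explored directly through the `max_features` hyper-parameter of scikit-learn's implementation. The following is a minimal sketch, using a synthetic dataset and arbitrary candidate values chosen here for illustration, not taken from the chapter's examples:

```python
# Sketch: comparing max_features settings around the suggested starting point.
# The dataset and candidate values are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic classification problem with 25 features
X, y = make_classification(n_samples=300, n_features=25, random_state=0)

# "sqrt" is the suggested starting point for classification
# (sqrt(25) = 5 features per split); None means all features are considered
scores = {}
for max_features in ("sqrt", 10, None):
    clf = RandomForestClassifier(n_estimators=100,
                                 max_features=max_features,
                                 random_state=0)
    scores[max_features] = cross_val_score(clf, X, y, cv=5).mean()

for setting, score in scores.items():
    print(f"max_features={setting}: CV accuracy {score:.3f}")
```

From there, a grid or randomized search over `max_features` can refine the choice for a specific dataset.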

  • Extra Trees use the whole dataset to train each base learner.
  • In Extra Trees, instead of computing the optimal feature/split-point combination over the feature subset at each node, a random split point is generated for each feature in the subset and the best of these is selected.
  • Random Forests can provide information about the importance of each feature.
  • Although relatively resistant to overfitting, Random Forests are not immune to it.
  • Random Forests can exhibit high bias when the ratio of relevant to irrelevant features is low.
  • Random Forests can exhibit high variance, although the ensemble size does not contribute to the problem.

In the next chapter, we will present ensemble learning techniques that can be applied to unsupervised learning methods (clustering).
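As a brief recap of the Extra Trees and feature-importance points above, here is a minimal sketch using scikit-learn's `ExtraTreesRegressor` on a synthetic dataset (the data and hyper-parameter values are illustrative assumptions, not the chapter's examples):

```python
# Sketch: Extra Trees regression and per-feature importances.
# Note that ExtraTreesRegressor defaults to bootstrap=False, so each
# base learner is trained on the whole dataset.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic regression problem: 9 features, only 3 of them informative
X, y = make_regression(n_samples=300, n_features=9, n_informative=3,
                       random_state=0)

# max_features=3 follows the one-third-of-features starting point
reg = ExtraTreesRegressor(n_estimators=100, max_features=3, random_state=0)
reg.fit(X, y)

# Importances are non-negative and sum to 1; higher values indicate
# features the ensemble relied on more heavily
importances = reg.feature_importances_
ranked = np.argsort(importances)[::-1]
print("Features ranked by importance:", ranked)
```

With a low ratio of relevant to irrelevant features, the importances of the three informative features should dominate the ranking, which is one way to inspect the bias issue noted above.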
