Creating forests

By creating a number of trees using any valid randomization method, we have essentially created a forest, hence the algorithm's name. After generating the ensemble's trees, their predictions must be combined for the ensemble to be functional. This is usually achieved through majority voting for classification problems and through averaging for regression problems. A number of hyperparameters are associated with Random Forests, such as the number of features to consider at each node split, the number of trees in the forest, and the size of the individual trees. As mentioned earlier, a good starting point for the number of features to consider is as follows (a short example follows the list):

  • The square root of the number of total features for classification problems
  • One-third of the number of total features for regression problems
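
The combination rules and hyperparameters above can be put into practice with scikit-learn's Random Forest implementation; this is an assumption, as the library is not named here, and the dataset and parameter values are purely illustrative. Note that scikit-learn combines the trees by averaging their probabilistic predictions, a soft variant of majority voting.

    # A minimal Random Forest sketch, assuming scikit-learn is available.
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    x, y = load_digits(return_X_y=True)
    x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

    # max_features='sqrt' follows the square-root rule of thumb for
    # classification; a regressor would instead consider roughly
    # one-third of the features at each split.
    forest = RandomForestClassifier(n_estimators=100,
                                    max_features='sqrt',
                                    random_state=0)
    forest.fit(x_train, y_train)

    # Predictions are combined across the 100 trees (soft majority voting).
    print(forest.score(x_test, y_test))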

The total number of trees can be tuned by hand, as the ensemble's error converges to a limit as this number increases. Out-of-bag errors can be utilized to find an optimal value, as sketched below. Finally, the size of each tree can be a deciding factor in overfitting; thus, if overfitting is observed, the tree size should be reduced.
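
One way to apply the out-of-bag idea, again assuming scikit-learn: oob_score enables the out-of-bag estimate, and warm_start lets the same forest grow incrementally so the OOB error can be tracked as trees are added. Tree size can be limited through parameters such as max_depth or min_samples_leaf if overfitting is observed.

    # Sketch: tracking out-of-bag error as the forest grows, assuming
    # scikit-learn; the dataset and value range are illustrative.
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier

    x, y = load_digits(return_X_y=True)

    # warm_start=True reuses the already-fitted trees, so each call to
    # fit only adds the extra trees; oob_score=True computes the
    # out-of-bag estimate after each fit.
    forest = RandomForestClassifier(warm_start=True, oob_score=True,
                                    max_features='sqrt', random_state=0)

    for n in range(10, 210, 10):
        forest.set_params(n_estimators=n)
        forest.fit(x, y)
        print(n, 1.0 - forest.oob_score_)  # OOB error flattens as n grows

The printed OOB error typically drops quickly and then plateaus, which is the convergence behavior described above; a reasonable choice for the number of trees is one beyond which the error no longer improves meaningfully.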
