Motivation

Ensemble learning aims to address the problems of bias and variance. By combining many models, we can reduce the ensemble's error while retaining each individual model's complexity. As we saw earlier, there is a certain lower limit on each model's error, which is related to the model's complexity.

Furthermore, we mentioned that the same algorithm can produce different models, due to initial conditions, hyperparameters, and other factors. By combining different, diverse models, we can reduce the expected error of the group, even though each individual model remains unchanged. This improvement is due to statistics, rather than to pure learning.

In order to better demonstrate this, let's consider an ensemble of 11 base learners for a classification problem, each with a probability of misclassification (error) of err = 0.15, or 15%. Now, we want to create a simple ensemble that outputs whatever the majority of its base learners predicts. Assuming that the learners are diverse (in statistical terms, uncorrelated), the majority is wrong only when at least 6 of the 11 learners misclassify, and the probability of that happening is just 0.26%:
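The ensemble error is the binomial tail P(at least 6 of 11 wrong) = sum over k = 6 to 11 of C(11, k) * err^k * (1 - err)^(11 - k), which evaluates to roughly 0.26%. The following is a minimal sketch that verifies this number, assuming independent base learners and using scipy.stats.binom purely for the binomial probabilities:

from scipy.stats import binom

n, err = 11, 0.15
# The majority vote fails only if at least 6 of the 11 learners are wrong.
ensemble_error = sum(binom.pmf(k, n, err) for k in range(n // 2 + 1, n + 1))
print(f"Ensemble error: {ensemble_error:.4%}")  # ~0.2657%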

As is evident, the more base learners we add to the ensemble, the more accurate the ensemble becomes, under the condition that each learner is uncorrelated with the others. Of course, this is increasingly difficult to achieve in practice. Furthermore, the law of diminishing returns applies: each new uncorrelated base learner contributes less to the overall error reduction than the previously added one. The following figure shows the ensemble error percentage for increasing numbers of uncorrelated base learners. As is evident, the greatest reduction occurs when we add the first two uncorrelated base learners:

The relation between the number of base learners and the ensemble error
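The curve in the figure can be reproduced with a few lines of code; the following is a small sketch under the same assumptions as above (independent base learners, err = 0.15, and a strict majority vote, hence odd ensemble sizes):

from scipy.stats import binom

err = 0.15
for n in range(1, 16, 2):  # odd ensemble sizes avoid tied votes
    # The ensemble errs when more than half of its n learners err.
    ensemble_error = sum(binom.pmf(k, n, err) for k in range(n // 2 + 1, n + 1))
    print(f"{n:2d} learners -> ensemble error {ensemble_error:.4%}")

The drop from 1 to 3 learners (15% down to about 6.1%) is by far the largest; each subsequent pair of learners shaves off less and less of the remaining error.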