Optimizing the decision tree

We can try to optimize the tree's maximum depth in order to maximize F1 or recall. To do so, we will experiment with maximum depths in the range [3, 11] on the training set.
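A minimal sketch of such a depth sweep is shown below, assuming scikit-learn's DecisionTreeClassifier and cross-validated F1 and recall scores on the training split; the synthetic imbalanced data here is only a placeholder for the original and filtered training sets used in this section.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder imbalanced data standing in for the training set; substitute the
# original and filtered training splits when reproducing the experiment.
X_train, y_train = make_classification(n_samples=2000, weights=[0.98, 0.02],
                                       random_state=0)

# Sweep maximum depths 3 through 11, recording cross-validated F1 and recall.
for depth in range(3, 12):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    f1 = cross_val_score(tree, X_train, y_train, scoring='f1', cv=5).mean()
    recall = cross_val_score(tree, X_train, y_train, scoring='recall', cv=5).mean()
    print(f'max_depth={depth:2d}  F1={f1:.3f}  recall={recall:.3f}')
```

Running the same loop on both the original and the filtered training sets yields the curves plotted next.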

The following graph depicts the F1 score and recall for the various maximum depths, both for the original and filtered datasets:

Test metrics for various tree depths

Here, we observe that a maximum depth of 5 maximizes both F1 and recall on the filtered dataset, and it also maximizes recall on the original dataset. We will continue with a maximum depth of 5, as pushing the metrics further risks overfitting, especially because the number of instances relevant to these metrics is extremely small. At this depth, using the filtered dataset improves both F1 and recall compared to the original dataset.
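Continuing from the previous sketch, the selected model is then simply a tree capped at a maximum depth of 5; the variable names below are the same placeholders, standing in for the filtered training split.

```python
# Final model with the selected depth, trained on the (placeholder) training data.
final_tree = DecisionTreeClassifier(max_depth=5, random_state=0)
final_tree.fit(X_train, y_train)
```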
