Summary

This chapter was all about the binary classification problem: true or false or, for our example, whether a signal indicates the Higgs boson or is just background noise. We explored four different algorithms: a single decision tree, random forest, gradient boosted machine, and DNN. For this exact problem, DNNs are the current world-beaters, as the models can continue to train for longer (that is, with more epochs) and more layers can be added (http://papers.nips.cc/paper/5351-searching-for-higgs-boson-decay-modes-with-deep-learning.pdf).

In addition to exploring four algorithms and how to perform a grid search over many hyperparameters, we also looked at some important model metrics to help you better differentiate between models and understand ways to define what makes a model good. Our goal for this chapter was to expose you to a variety of algorithms and tweaks within Spark and H2O for solving binary classification problems. In the next chapter, we will explore multi-class classification and how to create ensembles of models (sometimes called super-learners) to arrive at a good solution for our real-world example.
