Summary

In this chapter, we reviewed two new classification techniques: k-nearest neighbors (KNN) and support vector machines (SVM). The goal was to learn how these techniques work, and how they differ, by building and comparing models on a common dataset to predict whether an individual had diabetes. For KNN, we covered both the unweighted and weighted nearest neighbor algorithms. Neither performed as well as the SVMs at predicting whether an individual had diabetes.

We examined how to build and tune both linear and nonlinear support vector machines using the e1071 package. We then used the extremely versatile caret package to compare the predictive ability of a linear and a nonlinear support vector machine, and saw that the nonlinear support vector machine with a sigmoid kernel performed best.

Finally, we touched on how you can use the caret package to perform a crude feature selection, since selecting features is difficult with a black-box technique such as SVM. This limitation is a major challenge when using these techniques, and you will need to consider how viable they are for addressing the business question.
