Summary

In this chapter, we presented the basic datasets, algorithms, and metrics that we will use throughout the book. We discussed regression and classification problems, where datasets have not only features but also targets; we called these labeled datasets. We also covered unsupervised learning, in the form of clustering and dimensionality reduction. We introduced the cost functions and model metrics that we will use to evaluate the models we generate, and presented the basic learning algorithms and Python libraries that we will utilize in the majority of our examples.

In the next chapter, we will introduce the concepts of bias and variance, as well as ensemble learning. Some key points to remember are as follows:

  • We try to solve a regression problem when the target variable is a continuous number and its values have a meaning in terms of magnitude, such as speed, cost, blood pressure, and so on. Classification problems can have their targets coded as numbers, but we cannot treat them as numeric quantities. There is no meaning in trying to sort colors or foods based on the numbers they are assigned during a problem's encoding.
  • Cost functions are a way to quantify how far away a predictive model is from modelling data perfectly. Metrics provide information that is easier for humans to understand and report.
  • All of the algorithms presented in this chapter have implementations for both classification and regression problems in scikit-learn. Some are better suited to particular tasks, at least without tuning their hyperparameters. Decision trees produce models that are easily interpreted by humans.
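The points above can be sketched briefly in code. The following is a minimal illustration, not an example from the chapter itself: it assumes scikit-learn is installed, uses the iris dataset as a stand-in labeled classification dataset, fits a decision tree, and reports a metric (accuracy) in a human-readable form.

```python
# Minimal sketch: a labeled dataset (features X, targets y), an algorithm
# (decision tree), and a metric (accuracy) reported for human consumption.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Iris is a labeled dataset: features plus a class target per instance.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Default hyperparameters; tuning them can change suitability for a task.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)

# A metric summarizes model quality in a form that is easy to report.
acc = accuracy_score(y_test, tree.predict(X_test))
print(f"Test accuracy: {acc:.2f}")
```

The same estimator family is available for regression as `DecisionTreeRegressor`, with metrics such as mean squared error taking the place of accuracy.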