Chapter 7. Supervised and Unsupervised Learning

The amount of data collected for various purposes in society has increased enormously in the last few decades. Machine learning is a way of making sense of all this data by leveraging what we know about the data. In the general picture of machine learning, the computer first learns from a given dataset (training) and builds a model that generalizes from it. With this model, it is possible to predict various outcomes, results, and groupings (classes). In this chapter, we will cover the following topics:

  • Linear regression with machine learning algorithms
  • Clustering with machine learning algorithms
  • Feature selection—a preprocessing method to select what is most important
  • Classification with different machine learning algorithms and kernels

Before getting started, I will give you a brief introduction to machine learning and the package that we will use: Scikit-learn.

Introduction to machine learning

There are three main categories of machine learning: supervised, unsupervised, and reinforcement learning. Given a simple dataset with input x and output y, supervised learning is when both x and y are known for the training data. The algorithm learns a mapping from x to y, and after training it can predict y values for new x inputs. In contrast, unsupervised learning is when only x is available and the algorithm has to find structure in the data, such as groupings, by itself. Reinforcement learning is when the computer is not given a mapping from the input to a known outcome and instead learns by acting in response to the input and receiving feedback. This is how algorithms that play chess or other games work. They try to learn how to react to input without a clearly quantifiable outcome for each move, instead seeking reinforcement; one example is to play the game repeatedly and be rewarded for reaching the end without making a mistake (that is, winning).

One feature-rich and popular package for machine learning in Python is Scikit-learn.
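
As a minimal sketch of the difference between supervised and unsupervised learning, the following uses Scikit-learn on made-up toy data. The choice of LinearRegression and KMeans, the variable names, and the data itself are assumptions for illustration only; they are not taken from the examples later in this chapter.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Toy data (invented for this sketch): inputs x with known outputs y = 2x + 1.
x = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

# Supervised: the model sees both x and y during training,
# then predicts y for an x it has not seen before.
regressor = LinearRegression()
regressor.fit(x, y)
predicted = regressor.predict(np.array([[5.0]]))

# Unsupervised: only the inputs are given; the algorithm has to
# find the two groups of points on its own.
points = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
                   [5.0, 5.1], [5.2, 5.0], [5.1, 5.2]])
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(points)

print("Predicted y for x = 5:", predicted[0])
print("Cluster assignment per point:", cluster_ids)
```

Note that the cluster numbers themselves (0 or 1) carry no meaning; what matters is which points end up in the same group. The number of clusters is supplied by hand here, which is itself an assumption about the data.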
