Types of machine learning

There are many ways to segment machine learning and dive deeper. In Chapter 1, How to Sound Like a Data Scientist, I mentioned statistical and probabilistic models. These models utilize statistics and probability, which we've seen in the previous chapters, in order to find relationships between data and make predictions. In this chapter, we will implement both types of models. In the following chapter, we will see machine learning outside the rigid mathematical world of statistics/probability. You can segment machine learning models by different characteristics, including the following:

  • The types of data/organic structures they utilize (tree/graph/neural network)
  • The field of mathematics they are most related to (statistical/probabilistic)
  • The level of computation required to train (deep learning)

For the purpose of education, I will offer my own breakdown of machine learning models. Branching off from the top level of machine learning, there are the following three subsets:

  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning

Supervised learning

Simply put, supervised learning finds associations between features of a dataset and a target variable. For example, supervised learning models might try to find the association between a person's health features (heart rate, obesity level, and so on) and that person's risk of having a heart attack (the target variable).

These associations allow supervised models to make predictions based on past examples. This is often the first thing that comes to people's minds when they hear the phrase machine learning, but it in no way encompasses the entire realm of machine learning. Supervised machine learning models are often called predictive analytics models, named for their ability to predict the future based on the past.

Supervised machine learning requires a certain type of data called labeled data. This means that we must teach our model by giving it historical examples that are labeled with the correct answer. Recall the facial recognition example. That is a supervised learning model because we are training our model with the previous pictures labeled as either face or not face, and then asking the model to predict whether or not a new picture has a face in it.

Specifically, supervised learning works using parts of the data to predict another part. First, we must separate data into two parts, as follows:

  • The predictors, which are the columns that will be used to make our prediction. These are sometimes called features, input values, variables, and independent variables.
  • The response, which is the column that we wish to predict. This is sometimes called outcome, label, target, and dependent variable.

Supervised learning attempts to find a relationship between the predictors and the response in order to make a prediction. The idea is that, in the future, a data observation will present itself and we will only know the predictors. The model will then have to use the predictors to make an accurate prediction of the response value.
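As a concrete sketch of this split (using a tiny, made-up patient table and the pandas library; the column names here are hypothetical), separating predictors from the response might look like this:

```python
import pandas as pd

# A tiny, hypothetical dataset: each row is one historical patient.
patients = pd.DataFrame({
    "cholesterol":    [220, 180, 260, 200],
    "blood_pressure": [140, 120, 160, 130],
    "heart_attack":   ["Y", "N", "Y", "N"],  # the known outcome (our label)
})

# Predictors (features): every column except the response.
X = patients.drop(columns="heart_attack")

# Response (target): the single column we wish to predict.
y = patients["heart_attack"]

print(X.shape, y.shape)  # (4, 2) (4,)
```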

Example – heart attack prediction

Suppose we wish to predict whether someone will have a heart attack within a year. To predict this, we are given that person's cholesterol level, blood pressure, height, their smoking habits, and perhaps more. From this data, we must ascertain the likelihood of a heart attack. Suppose, to make this prediction, we look at previous patients and their medical history. As these are previous patients, we know not only their predictors (cholesterol, blood pressure, and so on), but we also know if they actually had a heart attack (because it already happened!).

This is a supervised machine learning problem because we are doing the following:

  • We are making a prediction about someone
  • We are using historical training data to find relationships between medical variables and heart attacks:
    An overview of supervised models

The hope here is that a patient will walk in tomorrow and our model will be able to identify whether or not the patient is at risk for a heart attack based on her/his conditions (just like a doctor would!).

As the model sees more and more labeled data, it adjusts itself in order to match the correct labels given to us. We can use different metrics (explained later in this chapter) to pinpoint exactly how well our supervised machine learning model is doing and how it can better adjust itself.
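For instance, one of the simplest such metrics is accuracy, the fraction of predictions that match the known labels. A minimal sketch, with made-up labels:

```python
# Hypothetical predictions vs. the true labels for five patients.
actual    = ["Y", "N", "Y", "N", "N"]
predicted = ["Y", "N", "N", "N", "N"]

# Accuracy: the fraction of labels the model got right.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)
print(accuracy)  # 0.8 (right on 4 of the 5 examples)
```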

One of the biggest drawbacks of supervised machine learning is that we need this labeled data, which can be very difficult to get a hold of. Suppose we wish to predict heart attacks; we might need thousands of patients along with all of their medical information and years' worth of follow-up records for each person, which could be a nightmare to obtain.

In short, supervised models use historical labeled data in order to make predictions about the future. Some possible applications for supervised learning include the following:

  • Stock price predictions
  • Weather predictions
  • Crime predictions

Note how each of the preceding examples uses the word prediction, which makes sense seeing how I emphasized supervised learning's ability to make predictions about the future. Predictions, however, are not where the story ends.

Here is a visualization of how supervised models use labeled data to fit themselves and prepare to make predictions.

Note how the supervised model learns from a bunch of training data and then, when it is ready, it looks at unseen cases and outputs a prediction.

It's not only about predictions

Supervised learning exploits the relationship between the predictors and the response to make predictions, but sometimes, it is enough just knowing that there even is a relationship. Suppose we are using a supervised machine learning model to predict whether or not a customer will purchase a given item. A possible dataset might look as follows:

  Person ID   Age   Gender   Employed?   Bought the product?
  1           63    F        N           Y
  2           24    M        Y           N

Note that, in this case, our predictors are Age, Gender, and Employed?, while our response is Bought the product? This is because we want to see if, given someone's age, gender, and employment status, they will buy the product.

Assume that a model is trained on this data and can make accurate predictions about whether or not someone will buy something. That, in and of itself, is exciting, but there's something else that is arguably even more exciting. The fact that we can make accurate predictions implies that there is a relationship between these variables, which means that to know whether someone will buy your product, you only need to know their age, gender, and employment status! This might contradict previous market research suggesting that much more must be known about a potential customer to make such a prediction.

This speaks to supervised machine learning's ability to understand which predictors affect the response and how. For example: Are women more likely to buy the product? Which age groups are prone to decline it? Is there a combination of age and gender that is a better predictor than any one column on its own? As someone's age increases, do their chances of buying the product go up, go down, or stay the same?

It is also possible that not all of the columns are necessary. A possible output of a machine learning model might suggest that only certain columns are needed to make the prediction and that the other columns are only noise (they do not correlate with the response and therefore confuse the model).
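As an illustrative sketch (with made-up data, using scikit-learn's decision tree), a trained model can report that a noise column contributed nothing to its predictions:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: column 0 ("employed", 0/1) perfectly predicts the
# purchase, while column 1 is pure noise.
X = [[0, 5], [1, 3], [0, 8], [1, 1], [0, 2], [1, 9]]
y = [0, 1, 0, 1, 0, 1]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# The tree never needs the noise column, so its importance comes out as 0.
print(tree.feature_importances_)  # [1. 0.]
```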

Types of supervised learning

There are, in general, two types of supervised learning models: regression and classification. The difference between the two is quite simple and lies in the response variable.

Regression

Regression models attempt to predict a continuous response. This means that the response can take on a range of infinite values. Consider the following examples:

  • Dollar amounts
    • Salary
    • Budget
  • Temperature
  • Time
    • Generally recorded in seconds or minutes
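A minimal regression sketch (hypothetical salary data, using scikit-learn) in which the model predicts a continuous dollar amount:

```python
from sklearn.linear_model import LinearRegression

# Made-up data: years of experience vs. salary (a continuous response).
years  = [[1], [2], [3], [4], [5]]
salary = [40_000, 50_000, 60_000, 70_000, 80_000]

model = LinearRegression().fit(years, salary)

# The data lies on a perfect line, so 6 years maps to about 90,000.
print(model.predict([[6]]))
```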

Classification

Classification attempts to predict a categorical response, which means that the response only has a finite number of choices. Here are some examples:

  • Cancer grade (1, 2, 3, 4, 5)
  • True/false questions, such as the following examples:
    • "Will this person have a heart attack within a year?"
    • "Will you get this job?"
  • Given a photo of a face, who does this face belong to? (facial recognition)
  • Predict the year someone was born:
    • Note that there are many possible answers (over 100), but still finitely many
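A minimal classification sketch (made-up patient data, using scikit-learn's nearest-neighbor classifier) in which the model predicts one of two categories:

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical labeled data: [cholesterol, blood pressure] for past
# patients, labeled with whether they had a heart attack within a year.
X = [[220, 140], [180, 120], [260, 160], [190, 125]]
y = ["Y", "N", "Y", "N"]

clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)

# A new patient is assigned the label of the most similar past patient.
print(clf.predict([[250, 155]]))  # ['Y']
```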

Example – regression

The following graphs show the relationship between three predictor variables (age, year of birth, and education level) and a person's wage:


Regression examples (source: https://lagunita.stanford.edu/c4x/HumanitiesScience/StatLearning/asset/introduction.pdf)

Note that, even though some of the predictors are categorical, this is a regression example, because the y axis, our dependent variable, our response, is continuous.

Our earlier heart attack example is classification because the response was: will this person have a heart attack within a year? This has only two possible answers: Yes or No.

Data is in the eyes of the beholder

Sometimes, it can be tricky to decide whether you should use classification or regression. Suppose we are interested in the weather outside. We could ask the question, how hot is it outside? In this case, the answer is on a continuous scale, and some possible answers are 60.7 degrees or 98 degrees. However, as an exercise, go and ask 10 people what the temperature is outside. I guarantee you that someone (if not most people) will not answer with an exact degree but will bucket their answer and say something like it's in the 60s.

We might wish to treat this problem as a classification problem, where the response variable is no longer an exact degree but a bucket. In theory, there would only be a finite number of buckets, which might help the model learn the difference between, say, the 60s and the 70s a bit better.
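One way to sketch this bucketing, assuming ten-degree buckets:

```python
# Convert an exact temperature (a regression response) into a
# ten-degree bucket (a classification response).
def to_bucket(temp):
    low = int(temp // 10) * 10
    return f"{low}s"  # for example, 67.2 becomes "60s"

print(to_bucket(60.7), to_bucket(98.0))  # 60s 90s
```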

Unsupervised learning

The second type of machine learning does not deal with predictions but has a much more open objective. Unsupervised learning takes in a set of predictors and utilizes relationships between the predictors in order to accomplish tasks such as the following:

  • It reduces the dimension of the data by condensing variables together. An example of this would be file compression. Compression works by utilizing patterns in the data and representing the data in a smaller format.
  • It finds groups of observations that behave similarly and groups them together.

The first element on this list is called dimension reduction and the second is called clustering. Both of these are examples of unsupervised learning because they do not attempt to find a relationship between predictors and a specific response and therefore are not used to make predictions of any kind. Unsupervised models, instead, are utilized to find organizations and representations of the data that were previously unknown.
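A minimal clustering sketch (made-up points, using scikit-learn's k-means): the model is given no labels at all, yet it groups similar observations together:

```python
from sklearn.cluster import KMeans

# Six unlabeled observations that form two obvious groups.
points = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

# Each observation is assigned a cluster label the model invented itself.
print(km.labels_)  # the first three share one label, the last three the other
```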

The following screenshot is a representation of a cluster analysis:


Example of cluster analysis

The model will recognize that the observations in each uniquely colored cluster are similar to one another but different from those in the other clusters.

A big advantage of unsupervised learning is that it does not require labeled data, which means that it is much easier to obtain data that works with unsupervised learning models. Of course, a drawback to this is that we lose all predictive power, because the response variable holds the information needed to make predictions, and without it our model will be hopeless at making any sort of prediction.

A big drawback is that it is difficult to see how well we are doing. In a regression or classification problem, we can easily tell how well our models are predicting by comparing our models' answers to the actual answers. For example, if our supervised model predicts rain and it is sunny outside, the model was incorrect. If our supervised model predicts the price will go up by 1 dollar and it goes up by 99 cents, our model was very close! In unsupervised modeling, this concept is foreign, because we have no correct answers to compare our models to. Unsupervised models merely suggest differences and similarities, which then require a human's interpretation:


An overview of unsupervised models

In short, the main goal of unsupervised models is to find similarities and differences between data observations. We will discuss unsupervised models in depth in later chapters.

Reinforcement learning

In reinforcement learning, algorithms get to choose an action in an environment and then are rewarded (positively or negatively) for choosing this action. The algorithm then adjusts itself and modifies its strategy in order to accomplish some goal, which is usually to get more rewards.

This type of machine learning is very popular in AI-assisted gameplay, as agents (the AI) are allowed to explore a virtual world, collect rewards, and learn the best navigation techniques. This model is also popular in robotics, especially in the field of autonomous machinery, including self-driving cars:


Self-driving cars (image source: https://www.quora.com/How-do-Googles-self-driving-cars-work)

Self-driving cars read in sensor input, act accordingly, and are then rewarded for taking a certain action. The car then adjusts its behavior to collect more rewards. Reinforcement learning can be thought of as similar to supervised learning, in that the agent learns from its past actions to make better moves in the future; however, the main difference lies in the reward. The reward does not have to be tied in any way to a correct or incorrect decision; it simply encourages (or discourages) different actions.
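A toy sketch of this reward loop, assuming a hypothetical two-action "bandit" environment where one action pays off more often than the other (all numbers here are made up for illustration):

```python
import random

random.seed(0)

# Two possible actions; action 1 is rewarded more often than action 0.
def reward(action):
    return 1 if random.random() < (0.2, 0.8)[action] else 0

values  = [0.0, 0.0]  # the agent's running estimate of each action's reward
counts  = [0, 0]
epsilon = 0.1         # how often the agent explores a random action

for _ in range(1000):
    if random.random() < epsilon:
        action = random.randrange(2)        # explore: try something random
    else:
        action = values.index(max(values))  # exploit: best-looking action
    r = reward(action)
    counts[action] += 1
    values[action] += (r - values[action]) / counts[action]  # running mean

# After enough trials, the agent's estimates favor the better action.
print(values.index(max(values)))
```

Note how the agent is erratic at first: it happily exploits the worse action until exploration stumbles onto the better one, which mirrors the trial-and-error nature of reinforcement learning.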

Reinforcement learning is the least explored of the three types of machine learning and is therefore not covered at great length in this text. The remainder of this chapter will focus on supervised and unsupervised learning.

Overview of the types of machine learning

Having seen the three types of machine learning (supervised, unsupervised, and reinforcement learning), we can imagine the world of machine learning as something like this:

Overview of the types of machine learning

Each of the three types of machine learning has its benefits and its drawbacks, as listed:

  • Supervised machine learning: This exploits relationships between predictors and response variables to make predictions of future data observations. The pros are as follows:
    • It can make future predictions
    • It can quantify relationships between predictors and response variables
    • It can show us how variables affect each other and how much

    The cons are as follows:

    • It requires labeled data (which can be difficult to get)
  • Unsupervised machine learning: This finds similarities and differences between data points. The pros are as follows:
    • It can find groups of data points that behave similarly that a human would never have noted.
    • It can be a preprocessing step for supervised learning. Think of clustering a bunch of data points and then using these clusters as the response.
    • It can use unlabeled data, which is much easier to find.

    The cons are as follows:

    • It has zero predictive power
    • It can be hard to determine if we are on the right track
    • It relies much more on human interpretation
  • Reinforcement learning: This is reward-based learning that encourages agents to take particular actions in their environments. The pros are as follows:
    • Very complicated reward systems create very complicated AI systems
    • It can learn in almost any environment, including our own Earth

    The cons are as follows:

    • The agent is erratic at first and makes many terrible choices before realizing that these choices have negative rewards

    For example, a car might crash into a wall and not know that that is not okay until the environment negatively rewards it

    • It can take a while before the agent learns to avoid these poor decisions altogether
    • The agent might play it safe and only choose one action and be "too afraid" to try anything else for fear of being punished