Decision Tree Learning

Decision Tree Learning uses past observations to learn how to classify them, and then uses what it has learned to predict the class of a new observation. For example, a bank may have historical information on the loans it has granted. This information usually includes a customer profile and whether the customer defaulted or not. Based on this information, the algorithm can learn to predict whether a new customer will default.
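To make the idea of learning from past observations more concrete, the following is a minimal Python sketch of the simplest possible tree: a single split on one numeric attribute. It is not how Rattle builds its trees. The ages and outcomes are made-up illustration data, the function names are hypothetical, and Gini impurity is assumed as the split criterion because the text does not name one.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all labels agree)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(ages, labels):
    """Try the midpoint between each pair of neighbouring ages and return
    (threshold, weighted_impurity) for the split with the lowest impurity."""
    values = sorted(set(ages))
    best = None
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2
        left = [y for a, y in zip(ages, labels) if a < t]
        right = [y for a, y in zip(ages, labels) if a >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


# Made-up past observations (illustration only, not real loan data).
ages = [22, 23, 24, 30, 35, 40]
defaulted = ["Yes", "Yes", "Yes", "No", "No", "No"]

# "Learning": pick the split and the majority class on each side of it.
threshold, impurity = best_threshold(ages, defaulted)
left_label = majority([y for a, y in zip(ages, defaulted) if a < threshold])
right_label = majority([y for a, y in zip(ages, defaulted) if a >= threshold])


def predict(age):
    """Predict the class of a new observation using the learned split."""
    return left_label if age < threshold else right_label


print(threshold, predict(21), predict(45))
```

A real Decision Tree learner repeats this kind of search over every attribute and then again inside each resulting branch, which is how a tree with several levels is obtained.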

We usually represent a Decision Tree as shown in the following diagram. The root node is at the top, and the leaves of the tree are at the bottom; each leaf represents a decision. To create rules from a tree, we start at the root node and work downwards towards the leaves. The following diagram represents a sample Decision Tree:

[Figure: sample Decision Tree]

After studying the preceding diagram of a Decision Tree, we can obtain these rules:

If Purpose = 'Education' AND Sex = 'male' AND Age > 25 Then No Default
If Purpose = 'Education' AND Sex = 'male' AND Age < 25 Then Yes Default

As you can see, a tree is easy to translate into a set of rules, or if-then statements. This makes the rules easy to evaluate in Rattle, or in any other tool, language, or system, such as Qlik Sense.
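As an illustration, the two rules above could be written in Python as plain if statements. This is only a sketch: the function name classify_loan and its parameters are hypothetical, and it returns None for any case the two rules do not cover, including Age = 25, because the rest of the tree is not given in the text.

```python
def classify_loan(purpose, sex, age):
    """Apply the two rules read from the sample tree.

    Returns "No Default" or "Yes Default", or None when neither rule
    applies (the other branches of the tree are not shown in the text).
    """
    if purpose == "Education" and sex == "male" and age > 25:
        return "No Default"
    if purpose == "Education" and sex == "male" and age < 25:
        return "Yes Default"
    return None  # not covered by the two rules, for example age == 25


# Example applications (hypothetical values).
print(classify_loan("Education", "male", 30))
print(classify_loan("Education", "male", 20))
```

Each path from the root to a leaf becomes one if statement, so a complete tree would produce one statement per leaf.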

Finally, a human expert can understand the rules and the knowledge learned by the algorithm. In this way, a credit manager can understand and review why a computer has classified a loan application as risky or not risky.

In short, the main advantages of Decision Tree Learning are:

  • The technique is simple
  • It requires little data preparation
  • The result is simple to understand for a human expert
  • It is easy to visually represent

On the other hand, the main disadvantages are:

  • Unstable: A small change in the input data can produce a big change in the output.
  • Overfitting: Sometimes, Decision Tree Learners create very complex trees that do not generalize well from the data. In other words, the algorithm learns how to classify the training dataset, but fails to classify new observations.