Decision Tree Learning uses past observations to learn how to classify them, and then uses what it has learned to predict the class of a new observation. For example, a bank may have historical information on the loans it has granted. Usually, this information includes a customer profile and whether the customer defaulted or not. Based on this information, the algorithm can learn to predict whether a new customer will default.
We usually represent a Decision Tree with the root node at the top and the leaves at the bottom; each leaf represents a decision. To create rules from a tree, we start from the root node and work downwards towards the leaves. The following diagram represents a sample Decision Tree:
After studying the preceding diagram of a Decision Tree, we can obtain these rules:
If Purpose = 'Education' AND Sex = 'male' AND Age > 25 Then No Default
If Purpose = 'Education' AND Sex = 'male' AND Age < 25 Then Yes Default
As you can see, a tree is easy to translate into a set of rules, or if-then statements. This makes the rules easy to evaluate in Rattle, or in any other language or system, such as Qlik Sense.
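To make the translation concrete, the two rules above can be written as an ordinary function. The following is a minimal Python sketch, not output from Rattle or Qlik Sense; the function name `predict_default` and its argument names are hypothetical. The rules as listed cover only male applicants asking for an education loan, and they do not say what happens when Age is exactly 25, so the sketch returns `None` for any case the rules do not cover.

```python
from typing import Optional


def predict_default(purpose: str, sex: str, age: int) -> Optional[str]:
    """Apply the two rules read from the sample tree.

    Returns "Yes" or "No" (will the customer default?), or None when
    neither rule applies.
    """
    if purpose == "Education" and sex == "male" and age > 25:
        return "No"
    if purpose == "Education" and sex == "male" and age < 25:
        return "Yes"
    # The remaining branches of the tree are not listed in the rules,
    # so this sketch makes no prediction for them.
    return None
```

Calling `predict_default("Education", "male", 30)` follows the first rule, while an applicant aged 22 with the same purpose and sex follows the second.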
Finally, a human expert can understand the rules and the knowledge learned by the algorithm; in this way, a credit manager can understand and review why a computer has classified a loan application as dangerous or not dangerous.
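The root-to-leaf reading described earlier can also be automated, so that a reviewer receives the rules as plain text. The sketch below assumes a hypothetical encoding in which each internal node is a dictionary that maps a condition to a subtree, and each leaf is a string holding the decision. Only the branch described by the sample rules is encoded; the rest of the tree is not reproduced here.

```python
# Hypothetical encoding of the single branch covered by the sample rules.
# Internal node: dict mapping a condition to a subtree. Leaf: decision string.
sample_tree = {
    "Purpose = 'Education'": {
        "Sex = 'male'": {
            "Age > 25": "No Default",
            "Age < 25": "Yes Default",
        }
    }
}


def extract_rules(node, path=()):
    """Walk from the root towards the leaves and emit one rule per leaf."""
    if isinstance(node, str):
        # A leaf: join the conditions collected on the way down.
        return ["If " + " AND ".join(path) + " Then " + node]
    rules = []
    for condition, child in node.items():
        rules.extend(extract_rules(child, path + (condition,)))
    return rules


for rule in extract_rules(sample_tree):
    print(rule)
```

Each printed line corresponds to one path from the root to a leaf, which is the form a credit manager would read when reviewing the model.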
In short, the main advantages of Decision Tree Learning are:
On the other hand, the main disadvantages are: