Now, let's apply the decision tree classification algorithm to the common problem that we defined previously: predicting whether a customer ends up purchasing a product.
- To do so, first, let's instantiate the decision tree classification algorithm and train a model using the training portion of the data that we prepared for our classifiers:
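import sklearn.tree
# Entropy-based splits; max_depth=2 keeps the tree shallow, and random_state makes the result reproducible
classifier = sklearn.tree.DecisionTreeClassifier(criterion='entropy', random_state=100, max_depth=2)
classifier.fit(X_train, y_train)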
- Now, let's use our trained model to predict the labels for the testing portion of our labeled data. Let's generate a confusion matrix that can summarize the performance of our trained model:
import sklearn.metrics as metrics
y_pred = classifier.predict(X_test)
cm = metrics.confusion_matrix(y_test, y_pred)
cm
This gives the following output:
- Now, let's calculate the accuracy, recall, and precision values for the classifier that we trained with the decision tree classification algorithm:
accuracy = metrics.accuracy_score(y_test, y_pred)
recall = metrics.recall_score(y_test, y_pred)
precision = metrics.precision_score(y_test, y_pred)
print(accuracy, recall, precision)
- Running the preceding code will produce the following output:
These performance measures give us a common basis for comparing different modeling techniques trained on the same data.
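To see how the three measures relate to the confusion matrix, the steps above can be sketched end to end on stand-in data. This is a minimal sketch, not the chapter's dataset: the synthetic data from `make_classification`, the 75/25 split, and the sample and feature counts are all assumptions made so that the snippet runs on its own. It derives accuracy, recall, and precision directly from the four cells of the confusion matrix.

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary data standing in for the customer purchase dataset
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=1, random_state=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=100)

# Same hyperparameters as in the steps above
classifier = DecisionTreeClassifier(criterion='entropy', random_state=100, max_depth=2)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# For binary labels, the flattened confusion matrix is ordered tn, fp, fn, tp
tn, fp, fn, tp = metrics.confusion_matrix(y_test, y_pred).ravel()

# Accuracy: share of all predictions that are correct
accuracy = (tp + tn) / (tp + tn + fp + fn)
# Recall: share of actual positives that the model found
recall = tp / (tp + fn)
# Precision: share of predicted positives that are actually positive
# (guarded in case the model predicts no positives at all)
precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0

print(accuracy, recall, precision)
```

The hand-computed values should agree with `metrics.accuracy_score`, `metrics.recall_score`, and `metrics.precision_score`, which is a quick way to check that the confusion matrix is being read in the right order.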