The following steps will help you build a Naive Bayes classifier:
- We can compare the result to that of a true Naive Bayes classifier by turning to scikit-learn:
In [13]: from sklearn import naive_bayes
... model_naive = naive_bayes.GaussianNB()
- As usual, training the classifier is done via the fit method:
In [14]: model_naive.fit(X_train, y_train)
Out[14]: GaussianNB(priors=None)
- Scoring the classifier is built in:
In [15]: model_naive.score(X_test, y_test)
Out[15]: 1.0
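The fit-and-score steps above can be reproduced end to end on synthetic data. The following is a minimal sketch; the `make_blobs` dataset is an assumption standing in for the chapter's toy data, not the book's exact setup:

```python
import numpy as np
from sklearn import datasets, model_selection, naive_bayes

# Two well-separated clusters stand in for the chapter's toy data (assumption)
X, y = datasets.make_blobs(n_samples=100, centers=2,
                           cluster_std=1.0, random_state=42)
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.1, random_state=42)

model_naive = naive_bayes.GaussianNB()
model_naive.fit(X_train, y_train)          # training via the fit method
score = model_naive.score(X_test, y_test)  # mean accuracy on the test set
```

On data this cleanly separated, `score` comes out at or near 1.0, matching the perfect score seen above.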
- Again a perfect score! However, in contrast to OpenCV, this classifier's predict_proba method returns true probability values: every entry lies between 0 and 1, and each row sums to 1:
In [16]: yprob = model_naive.predict_proba(X_test)
... yprob.round(2)
Out[16]: array([[ 0., 1.],
                [ 0., 1.],
                [ 0., 1.],
                [ 1., 0.],
                [ 1., 0.],
                [ 1., 0.],
                [ 0., 1.],
                [ 0., 1.],
                [ 1., 0.],
                [ 1., 0.]])
You might have noticed something else: this classifier has absolutely no doubt about the target label of each and every data point. It's all or nothing.
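The two probability properties mentioned above can be checked directly. This sketch assumes a GaussianNB fitted on hypothetical blob data (an assumption, not the book's dataset) and verifies that every probability lies in [0, 1] and that each row sums to 1:

```python
import numpy as np
from sklearn import datasets, naive_bayes

# Hypothetical two-class data (assumption, not the book's dataset)
X, y = datasets.make_blobs(n_samples=100, centers=2, random_state=42)
model = naive_bayes.GaussianNB().fit(X, y)

yprob = model.predict_proba(X)
in_range = np.all((yprob >= 0) & (yprob <= 1))        # all entries in [0, 1]
rows_sum_to_one = np.allclose(yprob.sum(axis=1), 1.0)  # rows are distributions
```

On well-separated data the per-class Gaussian likelihoods differ by many orders of magnitude, which is why the rounded rows collapse to 0 and 1.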
- The decision boundary returned by the Naive Bayes classifier looks slightly different, but for the purpose of this exercise it can be considered equivalent to the previous one:
In [17]: plot_decision_boundary(model_naive, X, y)
The output looks like this:
The preceding screenshot shows the decision boundary of the Naive Bayes classifier.
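The plot_decision_boundary helper is defined earlier in the book; as a reminder, a minimal sketch of such a helper might look like the following (the mesh step size h and the matplotlib styling are assumptions):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render off-screen
import matplotlib.pyplot as plt
from sklearn import datasets, naive_bayes

def plot_decision_boundary(model, X, y, h=0.1):
    # Evaluate the classifier on a dense grid covering the data range
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    # Color the grid by predicted class and overlay the data points
    plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)
    plt.scatter(X[:, 0], X[:, 1], c=y, s=30)
    return Z

# Hypothetical two-class data (assumption, not the book's dataset)
X, y = datasets.make_blobs(n_samples=100, centers=2, random_state=42)
model = naive_bayes.GaussianNB().fit(X, y)
Z = plot_decision_boundary(model, X, y)
```

The boundary GaussianNB draws here is quadratic in general, since each class gets its own per-feature variance.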