Classifying the data with a Naive Bayes classifier

The following steps will help you build a Naive Bayes classifier:

  1. We can compare the result to a true Naive Bayes classifier by asking scikit-learn for help:
In [13]: from sklearn import naive_bayes
... model_naive = naive_bayes.GaussianNB()
  2. As usual, training the classifier is done via the fit method:
In [14]: model_naive.fit(X_train, y_train)
Out[14]: GaussianNB(priors=None)
  3. Scoring the classifier is built in:
In [15]: model_naive.score(X_test, y_test)
Out[15]: 1.0
  4. Again, a perfect score! However, in contrast to OpenCV, this classifier's predict_proba method returns true probability values, since every value lies between 0 and 1 and each row sums to 1:
In [16]: yprob = model_naive.predict_proba(X_test)
... yprob.round(2)
Out[16]: array([[ 0., 1.],
                [ 0., 1.],
                [ 0., 1.],
                [ 1., 0.],
                [ 1., 0.],
                [ 1., 0.],
                [ 0., 1.],
                [ 0., 1.],
                [ 1., 0.],
                [ 1., 0.]])

You might have noticed something else: this classifier has absolutely no doubt about the target label of each and every data point. It's all or nothing.

  5. The decision boundary returned by the Naive Bayes classifier looks slightly different, but for the purpose of this exercise it can be considered identical to the one produced by the previous command:
In [17]: plot_decision_boundary(model_naive, X, y)

The output looks like this:

The preceding screenshot shows a decision boundary of a Naive Bayes classifier.
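The steps above can be collected into one minimal, self-contained sketch. Since the chapter's dataset and train/test split are defined earlier in the book, this example substitutes synthetic two-class data (the variable names `X_train`, `y_train`, and so on mirror the text, but the data itself is an assumption; the book's `plot_decision_boundary` helper is omitted here):

```python
import numpy as np
from sklearn import naive_bayes
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the chapter's data: two well-separated
# Gaussian blobs of 50 points each (an assumption, not the book's data)
rng = np.random.RandomState(42)
X = np.vstack([rng.randn(50, 2) + [2, 2], rng.randn(50, 2) - [2, 2]])
y = np.hstack([np.zeros(50, dtype=int), np.ones(50, dtype=int)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=42)

# Train and score the Gaussian Naive Bayes classifier, as in the text
model_naive = naive_bayes.GaussianNB()
model_naive.fit(X_train, y_train)
print(model_naive.score(X_test, y_test))

# predict_proba returns true probabilities: each row is a distribution
# over the two classes, so every row sums to 1
yprob = model_naive.predict_proba(X_test)
print(yprob.round(2))
```

On data this well separated, the rounded probabilities come out as 0s and 1s, reproducing the all-or-nothing behavior discussed above.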
