It is important to understand Bayes' theorem before diving into the classifier. Let A and B denote two events. An event could be that it will rain tomorrow, that two kings are drawn from a deck of cards, or that a person has cancer. In Bayes' theorem, P(A|B) is the probability that A occurs given that B is true. It can be computed as follows:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Here, P(B|A) is the probability of observing B given that A occurs, while P(A) and P(B) are the probabilities that A and B occur, respectively. Too abstract? Let's look at some concrete examples:
- Example 1: Given two coins, one is unfair, landing heads on 90% of flips and tails on 10%, while the other one is fair. Randomly pick one coin and flip it. If we get a head, what is the probability that this coin is the unfair one?
We solve it by first denoting U as the event of picking the unfair coin, F as the event of picking the fair coin, and H as the event of getting a head. The probability that the unfair coin was picked given that we got a head, P(U|H), can then be calculated as follows:

$$P(U \mid H) = \frac{P(H \mid U)\,P(U)}{P(H)}$$

We know that P(H|U) is 90%, and P(U) is 0.5 because we randomly pick one coin out of two. However, deriving the probability of getting a head, P(H), is less straightforward, because two events can lead to a head: picking the unfair coin (U) or picking the fair coin (F):

$$P(H) = P(H \mid U)\,P(U) + P(H \mid F)\,P(F)$$

So P(U|H) becomes the following:

$$P(U \mid H) = \frac{P(H \mid U)\,P(U)}{P(H \mid U)\,P(U) + P(H \mid F)\,P(F)} = \frac{0.9 \times 0.5}{0.9 \times 0.5 + 0.5 \times 0.5} = \frac{0.45}{0.70} \approx 0.64$$
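The arithmetic in Example 1 can be checked with a few lines of Python. This is only a minimal sketch; the variable names are chosen here for illustration and do not come from the text.

```python
# Example 1: which coin did we pick, given that the flip came up heads?
# Variable names are illustrative; the numbers come from the example.
p_u = 0.5           # P(U): prior probability of picking the unfair coin
p_f = 0.5           # P(F): prior probability of picking the fair coin
p_h_given_u = 0.9   # P(H|U): the unfair coin lands heads 90% of the time
p_h_given_f = 0.5   # P(H|F): the fair coin lands heads 50% of the time

# Law of total probability: a head can come from either coin
p_h = p_h_given_u * p_u + p_h_given_f * p_f

# Bayes' theorem
p_u_given_h = p_h_given_u * p_u / p_h

print(f"P(H)   = {p_h:.2f}")
print(f"P(U|H) = {p_u_given_h:.4f}")
```

Seeing a head raises the probability of the unfair coin from the prior of 0.5 to about 0.64.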
- Example 2: Suppose a physician reported the following cancer screening test scenario among 10,000 people:
| | Cancer | No cancer | Total |
|---|---|---|---|
| Test positive | 80 | 900 | 980 |
| Test negative | 20 | 9,000 | 9,020 |
| Total | 100 | 9,900 | 10,000 |
The table shows, for example, that 80 out of 100 cancer patients are correctly diagnosed while the other 20 are missed, and that cancer is falsely detected in 900 out of 9,900 healthy people.
If the result of this screening test on a person is positive, what is the probability that they actually have cancer?
Let's denote the event of having cancer as C and the event of a positive test result as Pos. Apply Bayes' theorem to calculate P(C|Pos):

$$P(C \mid Pos) = \frac{P(Pos \mid C)\,P(C)}{P(Pos)} = \frac{\frac{80}{100} \times \frac{100}{10000}}{\frac{980}{10000}} = \frac{80}{980} \approx 8.16\%$$

Given a positive screening result, the chance that the subject has cancer is 8.16%, which is significantly higher than the 1% base rate (100/10,000) that applies when no screening has been done.
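The same result can be reproduced directly from the counts in the table. The sketch below uses illustrative variable names (not from the text) and also shows that Bayes' theorem agrees with simply reading the answer off the "Test positive" row.

```python
# Example 2: counts taken from the screening table (10,000 people)
true_pos, false_pos = 80, 900    # "Test positive" row
false_neg, true_neg = 20, 9000   # "Test negative" row
total = true_pos + false_pos + false_neg + true_neg

p_c = (true_pos + false_neg) / total                 # P(C): base rate of cancer
p_pos_given_c = true_pos / (true_pos + false_neg)    # P(Pos|C)
p_pos = (true_pos + false_pos) / total               # P(Pos)

# Bayes' theorem
p_c_given_pos = p_pos_given_c * p_c / p_pos

# Cross-check: share of positive tests that are true positives
direct = true_pos / (true_pos + false_pos)

print(f"P(C)     = {p_c:.2%}")
print(f"P(C|Pos) = {p_c_given_pos:.2%}")
```

Both routes give 80/980, which is why the posterior is so much lower than the test's 80% detection rate: the 900 false positives far outnumber the 80 true positives.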
- Example 3: Three machines A, B, and C in a factory account for 35%, 20%, and 45% of the bulb production, respectively. The fraction of defective bulbs produced by each machine is 1.5%, 1%, and 2%, respectively. A bulb produced by this factory was identified as defective, which is denoted as event D. What are the probabilities that this bulb was manufactured by machine A, B, and C, respectively?
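Again, we simply follow Bayes' theorem, expanding P(D) over the three machines:

$$P(A \mid D) = \frac{P(D \mid A)\,P(A)}{P(D)} = \frac{0.015 \times 0.35}{0.015 \times 0.35 + 0.01 \times 0.2 + 0.02 \times 0.45} \approx 0.323$$

$$P(B \mid D) = \frac{P(D \mid B)\,P(B)}{P(D)} = \frac{0.01 \times 0.2}{0.015 \times 0.35 + 0.01 \times 0.2 + 0.02 \times 0.45} \approx 0.123$$

$$P(C \mid D) = \frac{P(D \mid C)\,P(C)}{P(D)} = \frac{0.02 \times 0.45}{0.015 \times 0.35 + 0.01 \times 0.2 + 0.02 \times 0.45} \approx 0.554$$

Alternatively, we do not even need to calculate P(D), since the three posteriors share the same denominator and are therefore proportional to their numerators:

$$P(A \mid D) : P(B \mid D) : P(C \mid D) = P(D \mid A)\,P(A) : P(D \mid B)\,P(B) : P(D \mid C)\,P(C) = 21 : 8 : 36$$

We also know that the three posteriors must sum to 1, because the defective bulb came from exactly one of the machines:

$$P(A \mid D) + P(B \mid D) + P(C \mid D) = 1$$

So we have the following:

$$P(A \mid D) = \frac{21}{21 + 8 + 36} \approx 0.323, \quad P(B \mid D) = \frac{8}{21 + 8 + 36} \approx 0.123, \quad P(C \mid D) = \frac{36}{21 + 8 + 36} \approx 0.554$$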
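The normalization shortcut in Example 3 maps naturally onto code: compute the unnormalized products P(D|machine) × P(machine) and divide each by their sum. The following is a minimal sketch; the dictionary names are hypothetical and chosen only for this illustration.

```python
# Example 3: which machine produced the defective bulb?
# Dictionary names are illustrative; the numbers come from the example.
priors = {"A": 0.35, "B": 0.20, "C": 0.45}          # P(machine)
defect_rates = {"A": 0.015, "B": 0.01, "C": 0.02}   # P(D|machine)

# Unnormalized posteriors: P(D|machine) * P(machine)
joint = {m: defect_rates[m] * priors[m] for m in priors}

# P(D) is the sum of the products, so dividing by it makes the
# posteriors sum to 1 without computing P(D) separately.
p_d = sum(joint.values())
posteriors = {m: joint[m] / p_d for m in joint}

for machine, p in posteriors.items():
    print(f"P({machine}|D) = {p:.3f}")
```

The same pattern, scoring each candidate by likelihood times prior and then normalizing, works for any number of candidates.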
After making sense of Bayes' theorem as the backbone of Naïve Bayes, we can easily move forward with the classifier itself.