Laplace smoothing

One thing we haven't covered is what happens if a word in the email you're classifying never appeared in your training set. Its count would be zero, which gives a conditional probability of zero and breaks the estimate for the whole email. To handle this case, we add a smoothing factor, alpha, to every word count. The following modified code shows where alpha is added:

# Gives the conditional probability p(B_i | A_x) with smoothing.
# alpha is the smoothing factor; numWords is assumed to be the number of
# distinct words seen in training.
def conditionalWord(word, spam):
    if spam:
        return (trainPositive.get(word, 0) + alpha) / float(positiveTotal + alpha * numWords)
    return (trainNegative.get(word, 0) + alpha) / float(negativeTotal + alpha * numWords)
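
To see the effect of alpha on concrete numbers, here is a minimal standalone sketch. The counts, the vocabulary size, and the helper name `smoothed` are invented for illustration and are not part of the classifier above; it only assumes that the denominator adds alpha once for each distinct word.

```python
# Toy data, invented for illustration only.
train_positive = {"free": 3, "offer": 2}  # word counts observed in spam
positive_total = 5                        # total number of words seen in spam
num_words = 4                             # assumed vocabulary size


def smoothed(count, total, alpha, vocab_size):
    """Smoothed estimate: (count + alpha) / (total + alpha * vocab_size)."""
    return (count + alpha) / float(total + alpha * vocab_size)


# A word that appears in the spam counts: (3 + 1) / (5 + 1 * 4) = 4/9
seen = smoothed(train_positive.get("free", 0), positive_total, 1.0, num_words)

# A word that was never seen in spam: (0 + 1) / (5 + 1 * 4) = 1/9
unseen = smoothed(train_positive.get("meeting", 0), positive_total, 1.0, num_words)

# The same unseen word with alpha = 0 (no smoothing): 0 / 5 = 0.0
unsmoothed = smoothed(train_positive.get("meeting", 0), positive_total, 0.0, num_words)
```

With alpha set to 1, the unseen word gets a small but nonzero probability, while the seen word still scores higher. With alpha set to 0, the unseen word falls back to a probability of exactly zero.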

