Summary

Naive Bayes is a simple algorithm, built on Bayes' theorem, that can produce a robust classifier. With a little simple math, we analyzed the frequency of words used across several thousand tweets and used that analysis to build a language classifier. With a larger database, we could handle more complex phrases. We also learned how to pull data from Twitter's REST API and explored the powerful features of the HashMap library.

The next chapter looks at another tool in the data analyst's toolkit – Principal Component Analysis (PCA). The math behind PCA has been used to produce recommendation engines for websites such as Amazon and Netflix. We will continue to use our Twitter database in the next chapter to create a recommendation engine of our own.
