Sketching our roadmap

Sentiment analysis of tweets is particularly hard because of Twitter's per-message size limit. That limit leads to a special syntax, creative abbreviations, and sentences that are seldom well formed. The typical approach of analyzing sentences, aggregating their sentiment information per paragraph, and then calculating the overall sentiment of a document does not work here.

Clearly, we will not try to build a state-of-the-art sentiment classifier. Instead, we want to do the following:

  • Use this scenario as a vehicle to introduce yet another classification algorithm, Naïve Bayes
  • Explain how part-of-speech (POS) tagging works and how it can help us
  • Show some more tricks from the scikit-learn toolbox that can come in handy