Word clouds give greater prominence to words that appear more frequently in a given text. They are also called tag clouds or weighted words. The number of times a word occurs maps visually to the size at which it is drawn. In other words, the word that appears largest in the visualization is the one that occurs most often in the text.
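The frequency-to-size mapping described above can be sketched in plain Python. This is only an illustration under simple assumptions: the function name font_sizes, the size range, and the linear scaling rule are hypothetical choices made here, not the sizing algorithm used by any particular word cloud library.

```python
import re
from collections import Counter


def font_sizes(text, min_size=10, max_size=60):
    """Map each word to a font size that grows linearly with its count.

    Hypothetical helper for illustration; min_size and max_size are
    arbitrary point sizes chosen for this sketch.
    """
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent word gets max_size; others are scaled by count / top.
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


sizes = font_sizes("data science uses data and more data")
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size:.1f}")
```

Here the word "data" occurs three times and therefore receives the largest size, while each word that occurs once receives the same smaller size.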
Beyond showing word occurrences through shapes and colors, word clouds have several useful applications in social media and marketing.
To create a word cloud, one can write the Python code from scratch or use something that already exists. Andreas Mueller from the NYU Center for Data Science created a word cloud package for Python that is simple and easy to use. The RemachineScript.ttf font file can be downloaded from http://www.fonts101.com/fonts/view/Script/63827/Remachine_Script.
STOPWORDS consists of extremely common words, for example a, an, the, is, was, at, in, and many more. The following code creates a word cloud and passes the STOPWORDS list so that those words are ignored:
from os import path

import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

# Directory that holds the input text and the font file.
d = '/Users/MacBook/kirthi'

# Read the text that will be visualized.
with open(path.join(d, 'results.txt')) as f:
    text = f.read()

# Build the word cloud, ignoring the common words in STOPWORDS.
wordcloud = WordCloud(
    font_path=path.join(d, 'RemachineScript.ttf'),
    stopwords=STOPWORDS,
    background_color='#222222',
    width=1000,
    height=800).generate(text)
To plot this, first set the figure size and then call imshow(), which displays the word cloud as an image.
# Open a plot of the generated image.
plt.figure(figsize=(13, 13))
plt.imshow(wordcloud)
plt.axis("off")
plt.show()
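To see why ignoring stop words matters before a cloud is drawn, the filtering step can be sketched with the standard library alone. The DEMO_STOPWORDS set and the count_without_stopwords function below are hypothetical names used only for this illustration; the STOPWORDS set shipped with the wordcloud package is much larger, and the package may tokenize text differently.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word set for illustration only.
DEMO_STOPWORDS = {"a", "an", "the", "is", "was", "at", "in"}


def count_without_stopwords(text, stopwords=DEMO_STOPWORDS):
    """Count whitespace-separated words, skipping any that are stop words."""
    words = text.lower().split()
    return Counter(word for word in words if word not in stopwords)


counts = count_without_stopwords("the movie was great and the cast was great")
print(counts.most_common())
```

Without the filter, "the" and "was" would tie with "great" as the most frequent words and would crowd the visualization; with it, only the content-bearing words remain.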
To summarize, we first extract the sentiments from the TextBlob example and assume that the extracted results are in results.txt. Then, we use these words to visualize the data as a word cloud with the matplotlib package.
The results of wordcloud are shown in the following image: