Chapter 4. Plotting

In this chapter, we will explore data through visualization. Pictures can often tell a story more effectively than words. Like words, pictures can be used to lead or mislead the reader's thinking. The Yahoo! Finance website provides the price history of each publicly traded stock on the American Stock Exchange in the CSV file format. We will demonstrate how to take a CSV file representing a company's closing price history and visualize that history. We will then demonstrate how to compare multiple companies' share prices on the same plot.

In this chapter, we cover the following:

  • Introducing the Haskell library EasyPlot to plot data
  • Simplifying access to data in SQLite3
  • Plotting data from a SQLite3 database
  • Plotting a subset of a dataset
  • Plotting data passed through a function
  • Plotting comparisons of multiple datasets
  • Plotting a moving average
  • Plotting a scatterplot

Plotting data with EasyPlot

Data visualization is the craft of using art to assist the reader in answering questions about data. While it is possible to answer the same questions using only words, pictures will bring data to life in a manner that words cannot. We wish to visualize our datasets using Haskell and SQLite3. To do this, we are going to use the open source tool gnuplot (a popular graphing tool for building visualizations of academic data) and the Haskell interface to gnuplot called EasyPlot. EasyPlot is among the easiest plotting tools to learn in Haskell, with the disadvantage that it is limited in its feature set.

Visualizing data, much like writing about data, requires some criteria for establishing what is (and is not) a good visualization of data. Here are the simple rules that I follow to make my own data visualizations:

  • Does this visualization contribute to the understanding of a dataset? Art is a powerful tool, and, to quote a famous American comic book, "with great power comes great responsibility." It is the job of the analyst to put the data into a context that does not mislead the reader into a false sense of understanding. Some examples of misleading the reader would be cropping out data points that contradict the writer's interpretation of the data, using colors in a manner that allows elements to be confused, or trying to express too much information in a single chart. Some of these design choices might be seen as unethical in certain contexts.
  • Does the visualization help to answer one or more questions? All visualizations should make an attempt at answering questions that the reader might have regarding the data. Visualizations give us an opportunity to express variables in a concise manner, which lets us convey the complexities of the data without burdening the reader with every detail. The reader should gain at least one insight into the data that the creator of the work did not expect. The goal of a good data visualization is to allow the reader to explore rather than to confine the reader to a narrow context.
  • Can the visualization be simplified and still answer the reader's questions? The scientist Carl Sagan wrote about using Occam's Razor in the quest to explain data: this convenient rule of thumb urges us, when faced with two hypotheses that explain the data equally well, to choose the simpler. The same could be said of data visualizations. When faced with two visualizations that explain the data equally well, choose the simpler one. If a visualization is too crowded with information, you are encouraged to split each idea it represents into its own visualization.

In this chapter, we will explore the Graphics.EasyPlot package, but to do that we must first install gnuplot and the EasyPlot library. On Debian-based Linux distributions, you can install gnuplot using the apt-get command:

sudo apt-get install gnuplot

You can then install the EasyPlot library, which provides the Graphics.EasyPlot module, using the cabal command (the package is published on Hackage under the lowercase name easyplot):

cabal install easyplot
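
Once both gnuplot and the library are installed, a quick way to check the setup is to plot a handful of points to an image file. The following is a minimal sketch rather than part of the chapter's dataset: the name samplePoints, its values, and the output file sample.png are made up for illustration. It assumes that gnuplot is available on your PATH.

```haskell
import Graphics.EasyPlot

-- Hypothetical sample points (x, y), invented for this check.
-- They are not real market data.
samplePoints :: [(Double, Double)]
samplePoints = [(1, 10.0), (2, 10.5), (3, 9.8), (4, 11.2), (5, 11.0)]

main :: IO ()
main = do
  -- plot hands the graph to gnuplot and returns a Bool that
  -- indicates whether the gnuplot process exited successfully.
  ok <- plot (PNG "sample.png") $
          Data2D [Title "Sample data", Style Lines] [] samplePoints
  putStrLn $ if ok
    then "Wrote sample.png"
    else "Plotting failed; check that gnuplot is installed"
```

If the program reports success, sample.png should contain a single line connecting the five points. Writing to a PNG file rather than an on-screen terminal such as X11 keeps the check independent of the desktop environment.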