Chapter 4. Creating Graphs with ggplot

In the previous chapter, you learned a variety of techniques for producing high-quality graphs with qplot. In this chapter, you will learn how to create graphs using ggplot, a graphics tool that is more powerful and more flexible than qplot. A book of this scope cannot cover everything that ggplot has to offer, so here we confine ourselves to its basic methods. After reading this chapter, you should be able to create interesting graphics using ggplot. If you wish to read further about ggplot, links to other literature are given in this chapter. The topics covered in this chapter include the following:

  • Setting up variables for plotting
  • Adding color, symbol type, size, and shape as layers
  • Controlling plotting backgrounds and margins
  • Creating line graphs, histograms, bar charts, and boxplots
  • Using attractive color schemes

By the end of this chapter, you should understand the basic principles behind the creation of graphics in ggplot and be able to create professional graphs with it.

Note

To assist you in mastering ggplot, I recommend the website http://docs.ggplot2.org/current/.

This website provides links that will help you with a wide range of ggplot techniques.

Getting started with ggplot

You may find that qplot is sufficient to create most of the graphics you want. When you need more control than qplot provides, ggplot is the tool to reach for. Mastering ggplot is somewhat more difficult than mastering qplot, but ggplot gives you more options to control plotting backgrounds, axes and axis labels, legends, grids, and color schemes.

In ggplot, we set up an initial graphing object and then add attributes in steps, which we call layers. Let's start by creating a scatterplot of patient weight versus height before treatment using the medical dataset, which you can copy and paste from the code file for this chapter (available in the code bundle of this book). First, note the aes() function (aes is short for aesthetics), in which we identify the variables that we wish to include in our graph and set up mappings for color, size, and shape. Also note the geom_point() function, which draws points. We therefore pass HEIGHT and WEIGHT_1 to aes() as the variables to graph and then add the geom_point() layer to create a scatterplot. Later, we can add symbol attributes (such as color, shape, and size) as new layers. Enter the following syntax, which creates a graphics object named P and then displays it:

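library(ggplot2)

# T is the medical dataset from the chapter's code file.
# Map HEIGHT to the x axis and WEIGHT_1 to the y axis, then add a layer of points.
P <- ggplot(T, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()
P  # entering the object's name displays the plot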

Here is the scatterplot of patients' weight versus height:

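If the medical dataset is not at hand, the same two steps (set up the object, then add a layer) can be tried on a small stand-in. The following is a minimal sketch, not the book's data: the data frame name patients and all of its values are invented for illustration, and only the column names HEIGHT and WEIGHT_1 come from the text. A name other than T is used here because T is also R's shorthand for TRUE.

```r
library(ggplot2)

# Hypothetical stand-in for the medical dataset. The values are invented;
# only the column names HEIGHT and WEIGHT_1 come from the text.
patients <- data.frame(
  HEIGHT   = c(158, 163, 170, 175, 181, 188),
  WEIGHT_1 = c(55, 61, 68, 74, 82, 90)
)

# Step 1: the base object holds the data and the aesthetic mappings.
# It has no layers yet, so no data would be drawn.
base <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1))

# Step 2: adding geom_point() attaches one layer that draws the points.
P <- base + geom_point()

# Display the plot explicitly (needed inside scripts and functions).
print(P)
```

Keeping the base object separate from the layered object makes it easy to try different geoms on the same data and mappings.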

The two variables you wished to plot were included within the aes() function, and the instruction to plot points (rather than a line) was provided through the geom_point() function. We can include axis labels that record the units of measurement using xlab() and ylab(). We can use this syntax:

P + xlab("HEIGHT (cm)") + ylab("WEIGHT_1 (Kg)")

However, we will use the labs() function instead, which also lets us include a title through its title argument:

P + labs(x = "HEIGHT (cm)", y = "WEIGHT_1 (Kg)") + labs(title = "WEIGHT vs. HEIGHT")

Here is our scatterplot:


Again, the horizontal and vertical axis labels and the title were added as layers. Let's update the graphics object P so that from now on our graph has a title and axis labels that give the units of measurement. Enter the following syntax:

P <- P + labs(x = "HEIGHT (cm)", y = "WEIGHT_1 (Kg)") + labs(title = "WEIGHT vs. HEIGHT")
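The labelling calls used so far are interchangeable ways of setting the same labels. The sketch below compares them on an invented stand-in for the medical dataset (the data frame patients and its values are assumptions; the column names and label text follow the chapter). It shows the chained labs() form from the text, a single labs() call, and the xlab(), ylab(), and ggtitle() wrappers.

```r
library(ggplot2)

# Hypothetical sample data (invented values; column names follow the text).
patients <- data.frame(
  HEIGHT   = c(158, 163, 170, 175, 181, 188),
  WEIGHT_1 = c(55, 61, 68, 74, 82, 90)
)

P <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()

# Chained form, as in the text: two separate labs() layers.
P_chained <- P +
  labs(x = "HEIGHT (cm)", y = "WEIGHT_1 (Kg)") +
  labs(title = "WEIGHT vs. HEIGHT")

# Single-call form: the same three labels set in one labs() call.
P_single <- P +
  labs(x = "HEIGHT (cm)", y = "WEIGHT_1 (Kg)", title = "WEIGHT vs. HEIGHT")

# Wrapper form: xlab(), ylab(), and ggtitle() set the same labels one at a time.
P_wrappers <- P +
  xlab("HEIGHT (cm)") +
  ylab("WEIGHT_1 (Kg)") +
  ggtitle("WEIGHT vs. HEIGHT")
```

Which form to use is a matter of taste. A single labs() call keeps all the label text in one place, which can make later edits easier.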

At this stage, we may wish to modify the title. Let's set the title to twice the default size and set its color to blue. To do so, we make use of plot.title within the theme() function, which allows you to modify theme settings. We also make use of the function element_text(), which allows you to modify color, size, font, and other attributes of your text. In the following syntax, we double the font size using size = rel(2), where rel() gives a size relative to the one the element would otherwise have:

P + theme(plot.title = element_text(size = rel(2), color = "blue"))

This syntax gives us the following scatterplot:


You can see that some complex syntax was required. However, now that you know the syntax, you can use it to modify titles in your own graphs. Further information on the themes available in ggplot is given in various texts and online sources. A very good resource is available at http://docs.ggplot2.org/current/theme.html.
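The same theme() pattern extends to other text elements. The following sketch assumes an invented stand-in for the medical dataset (the data frame patients and its values are not from the book) and styles the axis titles as well as the plot title. The face argument of element_text() and the axis.title theme element are standard ggplot2 features, although the chapter text so far has not used them.

```r
library(ggplot2)

# Hypothetical sample data (invented values; column names follow the text).
patients <- data.frame(
  HEIGHT   = c(158, 163, 170, 175, 181, 188),
  WEIGHT_1 = c(55, 61, 68, 74, 82, 90)
)

P <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) +
  geom_point() +
  labs(x = "HEIGHT (cm)", y = "WEIGHT_1 (Kg)", title = "WEIGHT vs. HEIGHT")

# rel(2) means twice the size the element would otherwise inherit,
# so the title scales with the theme instead of using a fixed point size.
P_styled <- P + theme(
  plot.title = element_text(size = rel(2), color = "blue", face = "bold"),
  axis.title = element_text(size = rel(1.2))
)

print(P_styled)
```

Because theme() only adjusts appearance, it can be added to or removed from a plot without touching the data, mappings, or layers.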

In the Producing scatterplots using qplot section in Chapter 3, Mastering the qplot Function, we saw how to set aesthetics in qplot. In ggplot, fixed aesthetic values such as color and size are set within the geom function, which also controls the graph type. For example, we have geom_point() for scatterplots, geom_bar() for bar graphs, and geom_histogram() for histograms. In the following syntax, we use points and set the symbol color to dark green and the symbol size to 5. Because P already contains a geom_point() layer, this adds a second layer of points that is drawn on top of the first:

P + geom_point(color = "darkgreen", size = 5) 

Here is the resulting scatterplot:

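The difference between setting an aesthetic to a fixed value and mapping it to a variable inside aes() can be sketched as follows. The data frame patients and its values are invented for illustration; only the column names come from the text. The example also starts from a base object without a point layer, which is one way to avoid drawing two sets of points.

```r
library(ggplot2)

# Hypothetical sample data (invented values; column names follow the text).
patients <- data.frame(
  HEIGHT   = c(158, 163, 170, 175, 181, 188),
  WEIGHT_1 = c(55, 61, 68, 74, 82, 90)
)

# Base object with mappings only, so each plot below has exactly the layers we add.
base <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1))

# Setting: constants passed outside aes() apply to every point.
P_fixed <- base + geom_point(color = "darkgreen", size = 5)

# Mapping: inside aes(), color varies with a data column, and ggplot adds a legend.
P_mapped <- base + geom_point(aes(color = WEIGHT_1), size = 5)

# Adding another geom_point() to a plot that already has one stacks a second layer.
P_stacked <- P_fixed + geom_point(color = "black", size = 1)

print(P_fixed)
print(P_mapped)
```

As a rule of thumb, values that describe the data belong inside aes(), and values that are purely decorative belong outside it.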