Creating bar charts

Bar charts are useful for comparing the number of elements within subgroups of a population. They can also serve other purposes, such as comparing the mean of a continuous variable across the levels of a categorical variable. You can create bar charts in ggplot2 using geom_bar(). As an exercise, create a bar chart of the number of patients by ethnicity, using factor() to convert the variable ETH into a factor. The syntax is as follows:

W <- ggplot(T, aes(factor(ETH))) + geom_bar()
W

The height of each bar gives the number of patients within each ethnicity. As an exercise, you can create a horizontal bar chart by adding the layer coord_flip(). The coord_flip() layer also works for other types of graph, such as scatterplots.
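The coord_flip() exercise can be sketched as follows. This is a minimal illustration, not the chapter's own solution: the data frame patients and its ETH values are made up here as a stand-in for T.

```r
library(ggplot2)

# Hypothetical stand-in for the chapter's data frame T; the values are made up
patients <- data.frame(
  ETH = c(1, 1, 2, 3, 3, 3, 2, 1, 3, 2)
)

# Vertical bar chart of counts per ethnicity, as in the text
W <- ggplot(patients, aes(factor(ETH))) + geom_bar()

# Adding coord_flip() swaps the x and y axes, giving horizontal bars
W_flipped <- W + coord_flip()
W_flipped
```

The counts themselves are unchanged by the flip; only the orientation of the axes differs.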

Now we set the fill color and the border color of the bars using the fill and color arguments. Let's use ivory for the bars, with dark green borders. The syntax is as follows:

W + geom_bar(fill="ivory", color="darkgreen") 

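Because W already contains a geom_bar() layer, this syntax draws a second, styled layer of bars over the first. It gives the following bar chart: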

[Figure: bar chart of patient counts by ethnicity, with ivory bars and dark green borders]

Creating a stacked bar chart

Now we will see how to create a stacked bar chart of a categorical variable, partitioned by the levels of another categorical variable. Let's plot the number of patients receiving each treatment, partitioned by the two levels of RECOVER. Place both variables inside aes(), mapping RECOVER to the fill color, and set our own colors using scale_fill_manual(). This time, we choose colors by entering colors(distinct = FALSE) at the command line and selecting from the list of color names that R returns:

ggplot(T, aes(TREATMENT, fill = factor(RECOVER))) + geom_bar() + scale_fill_manual(values = c("springgreen3", "lightsalmon1"))
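The list returned by colors() is long, so it can help to search it by pattern rather than scan it by eye. The following sketch uses only base R; the variable names (all_colors, salmon_shades, green_shades, chosen) are illustrative choices, not part of the chapter.

```r
# colors() lives in grDevices, which is attached by default in R
all_colors <- colors(distinct = FALSE)

# Search the color names by pattern instead of reading the whole list
salmon_shades <- grep("salmon", all_colors, value = TRUE)
green_shades  <- grep("^springgreen", all_colors, value = TRUE)

print(salmon_shades)
print(green_shades)

# Check candidate names before passing them to scale_fill_manual();
# each element is TRUE only if the name matches a built-in color exactly
chosen <- c("springgreen3", "lightsalmon1")
chosen %in% all_colors
```

Color names must match exactly, so a check like this catches misspellings and stray whitespace before they cause an error at plotting time.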

Here is the resulting bar chart:

[Figure: stacked bar chart of patient counts by treatment, filled by RECOVER]

The label 0 represents patients who did not recover and the label 1 represents those who did recover. Thus, the stacked bar chart suggests that treatment A was the most effective, while treatment B was the least effective.
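A stacked chart shows raw counts, so a treatment given to more patients can look more effective simply because its bar is taller. One way to check a reading like the one above is to cross-tabulate the counts and convert them to recovery proportions per treatment. The sketch below assumes a made-up data frame named patients as a stand-in for T; its values are invented for illustration and are not the chapter's data.

```r
library(ggplot2)

# Hypothetical stand-in for T; the values are made up
patients <- data.frame(
  TREATMENT = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"),
  RECOVER   = c(1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0)
)

# Counts of patients for each combination of treatment and recovery status
counts <- table(patients$TREATMENT, patients$RECOVER)
print(counts)

# Row proportions: the share of each treatment group that did or did not recover
rates <- prop.table(counts, margin = 1)
print(round(rates, 2))

# position = "fill" rescales every bar to height 1, so the bars show proportions
filled <- ggplot(patients, aes(TREATMENT, fill = factor(RECOVER))) +
  geom_bar(position = "fill")
filled
```

With proportions, the treatments can be compared directly even when the group sizes differ.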

We can facet this bar chart to create separate panels for those who recover and those who do not. We use facet_wrap(~ RECOVER) to create a separate graph for each level of RECOVER. The facet_wrap() function is covered in more detail in the section entitled Creating a faceted bar chart. We choose our colors from the Hexadecimal Color Chart using scale_fill_manual(). Here, we subset for smokers only. The syntax is as follows:

ggplot(subset(T, SMOKE == "Y"), aes(TREATMENT, fill = factor(RECOVER))) + geom_bar() + facet_wrap(~ RECOVER) + scale_fill_manual(values = c("#669933", "#FFCC33"))

Now the bar chart looks like this:

[Figure: bar chart of smokers by treatment, faceted into one panel per level of RECOVER]

This bar chart shows the treatment counts for smokers, split into two separate panels, one for each level of RECOVER.

Creating a grouped bar chart

We can present this kind of information using a grouped bar chart. To do so, we pass the argument position = "dodge" to geom_bar(). Again, we choose our colors from the Hexadecimal Color Chart using scale_fill_manual(). This time, we subset for those who exercise. The syntax is as follows:

ggplot(subset(T, EXERCISE == "TRUE"), aes(TREATMENT, fill = factor(RECOVER))) + geom_bar(position = "dodge") + scale_fill_manual(values = c("#6666FF", "#669900"))

You will get this bar chart:

[Figure: grouped bar chart of patients who exercise, by treatment and RECOVER]

All patients who exercised and received treatment A eventually recovered. You can verify this result by examining this particular subset, as follows:

subset(T, EXERCISE == "TRUE" & TREATMENT == "A")

The output is as follows:

[Output: the rows of T for patients who exercised and received treatment A]
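When one treatment has patients in only one level of RECOVER, as with treatment A here, position = "dodge" lets the single bar take the full width of its slot, so it appears twice as wide as the others. If uniform bar widths are preferred, ggplot2's position_dodge(preserve = "single") is one option. The sketch below uses a made-up data frame patients in place of the exercise subset of T; its values are assumptions for illustration only.

```r
library(ggplot2)

# Hypothetical stand-in for the exercise subset of T; the values are made up.
# Every patient on treatment A recovers, so A has only one bar.
patients <- data.frame(
  TREATMENT = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  RECOVER   = c(1, 1, 1, 1, 0, 0, 1, 1, 0)
)

base_plot <- ggplot(patients, aes(TREATMENT, fill = factor(RECOVER)))

# Default dodging: the lone bar for treatment A fills the whole slot
grouped_total <- base_plot + geom_bar(position = "dodge")

# preserve = "single" keeps every bar the same width
grouped_single <- base_plot +
  geom_bar(position = position_dodge(preserve = "single"))

grouped_total
grouped_single
```

Which version to use is a matter of presentation; the counts shown are the same in both charts.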

Creating a faceted bar chart

As a more complex example that includes even more information, let's try a faceted bar chart of the number of patients receiving each treatment. This time, the bar chart is partitioned by gender and stacked according to whether or not the patient recovered.

ggplot2 provides two functions for creating faceted plots. We use facet_grid() to split a plot by the levels of one or more categorical variables so that the panels for each level are placed together, arranged either horizontally or vertically. We use facet_wrap() to arrange the panels in a chosen number of rows and columns. For further information on these two functions, see the ggplot2 documentation.

Let's use facet_grid() on TREATMENT, faceted by the two levels of GENDER and stacked by the two levels of RECOVER:

ggplot(T, aes(TREATMENT, fill=factor(RECOVER))) + geom_bar() + facet_grid(. ~ GENDER) + scale_fill_manual(values = c("#339999","#CC9900"))

This syntax produces the following faceted bar chart:

[Figure: stacked bar chart of patient counts by treatment, faceted by GENDER and filled by RECOVER]

This graph presents a lot of useful information at once. Partitioning by gender allows us to compare patient recovery within and across the two genders and also within and across treatment levels.
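To make the difference between facet_grid() and facet_wrap() concrete, the following sketch builds the same stacked chart three ways. It assumes a made-up data frame patients in place of T; the column values are invented for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for T; the values are made up
patients <- data.frame(
  TREATMENT = rep(c("A", "B", "C"), times = 4),
  GENDER    = rep(c("F", "M"), each = 6),
  RECOVER   = c(1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0)
)

base_plot <- ggplot(patients, aes(TREATMENT, fill = factor(RECOVER))) +
  geom_bar()

# facet_grid(. ~ GENDER): one column of panels per gender, in a single row
by_gender_cols <- base_plot + facet_grid(. ~ GENDER)

# facet_grid(GENDER ~ .): one row of panels per gender, in a single column
by_gender_rows <- base_plot + facet_grid(GENDER ~ .)

# facet_wrap(): panels wrapped into a layout with the chosen number of columns
wrapped <- base_plot + facet_wrap(~ GENDER, ncol = 1)

by_gender_cols
by_gender_rows
wrapped
```

With only two panels the layouts look similar; the distinction matters more when a faceting variable has many levels, where facet_wrap() can spread the panels over several rows.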
