Including regression lines

In ggplot, you can include regression lines using geom_abline(). For the next example, we set up the same graph of patient weight against height that we have used several times before:

P <- ggplot(T, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()

As a start, let's calculate the slope and intercept of the line of best fit (regression line) for weight before treatment against height. In Chapter 1, Base Graphics in R – One Step at a Time, in the section entitled Including a regression line, we saw how to include a linear regression line on a graph. Now, we use the lm() command again to fit a linear regression model, using the following syntax:

lm(WEIGHT_1 ~ HEIGHT, data = T)

Here is the output that you will see on your screen:

Call:
lm(formula = WEIGHT_1 ~ HEIGHT, data = T)

Coefficients:
(Intercept)       HEIGHT  
   -123.611        1.166  

So, the intercept is approximately -123.61 and the slope is approximately 1.17. You can now include the regression line in the ggplot graph, as follows:

P + geom_abline(intercept = -123.61, slope = 1.17)
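Typing the rounded coefficients by hand works, but they can also be read from the fitted model. The following is a minimal sketch of that idea, not the book's own method: the data frame patients is a hypothetical stand-in for T (simulated values, not the real patient data), and the names fit, b, and p_line are chosen here for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the book's data frame T (simulated, not real data)
set.seed(1)
patients <- data.frame(HEIGHT = seq(150, 190, by = 2))
patients$WEIGHT_1 <- -120 + 1.15 * patients$HEIGHT +
  rnorm(nrow(patients), sd = 5)

# Fit the model and extract the coefficients from the fitted object
fit <- lm(WEIGHT_1 ~ HEIGHT, data = patients)
b <- coef(fit)  # named vector with elements "(Intercept)" and "HEIGHT"

# Pass the unrounded coefficients straight to geom_abline()
p_line <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) +
  geom_point() +
  geom_abline(intercept = b[["(Intercept)"]], slope = b[["HEIGHT"]])
p_line
```

Because the line is drawn from the unrounded estimates, it stays in step with the model if the data change.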

Now we will recreate the graph with the regression line, this time drawn in red, and add some descriptive text about the regression using geom_text(). We will center the text on the point (170, 110):

P + geom_abline(intercept = -123.61, slope = 1.17, col = "red") +
  geom_text(data = T, aes(170, 110, label = "Slope = 1.17"))
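One detail worth knowing: when geom_text() is given a data frame and constant aesthetics, it draws the label once per row of that data frame, so the copies are stacked on top of each other. A possible alternative is annotate(), which draws a single label. The sketch below assumes a small hypothetical data frame patients in place of T; p_annot is a name chosen here for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the book's data frame T
patients <- data.frame(
  HEIGHT   = c(155, 160, 165, 170, 175, 180, 185),
  WEIGHT_1 = c(58, 62, 70, 74, 81, 86, 93)
)

# annotate() adds one text label at fixed coordinates,
# independent of the number of rows in the data
p_annot <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) +
  geom_point() +
  geom_abline(intercept = -123.61, slope = 1.17, colour = "red") +
  annotate("text", x = 170, y = 110, label = "Slope = 1.17")
p_annot
```

The resulting graph looks the same on screen, but a single label usually prints more cleanly than many overplotted copies.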

The graph with the regression line looks like this:

[Figure: scatterplot of WEIGHT_1 against HEIGHT with the red regression line and the label "Slope = 1.17"]

Your text is indeed centered on the point (170, 110). The approach we used to create the regression line was quite straightforward, but it meant fitting the model separately and copying the coefficients by hand; it is easier to use stat_smooth(). This function adds smoothers to your graph, including OLS regressions, generalized linear models, and LOWESS smoothers. You can read more about this function at http://docs.ggplot2.org/0.9.3.1/stat_smooth.html.

In the final examples of this book, we fit an OLS regression using the argument method="lm". This approach is more convenient than the previous one, because the model is fitted and the line is drawn in a single step. First, let's switch off the standard error band using the following command:

P + stat_smooth(method="lm", se=FALSE)

This syntax will give you the following graph:

[Figure: scatterplot with the fitted regression line and no standard error band]

Next, we switch on the standard error:

P + stat_smooth(method="lm", se=TRUE) 

We get the following graph:

[Figure: scatterplot with the fitted regression line and its confidence band]

Our graph now includes a confidence band around the fitted line. Its width is determined by the standard error of the fit; by default, stat_smooth() draws a 95% confidence interval. The stat_smooth() function provides a range of smoothers that can be implemented easily using the method argument.
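As an illustration of the method argument, here is a short sketch that swaps the straight line for a local regression smoother and, separately, widens the confidence band with the level argument. The data frame patients is simulated as a hypothetical stand-in for T, and the object names are chosen here for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the book's data frame T (simulated, not real data)
set.seed(42)
patients <- data.frame(HEIGHT = runif(60, min = 150, max = 190))
patients$WEIGHT_1 <- -120 + 1.15 * patients$HEIGHT + rnorm(60, sd = 6)

base_plot <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()

# Local regression (loess) smoother instead of a straight line
p_loess <- base_plot +
  stat_smooth(method = "loess", formula = y ~ x, se = TRUE)

# Linear fit with a wider, 99% confidence band
p_lm99 <- base_plot +
  stat_smooth(method = "lm", formula = y ~ x, level = 0.99)

p_loess
p_lm99
```

Stating formula = y ~ x explicitly is optional; it is included here only to make the fitted relationship visible in the code.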
