In ggplot, you can include regression lines using geom_abline(). For the next example, we set up the same graph of patient height against weight that we have used several times before:
P <- ggplot(T, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()
As a start, let's calculate the slope and intercept of the line of best fit (regression line) for weight before treatment regressed on height. In Chapter 1, Base Graphics in R – One Step at a Time, in the section entitled Including a regression line, we saw how to include a linear regression line on a graph. Now, we use the lm() command again to fit a linear regression model, using the following syntax:
lm(WEIGHT_1 ~ HEIGHT, data = T)
Here is the output that you will see on your screen:
Call:
lm(formula = WEIGHT_1 ~ HEIGHT, data = T)

Coefficients:
(Intercept)       HEIGHT
   -123.611        1.166
So, the intercept is approximately -123.61 and the slope is approximately 1.17. You can now include the regression line in the ggplot graph, as follows:
P + geom_abline(intercept = -123.61, slope = 1.17)
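Rather than retyping rounded values, the estimates can also be taken directly from the fitted model with coef(). The following is a minimal sketch of that idea. The data frame patients is a hypothetical, simulated stand-in for the book's T (it is not the real patient data), and the names fit, coefs, and P_line are illustrative only.

```r
library(ggplot2)

# Hypothetical stand-in for the book's data frame T (simulated, not real data)
set.seed(1)
patients <- data.frame(HEIGHT = seq(150, 190, length.out = 30))
patients$WEIGHT_1 <- -123.6 + 1.17 * patients$HEIGHT + rnorm(30, sd = 5)

# Fit the model once and keep the result
fit <- lm(WEIGHT_1 ~ HEIGHT, data = patients)
coefs <- coef(fit)  # named vector with elements "(Intercept)" and "HEIGHT"

P <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()

# Pass the unrounded estimates straight to geom_abline()
P_line <- P + geom_abline(intercept = coefs[["(Intercept)"]],
                          slope = coefs[["HEIGHT"]])
```

Printing P_line draws the scatterplot with the fitted line. Because nothing is typed by hand, the line stays in step with the model if the data change.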
Now we will recreate the graph with the regression line, and also add some descriptive text about the regression using geom_text(). We will center the text on the point (170, 110).
P + geom_abline(intercept = -123.61, slope = 1.17, col = "red") + geom_text(data = T, aes(170, 110, label = "Slope = 1.17"))
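One point to be aware of: when geom_text() is given a data frame together with constant coordinates, the label is drawn once for every row of the data, which can make the text look heavy or jagged. A possible alternative, sketched below, is annotate(), which draws a single label. The data frame patients is again a hypothetical, simulated stand-in for T, and slope_label and P_annotated are illustrative names.

```r
library(ggplot2)

# Hypothetical stand-in for the book's data frame T (simulated, not real data)
set.seed(1)
patients <- data.frame(HEIGHT = seq(150, 190, length.out = 30))
patients$WEIGHT_1 <- -123.6 + 1.17 * patients$HEIGHT + rnorm(30, sd = 5)

fit <- lm(WEIGHT_1 ~ HEIGHT, data = patients)
slope <- coef(fit)[["HEIGHT"]]

# Build the label from the fitted slope instead of typing it by hand
slope_label <- sprintf("Slope = %.2f", slope)

P <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()

# annotate() places one text label, centered on the point (170, 110)
P_annotated <- P +
  geom_abline(intercept = coef(fit)[["(Intercept)"]], slope = slope,
              colour = "red") +
  annotate("text", x = 170, y = 110, label = slope_label)
```

The result should look much like the geom_text() version, but with the label drawn only once.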
The graph with the regression line looks like the following:
Your text is indeed centered on the point (170, 110). The approach we used to create the regression line was quite straightforward, but it is easier to use stat_smooth(). This function allows you to use smoothers on your graph, including OLS regressions, generalized linear models, and LOWESS smoothers. You can read further about this function at http://docs.ggplot2.org/0.9.3.1/stat_smooth.html.
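As a rough illustration of a non-linear smoother, the sketch below requests a locally weighted fit. In ggplot2 the method name is "loess". The span value of 0.9 is an arbitrary choice for demonstration, and patients is a hypothetical, simulated stand-in for the book's T.

```r
library(ggplot2)

# Hypothetical stand-in for the book's data frame T (simulated, not real data)
set.seed(1)
patients <- data.frame(HEIGHT = seq(150, 190, length.out = 30))
patients$WEIGHT_1 <- -123.6 + 1.17 * patients$HEIGHT + rnorm(30, sd = 5)

P <- ggplot(patients, aes(x = HEIGHT, y = WEIGHT_1)) + geom_point()

# Locally weighted smoother; a larger span gives a smoother curve
P_loess <- P + stat_smooth(method = "loess", formula = y ~ x,
                           span = 0.9, se = FALSE)
```

Printing P_loess shows a curve that follows local trends in the points rather than a single straight line.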
In the final examples of this book, we try an OLS regression using the argument method="lm". This approach is more efficient than the previous approach, because we can implement it in a single step. First, let's try switching off the standard error using the following command:
P + stat_smooth(method="lm", se=FALSE)
This syntax will give you the following graph:
Next, we switch on the standard error:
P + stat_smooth(method="lm", se=TRUE)
We get the following graph:
Our graph now includes a confidence band around the fitted line. By default this is a 95 percent confidence interval, and its width is determined by the standard error of the fit. The stat_smooth() function provides a range of smoothers that can be implemented easily using the method argument.
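To see where such a band comes from, the confidence limits can be computed numerically with predict() in base R. This is a sketch under the assumption that the band for method="lm" corresponds closely to these limits. The data frame patients is a hypothetical, simulated stand-in for T, and grid, band, and band_width are illustrative names.

```r
# Hypothetical stand-in for the book's data frame T (simulated, not real data)
set.seed(1)
patients <- data.frame(HEIGHT = seq(150, 190, length.out = 30))
patients$WEIGHT_1 <- -123.6 + 1.17 * patients$HEIGHT + rnorm(30, sd = 5)

fit <- lm(WEIGHT_1 ~ HEIGHT, data = patients)

# Heights at which to evaluate the fitted line
grid <- data.frame(HEIGHT = seq(150, 190, by = 10))

# Fitted values with lower and upper 95% confidence limits
# (matrix with columns "fit", "lwr", and "upr")
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

# Width of the band at each height; it is narrowest at the mean height (170)
band_width <- band[, "upr"] - band[, "lwr"]
```

The band widens toward the ends of the height range, which is why the shaded region in the graph has its characteristic hourglass shape.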