In this section, we address confidence intervals. A confidence interval allows us to make a probabilistic estimate of the mean of a population, given sample data.
This estimate, called an interval estimate, consists of a range of values (an interval) that acts as a good estimate of the unknown population parameter.
The confidence interval is bounded by confidence limits. A 95 percent confidence interval is an interval constructed so that, under repeated sampling, 95 percent of the intervals built this way contain the population mean. So how do we construct a confidence interval?
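Suppose we have a two-tailed t-test and we want to construct a 95 percent confidence interval. In this case, we want the sample t-value, $t$, corresponding to the mean to satisfy the following inequality:
$$-t_{0.025} \le t \le t_{0.025}$$
Here, $t_{0.025}$ is the critical value that leaves 2.5 percent of the t-distribution (with $n - 1$ degrees of freedom) in each tail.
Given that $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$, where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $s$ is the sample standard deviation, and $n$ is the sample size, we can substitute this in the preceding inequality to obtain the following relation:
$$\bar{x} - t_{0.025} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{0.025} \frac{s}{\sqrt{n}}$$
The interval $\left[\bar{x} - t_{0.025} \dfrac{s}{\sqrt{n}},\ \bar{x} + t_{0.025} \dfrac{s}{\sqrt{n}}\right]$ is our 95 percent confidence interval.
Generalizing, a confidence interval for any percentage, $y$, can be expressed as $\left[\bar{x} - t_{\alpha/2} \dfrac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2} \dfrac{s}{\sqrt{n}}\right]$, where $\alpha = 1 - y/100$ and $t_{\alpha/2}$ is the two-tailed critical value of $t$ corresponding to the desired confidence level $y$.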
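A t-based confidence interval for the population mean can be computed directly as the sample mean plus or minus the critical t-value times the sample standard deviation divided by the square root of the sample size. The following is a minimal sketch of that calculation, assuming NumPy and SciPy are installed. The function name `t_confidence_interval` and the sample values are hypothetical and serve only as an illustration.

```python
import numpy as np
from scipy import stats


def t_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) limits of a two-sided t-interval for the mean.

    `confidence` is a fraction, for example 0.95 for a 95 percent interval.
    """
    x = np.asarray(sample, dtype=float)
    n = x.size
    mean = x.mean()
    s = x.std(ddof=1)  # sample standard deviation (n - 1 in the denominator)

    # Two-tailed critical value: leaves alpha / 2 in each tail
    alpha = 1.0 - confidence
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=n - 1)

    margin = t_crit * s / np.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical sample values, used only to exercise the function
sample = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2, 4.8, 5.0]
lower, upper = t_confidence_interval(sample, confidence=0.95)
print(f"95% confidence interval for the mean: [{lower:.3f}, {upper:.3f}]")
```

As a cross-check, SciPy can produce the same limits in one call: `stats.t.interval(0.95, df=len(sample) - 1, loc=np.mean(sample), scale=stats.sem(sample))`. Writing the steps out by hand, as above, mainly makes the role of each term in the formula visible.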
We will now illustrate how to calculate a confidence interval using a dataset from the popular statistical environment known as R. The statsmodels module provides access to the datasets that are available in the core datasets package of R through the get_rdataset function.