How to do it...

Perform the following steps:

First, we use a boxplot to compare mpg for all cars against mpg for automatic transmission cars (am == 0):

        > boxplot(mtcars$mpg, mtcars$mpg[mtcars$am==0], ylab = "mpg",
        names=c("overall","automatic"))
        > abline(h=mean(mtcars$mpg),lwd=2, col="red")
        > abline(h=mean(mtcars$mpg[mtcars$am==0]),lwd=2, col="blue")

The boxplot of mpg of the overall population and automatic transmission cars

We then perform a one-sample t-test to check whether the average mpg of automatic transmission cars differs significantly from the overall average mpg:

        > mpg.mu = mean(mtcars$mpg)
        > mpg_am = mtcars$mpg[mtcars$am == 0]
        > t.test(mpg_am,mu = mpg.mu)
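
By default, t.test runs a two-sided test. If the question is specifically whether the automatic cars' mean is lower than the overall mean, a one-sided variant can be sketched as follows. The name res_less is an illustrative choice, not part of the original recipe:

```r
# One-sided, one-sample t-test: is the mean mpg of automatic cars (am == 0)
# lower than the overall mean mpg?
mpg.mu <- mean(mtcars$mpg)
mpg_am <- mtcars$mpg[mtcars$am == 0]

# alternative = "less" tests H1: true mean < mu
res_less <- t.test(mpg_am, mu = mpg.mu, alternative = "less")

res_less$p.value   # one-sided p-value
res_less$estimate  # sample mean of the automatic cars
```

Because the sample mean here falls below mpg.mu, the one-sided p-value is half of the two-sided p-value reported by the default call.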

Next, we compare automatic and manual transmission cars directly. We begin by plotting a boxplot of mpg for each group:

        > boxplot(mtcars$mpg~mtcars$am,ylab='mpg',
        names=c('automatic','manual'))
        > abline(h=mean(mtcars$mpg[mtcars$am==0]),lwd=2, col="blue")
        > abline(h=mean(mtcars$mpg[mtcars$am==1]),lwd=2, col="red")

The boxplot of mpg of automatic and manual transmission cars

The preceding figure suggests that the mean mpg of automatic transmission cars is lower than that of manual transmission cars. We then perform a Welch two-sample t-test to check whether the difference is statistically significant:

> t.test(mtcars$mpg~mtcars$am)
Output:

        Welch Two Sample t-test

data:  mtcars$mpg by mtcars$am
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.280194  -3.209684
sample estimates:
mean in group 0 mean in group 1 
       17.14737        24.39231 
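
To see where the t, df, and p-value figures in this output come from, the Welch statistic can be recomputed by hand. The following is a minimal sketch. The variable names (auto, manual, se2_auto, and so on) are illustrative, and the formula interface with data = mtcars is used in place of the mtcars$ prefixes:

```r
# Split mpg by transmission type (am: 0 = automatic, 1 = manual)
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

# Squared standard error of each group mean
se2_auto   <- var(auto) / length(auto)
se2_manual <- var(manual) / length(manual)

# Welch t statistic: difference in means over the combined standard error
t_stat <- (mean(auto) - mean(manual)) / sqrt(se2_auto + se2_manual)

# Welch-Satterthwaite approximation of the degrees of freedom
df <- (se2_auto + se2_manual)^2 /
  (se2_auto^2 / (length(auto) - 1) + se2_manual^2 / (length(manual) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# Compare with the built-in test
res <- t.test(mpg ~ am, data = mtcars)
c(t = t_stat, df = df, p = p_value)
```

The hand-computed values should agree with res$statistic, res$parameter, and res$p.value.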

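The next example lets you create your own data and run the same t-tests. Consider 60 students split into two divisions, A and B, of 30 students each, with marks drawn at random between 40 and 100. Because the marks are sampled randomly, your output will differ from the numbers shown here. Perform the following steps in R: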

> data = data.frame(marks=sample(40:100, 60, replace=TRUE),
division=c(rep('A',30), rep('B',30)))
> head(data)
> boxplot(data$marks, data$marks[data$division=='A'], ylab="Marks",
names=c("All Marks", "Div A"))
> abline(h=mean(data$marks), lwd=2, col="red")
> abline(h=mean(data$marks[data$division=='A']), lwd=2, col="blue")

> meanmarks = mean(data$marks)
> marksA = data$marks[data$division=='A']
> t.test(marksA, mu = meanmarks)
Output:
 One Sample t-test
data: marksA
t = -0.80284, df = 29, p-value = 0.4286
alternative hypothesis: true mean is not equal to 72.5
95 percent confidence interval:
 62.68524 76.78143
sample estimates:
mean of x 
 69.73333
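
Since sample() draws the marks at random, the figures in this output change on every run. A minimal sketch of a reproducible variant follows. The seed value 42 is an arbitrary assumption, not part of the original example, and rep(..., each = 30) is an equivalent shorthand for the two rep() calls:

```r
# Fix the random number generator so the same marks are drawn on every run.
# The seed 42 is arbitrary and chosen only for illustration.
set.seed(42)

data <- data.frame(
  marks    = sample(40:100, 60, replace = TRUE),
  division = rep(c("A", "B"), each = 30)
)

# One-sample t-test of division A against the overall mean mark
marksA <- data$marks[data$division == "A"]
res <- t.test(marksA, mu = mean(data$marks))
res$p.value
```

Rerunning the block with the same seed reproduces the same data frame and the same test result.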

> boxplot(data$marks~data$division, ylab="Marks", names=c("A", "B"))
> abline(h=mean(data$marks[data$division=="A"]), lwd=2, col="red")
> abline(h=mean(data$marks[data$division=="B"]), lwd=2, col="blue")

The boxplot of marks for divisions A and B

> t.test(data$marks~data$division)
Output:
 Welch Two Sample t-test
data: data$marks by data$division
t = -1.1407, df = 57.995, p-value = 0.2587
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -15.243379 4.176712
sample estimates:
mean in group A mean in group B 
   69.73333 75.26667
