ANOVA

Analysis of variance (ANOVA) is used to determine whether there are statistically significant differences between the means of three or more independent groups. With only two samples, a t-test can compare the means, but with more than two samples, running a t-test on every pair of groups quickly becomes cumbersome and inflates the chance of a false positive. Here we study the relationship between a quantitative dependent variable, Returns, and a single qualitative independent variable, Stock, which has five levels: Stock1, Stock2, ..., Stock5.
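To see why pairwise t-tests scale poorly, the following minimal sketch counts the comparisons needed for five levels and estimates the chance of at least one false positive. The 0.05 threshold per test and the independence of the tests are simplifying assumptions made only for illustration.

```r
# Number of distinct pairs among k = 5 stock levels
k <- 5
n_pairs <- choose(k, 2)

# Chance of at least one false positive across all pairwise tests,
# assuming each test uses alpha = 0.05 and the tests are independent
# (an idealisation: pairwise tests that share groups are not independent)
alpha <- 0.05
familywise_error <- 1 - (1 - alpha)^n_pairs

n_pairs           # 10 pairwise comparisons
familywise_error  # roughly 0.40
```

Under these assumptions, ten separate tests would flag a spurious difference about 40% of the time, which is one reason a single overall test is preferred.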

We can compare the five levels of stock visually by means of a box plot. First, load the data and inspect it by executing the following code:

> DataANOVA = read.csv("C:/Users/prashant.vats/Desktop/Projects/BOOK R/DataAnova.csv")
> head(DataANOVA)

This displays the first few rows of the data used for the analysis in tabular format:

  Returns  Stock
1    1.64 Stock1
2    1.72 Stock1
3    1.68 Stock1
4    1.77 Stock1
5    1.56 Stock1
6    1.95 Stock1

> boxplot(DataANOVA$Returns ~ DataANOVA$Stock)

This draws one box per level of stock, as shown in the following figure:

Figure 3.9: Box plot of different levels of stock
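The numbers behind a box plot can also be inspected directly. The sketch below uses a hypothetical, simulated data frame named sim as a stand-in for DataANOVA, since the CSV file is not reproduced here; the mean, standard deviation, and group size are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical stand-in for DataANOVA: 5 stock levels, 20 returns each,
# all drawn from the same normal distribution (assumed values)
sim <- data.frame(
  Returns = rnorm(100, mean = 1.7, sd = 0.15),
  Stock   = factor(rep(paste0("Stock", 1:5), each = 20))
)

# plot = FALSE returns the statistics behind the box plot instead of drawing it
bp <- boxplot(Returns ~ Stock, data = sim, plot = FALSE)

# Rows: lower whisker, lower hinge, median, upper hinge, upper whisker
box_stats <- bp$stats
colnames(box_stats) <- bp$names
box_stats
```

Because every simulated level shares the same true mean, any differences between the medians in this output come from sampling noise alone.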

The preceding box plot suggests that some levels of stock have higher returns than others. If we repeated the procedure with a new sample, we would most likely get different returns. It is possible that all the levels of stock have the same mean return and that we are merely seeing random fluctuation in one set of returns. Let us take as our null hypothesis that there is no difference in mean returns between the levels. Using ANOVA, let us test this hypothesis:

> oneway.test(Returns ~ Stock, data = DataANOVA, var.equal = TRUE)

Executing the preceding code gives the following outcome:

Figure 3.10: Output of ANOVA for different levels of stock

Since the p-value is less than 0.05, we reject the null hypothesis at the 5% significance level. The mean returns are not the same across all levels of stock.
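To make the test less of a black box, the following sketch recomputes the one-way ANOVA F statistic by hand and compares it with oneway.test. It runs on hypothetical simulated data, not on DataANOVA: the group means, standard deviation, and sample sizes are assumptions chosen only so that one level differs.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Hypothetical data: Stock3 is given a higher true mean than the others
true_means <- c(1.7, 1.7, 1.9, 1.7, 1.7)
sim <- data.frame(
  Returns = rnorm(100, mean = rep(true_means, each = 20), sd = 0.15),
  Stock   = factor(rep(paste0("Stock", 1:5), each = 20))
)

k <- nlevels(sim$Stock)  # number of groups
n <- nrow(sim)           # total number of observations

# Mean of each observation's own group, aligned with the rows of sim
fitted_means <- ave(sim$Returns, sim$Stock)

# Between-group and within-group sums of squares
ss_between <- sum((fitted_means - mean(sim$Returns))^2)
ss_within  <- sum((sim$Returns - fitted_means)^2)

# F statistic: between-group variance relative to within-group variance
f_manual <- (ss_between / (k - 1)) / (ss_within / (n - k))
p_manual <- pf(f_manual, k - 1, n - k, lower.tail = FALSE)

# The built-in test with equal variances assumed gives the same values
fit <- oneway.test(Returns ~ Stock, data = sim, var.equal = TRUE)
c(manual = f_manual, oneway = unname(fit$statistic))
c(manual = p_manual, oneway = fit$p.value)
```

A large F statistic means the spread between the group means is large relative to the spread within groups, which is what leads to a small p-value.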
