Chapter 9

Hypotheses Tests

Abstract

This chapter discusses how hypotheses tests fit into statistical inference. The concept of hypotheses tests and their goals is presented here, as well as the procedures for constructing them. Hypotheses tests are classified as parametric or nonparametric, and this chapter focuses mainly on parametric tests (nonparametric tests will be discussed in the following chapter). We define the concepts and assumptions of parametric tests, in addition to their respective advantages and disadvantages. We will study the main types of parametric hypotheses tests and their inherent assumptions, including tests for normality, homogeneity of variance tests, Student’s t-test and its applications, as well as ANOVA and its extensions. Thus, it becomes possible to know when to use each of the parametric tests. Each test is solved analytically and also through IBM SPSS Statistics Software and Stata Statistical Software, and the results obtained are then interpreted.

Keywords

Hypotheses tests; Parametric tests; Normality tests; Homogeneity of variance tests; Student’s t-test; ANOVA

We must conduct research and then accept the results. If they don’t stand up to experimentation, Buddha’s own words must be rejected.

Tenzin Gyatso, 14th Dalai Lama

9.1 Introduction

As discussed previously, one of the problems to be solved by statistical inference is hypotheses testing. A statistical hypothesis is an assumption about a certain population parameter, such as the mean, the standard deviation, or the correlation coefficient. A hypothesis test is a procedure for deciding whether a certain hypothesis is true or false. For a statistical hypothesis to be validated or rejected with certainty, it would be necessary to examine the entire population, which in practice is not viable. As an alternative, we draw a random sample from the population we are interested in. Since the decision is made based on the sample, errors may occur (rejecting a hypothesis when it is true, or not rejecting a hypothesis when it is false), as we will study later on.

The procedures and concepts necessary to construct a hypothesis test will be presented. Let X be a variable associated with a population and θ a certain parameter of this population. We must define the hypothesis to be tested about parameter θ of this population, which is called the null hypothesis:

H0: θ = θ0  (9.1)

Let’s also define the alternative hypothesis (H1), in case H0 is rejected, which can be characterized as follows:

H1: θ ≠ θ0  (9.2)

and the test is called bilateral test (or two-tailed test).

The significance level of a test (α) represents the probability of rejecting the null hypothesis when it is true (it is one of the two errors that may occur, as we will see later). The critical region (CR) or rejection region (RR) of a bilateral test is represented by two tails of the same size, respectively, in the left and right extremities of the distribution curve, and each one of them corresponds to half of the significance level α, as shown in Fig. 9.1.

Fig. 9.1 Critical region (CR) of a bilateral test, also emphasizing the nonrejection region (NR) of the null hypothesis.

Another way to define the alternative hypothesis (H1) would be:

H1: θ < θ0  (9.3)

and the test is called unilateral test to the left (or left-tailed test).

In this case, the critical region is in the left tail of the distribution and corresponds to significance level α, as shown in Fig. 9.2.

Fig. 9.2 Critical region (CR) of a left-tailed test, also emphasizing the nonrejection region of the null hypothesis (NR).

Or the alternative hypothesis could be:

H1: θ > θ0  (9.4)

and the test is called unilateral test to the right (or right-tailed test). In this case, the critical region is in the right tail of the distribution and corresponds to significance level α, as shown in Fig. 9.3.

Fig. 9.3 Critical region (CR) of a right-tailed test.

Thus, if the main objective is to check whether a parameter is significantly higher or lower than a certain value, we have to use a unilateral test. On the other hand, if the objective is to check whether a parameter is different from a certain value, we have to use a bilateral test.
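For a test statistic that follows a standard normal distribution under H0, the boundaries of these critical regions can be recovered from the inverse normal distribution. A minimal sketch in Python, purely for illustration:

```python
from scipy.stats import norm

alpha = 0.05

# Bilateral test: alpha/2 in each tail of the standard normal curve
z_two_tailed = norm.ppf(1 - alpha / 2)   # ≈ 1.96

# Unilateral test: the whole of alpha in a single tail
z_right_tailed = norm.ppf(1 - alpha)     # ≈ 1.645

print(z_two_tailed, z_right_tailed)
```

Note that the two-tailed critical value (1.96) is larger than the one-tailed value (1.645), since the same α is split between two tails.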

After defining the null hypothesis to be tested, we use a random sample collected from the population to decide whether or not to reject it. Since the decision is made based on the sample, two types of errors may happen:

Type I error: rejecting the null hypothesis when it is true. The probability of this type of error is represented by α:

P(type I error) = P(rejecting H0 | H0 is true) = α  (9.5)

Type II error: not rejecting the null hypothesis when it is false. The probability of this type of error is represented by β:

P(type II error) = P(not rejecting H0 | H0 is false) = β  (9.6)

Table 9.1 shows the types of errors that may happen in a hypothesis test.

Table 9.1

Types of Errors
Decision             H0 Is True                 H0 Is False
Not rejecting H0     Correct decision (1 − α)   Type II error (β)
Rejecting H0         Type I error (α)           Correct decision (1 − β)

The procedure for defining hypotheses tests includes the following phases:

  • Step 1: Choosing the most suitable statistical test, depending on the researcher’s intention.
  • Step 2: Presenting the test’s null hypothesis H0 and its alternative hypothesis H1.
  • Step 3: Setting the significance level α.
  • Step 4: Calculating the observed value of the statistic based on the sample obtained from the population.
  • Step 5: Determining the test’s critical region based on the value of α set in Step 3.
  • Step 6: Decision: if the value of the statistic lies in the critical region, reject H0. Otherwise, do not reject H0.

According to Fávero et al. (2009), most statistical software packages, among them SPSS and Stata, calculate the P-value, which corresponds to the probability associated with the value of the statistic calculated from the sample. The P-value indicates the lowest significance level observed that would lead to the rejection of the null hypothesis. Thus, we reject H0 if P ≤ α.

If we use P-value instead of the statistic’s critical value, Steps 5 and 6 of the construction of the hypotheses tests will be:

  • Step 5: Determine the P-value that corresponds to the probability associated to the value of the statistic calculated in Step 4.
  • Step 6: Decision: if P-value is less than the significance level α established in Step 3, reject H0. Otherwise, do not reject H0.
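The P-value decision rule above can be sketched in code. The example below is a minimal illustration in Python, assuming hypothetical sample data and a one-sample t-test of H0: μ = 40 (neither the data nor the tested value comes from the chapter):

```python
from scipy import stats

sample = [42, 38, 45, 41, 39, 44, 40, 43]  # hypothetical data
alpha = 0.05                               # Step 3: significance level

# Step 4: observed value of the statistic under H0: population mean = 40
t_cal, p_value = stats.ttest_1samp(sample, popmean=40)

# Steps 5 and 6 via the P-value: reject H0 when P <= alpha
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(t_cal, p_value, decision)
```

Whatever test is chosen in Step 1, the final comparison of the P-value against α has exactly this form.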

9.2 Parametric Tests

Hypotheses tests are divided into parametric and nonparametric tests. In this chapter, we will study parametric tests. Nonparametric tests will be studied in the next chapter.

Parametric tests involve population parameters. A parameter is any numerical measure or quantitative characteristic that describes a population. Parameters are fixed values, usually unknown, and are represented by Greek letters, such as the population mean (μ), the population standard deviation (σ), and the population variance (σ2), among others.

When hypotheses are formulated about population parameters, the hypothesis test is called parametric. In nonparametric tests, hypotheses are formulated about qualitative characteristics of the population.

Therefore, parametric methods are applied to quantitative data and require strong assumptions in order to be validated, including:

  • (i) The observations must be independent;
  • (ii) The sample must be drawn from populations with a certain distribution, usually normal;
  • (iii) The populations must have equal variances for the comparison tests of two paired population means or k population means (k ≥ 3);
  • (iv) The variables being studied must be measured on an interval or ratio scale, so that arithmetic operations can be applied to their respective values.

We will study the main parametric tests, including tests for normality, homogeneity of variance tests, Student’s t-test and its applications, in addition to the analysis of variance (ANOVA) and its extensions. All of them will be solved in an analytical way and also through the statistical softwares SPSS and Stata.

To verify the univariate normality of the data, the most common tests used are Kolmogorov-Smirnov and Shapiro-Wilk. To compare the variance homogeneity between populations, we have Bartlett’s χ2 (1937), Cochran’s C (1947a,b), Hartley’s Fmax (1950), and Levene’s F (1960) tests.

We will describe Student’s t-test for three situations: to test hypotheses about the population mean, to test hypotheses to compare two independent means, and to compare two paired means.

ANOVA is an extension of Student’s t-test and is used to compare the means of more than two populations. In this chapter, the one-way ANOVA, the two-way ANOVA, and its extension to more than two factors will be described.

9.3 Univariate Tests for Normality

Among all univariate tests for normality, the most common are Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia.

9.3.1 Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov test (K-S) is an adherence test, that is, it compares the cumulative frequency distribution of a set of sample values (values observed) to a theoretical distribution. The main goal is to test if the sample values come from a population with a supposed theoretical or expected distribution, in this case, the normal distribution. The statistic is given by the point with the biggest difference (in absolute values) between the two distributions.

To use the K-S test, the population mean and standard deviation must be known. The test loses power for small samples, so it should be used with large samples (n ≥ 30).

The K-S test assumes the following hypotheses:

  • H0: the sample comes from a population with distribution N(μ, σ)
  • H1: the sample does not come from a population with distribution N(μ, σ)

As specified in Fávero et al. (2009), let Fexp(X) be an expected distribution function (normal) of cumulative relative frequencies of variable X, where Fexp(X) ~ N(μ,σ), and Fobs(X) the observed cumulative relative frequency distribution of variable X. The objective is to test whether Fobs(X) = Fexp(X), in contrast with the alternative that Fobs(X) ≠ Fexp(X).

The statistic can be calculated through the following expression:

Dcal = max{|Fexp(Xi) − Fobs(Xi)|; |Fexp(Xi) − Fobs(Xi−1)|}, for i = 1, …, n  (9.7)

where:

  • Fexp(Xi): expected cumulative relative frequency in category i;
  • Fobs(Xi): observed cumulative relative frequency in category i;
  • Fobs(Xi − 1): observed cumulative relative frequency in category i − 1.

The critical values of Kolmogorov-Smirnov statistic (Dc) are shown in Table G in the Appendix. This table provides the critical values of Dc considering that P(Dcal > Dc) = α (for a right-tailed test). In order for the null hypothesis H0 to be rejected, the value of the Dcal statistic must be in the critical region, that is, Dcal > Dc. Otherwise, we do not reject H0.

P-value (the probability associated to the value of Dcal statistic calculated from the sample) can also be seen in Table G. In this case, we reject H0 if P ≤ α.

Example 9.1

Using the Kolmogorov-Smirnov Test

Table 9.E.1 shows the data on a company’s monthly production of farming equipment in the last 36 months. Check and see if the data in Table 9.E.1 come from a population that follows a normal distribution, considering that α = 5%.

Table 9.E.1

Production of Farming Equipment in the Last 36 Months
52 50 44 50 42 30 36 34 48 40 55 40
30 36 40 42 55 44 38 42 40 38 52 44
52 34 38 44 48 36 36 55 50 34 44 42

Solution

  • Step 1: Since the objective is to verify if the data in Table 9.E.1 come from a population with a normal distribution, the most suitable test is Kolmogorov-Smirnov (K-S).
  • Step 2: The K-S test hypotheses for this example are:

H0: the production of farming equipment in the population follows distribution N(μ, σ)

H1: the production of farming equipment in the population does not follow distribution N(μ, σ)

  • Step 3: The significance level to be considered is 5%.
  • Step 4: All the steps necessary to calculate Dcal from Expression (9.7) are specified in Table 9.E.2.

Table 9.E.2

Calculating the Kolmogorov-Smirnov Statistic
Xi   Fabs(a)  Fac(b)  Fracobs(c)  Zi(d)     Fracexp(e)  |Fexp(Xi) − Fobs(Xi)|  |Fexp(Xi) − Fobs(Xi − 1)|
30   2        2       0.056       −1.7801   0.0375      0.018                  0.036
34   3        5       0.139       −1.2168   0.1118      0.027                  0.056
36   4        9       0.250       −0.9351   0.1743      0.076                  0.035
38   3        12      0.333       −0.6534   0.2567      0.077                  0.007
40   4        16      0.444       −0.3717   0.3551      0.089                  0.022
42   4        20      0.556       −0.0900   0.4641      0.092                  0.020
44   5        25      0.694       0.1917    0.5760      0.118                  0.020
48   2        27      0.750       0.7551    0.7749      0.025                  0.081
50   3        30      0.833       1.0368    0.8501      0.017                  0.100
52   3        33      0.917       1.3185    0.9064      0.010                  0.073
55   3        36      1           1.7410    0.9592      0.041                  0.043

a Absolute frequency.

b Cumulative (absolute) frequency.

c Observed cumulative relative frequency of Xi.

d Standardized Xi values according to the expression Zi = (Xi − X̄)/S.

e Expected cumulative relative frequency of Xi and it corresponds to the probability obtained in Table E in the Appendix (standard normal distribution table) from the value of Zi.

Therefore, the value of the K-S statistic calculated from the sample is Dcal = 0.118.

  • Step 5: According to Table G in the Appendix, for n = 36 and α = 5%, the critical value of the Kolmogorov-Smirnov statistic is Dc = 0.23.
  • Step 6: Decision: since the value calculated is not in the critical region (Dcal < Dc), the null hypothesis is not rejected, which allows us to conclude, with a 95% confidence level, that the sample is drawn from a population that follows a normal distribution.

If we use P-value instead of the statistic’s critical value, Steps 5 and 6 will be:

  • Step 5: According to Table G in the Appendix, for a sample size n = 36, the probability associated to Dcal = 0.118 has as its lowest limit P = 0.20.
  • Step 6: Decision: since P > 0.05, we do not reject H0.
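For readers working outside SPSS or Stata, the same test can be sketched in Python with scipy.stats.kstest, supplying the sample mean and standard deviation as the parameters of the hypothesized normal distribution, as done in Example 9.1. (Strictly, estimating μ and σ from the same sample calls for the Lilliefors correction, which neither this table-based approach nor the code below applies.)

```python
import numpy as np
from scipy import stats

# Production of farming equipment, Table 9.E.1 (n = 36)
x = np.array([52, 50, 44, 50, 42, 30, 36, 34, 48, 40, 55, 40,
              30, 36, 40, 42, 55, 44, 38, 42, 40, 38, 52, 44,
              52, 34, 38, 44, 48, 36, 36, 55, 50, 34, 44, 42])

mu, sigma = x.mean(), x.std(ddof=1)          # 42.639 and 7.0999
d_cal, p_value = stats.kstest(x, 'norm', args=(mu, sigma))
print(d_cal, p_value)  # D ≈ 0.118; P > 0.05, so H0 is not rejected
```

The statistic matches the hand calculation of Step 4; the exact P-value scipy reports is larger than the 0.20 lower bound read from Table G, leading to the same decision.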

9.3.2 Shapiro-Wilk Test

The Shapiro-Wilk test (S-W) is based on Shapiro and Wilk (1965) and can be applied to samples with 4 ≤ n ≤ 2000 observations; it is an alternative to the Kolmogorov-Smirnov test for normality (K-S) in the case of small samples (n < 30).

Analogous to the K-S test, the S-W test for normality assumes the following hypotheses:

  • H0: the sample comes from a population with distribution N(μ, σ)
  • H1: the sample does not come from a population with distribution N(μ, σ)

The calculation of the Shapiro-Wilk statistic (Wcal) is given by:

Wcal = b² / Σ(i=1 to n) (Xi − X̄)², for i = 1, …, n  (9.8)

b = Σ(i=1 to n/2) ai,n · (X(n−i+1) − X(i))  (9.9)

where:

X(i) are the sample statistics of order i, that is, the i-th ordered observation, so, X(1) ≤ X(2) ≤ … ≤ X(n);

X̄ is the mean of X;

ai, n are constants generated from the means, variances, and covariances of the statistics of order i of a random sample of size n from a normal distribution. Their values can be seen in Table H2 in the Appendix.

Small values of Wcal indicate that the distribution of the variable being studied is not normal. The critical values of Shapiro-Wilk statistic Wc are shown in Table H1 in the Appendix. Different from most tables, this table provides the critical values of Wc considering that P(Wcal < Wc) = α (for a left-tailed test). In order for the null hypothesis H0 to be rejected, the value of the Wcal statistic must be in the critical region, that is, Wcal < Wc. Otherwise, we do not reject H0.

P-value (the probability associated to the value of Wcal statistic calculated from the sample) can also be seen in Table H1. In this case, we reject H0 if P ≤ α.

Example 9.2

Using the Shapiro-Wilk Test

Table 9.E.3 shows the data on an aerospace company’s monthly production of aircraft in the last 24 months. Check and see if the data in Table 9.E.3 come from a population with a normal distribution, considering that α = 1%.

Table 9.E.3

Production of Aircraft in the Last 24 Months
28 32 46 24 22 18 20 34 30 24 31 29
15 19 23 25 28 30 32 36 39 16 23 36

Solution

  • Step 1: For a normality test in which n < 30, the most recommended test is the Shapiro-Wilk (S-W).
  • Step 2: The S-W test hypotheses for this example are:

H0: the production of aircraft in the population follows normal distribution N(μ, σ)

H1: the production of aircraft in the population does not follow normal distribution N(μ, σ)

  • Step 3: The significance level to be considered is 1%.
  • Step 4: The calculation of the S-W statistic for the data in Table 9.E.3, according to Expressions (9.8) and (9.9), is shown.

First of all, to calculate b, we must sort the data in Table 9.E.3 in ascending order, as shown in Table 9.E.4.

Table 9.E.4

Values From Table 9.E.3 Sorted in Ascending Order
15 16 18 19 20 22 23 23 24 24 25 28
28 29 30 30 31 32 32 34 36 36 39 46

All the steps necessary to calculate b, from Expression (9.9), are specified in Table 9.E.5. The values of ai,n were obtained from Table H2 in the Appendix.

Table 9.E.5

Procedure to Calculate b
i    n − i + 1  ai,n    X(n − i + 1)  X(i)  ai,n (X(n − i + 1) − X(i))
1    24         0.4493  46            15    13.9283
2    23         0.3098  39            16    7.1254
3    22         0.2554  36            18    4.5972
4    21         0.2145  36            19    3.6465
5    20         0.1807  34            20    2.5298
6    19         0.1512  32            22    1.5120
7    18         0.1245  32            23    1.1205
8    17         0.0997  31            23    0.7976
9    16         0.0764  30            24    0.4584
10   15         0.0539  30            24    0.3234
11   14         0.0321  29            25    0.1284
12   13         0.0107  28            28    0.0000
                                            b = 36.1675

We have Σ(i=1 to n) (Xi − X̄)² = (28 − 27.5)² + … + (36 − 27.5)² = 1338

Therefore, Wcal = b² / Σ(i=1 to n) (Xi − X̄)² = (36.1675)² / 1338 = 0.978

  • Step 5: According to Table H1 in the Appendix, for n = 24 and α = 1%, the critical value of the Shapiro-Wilk statistic is Wc = 0.884.
  • Step 6: Decision: the null hypothesis is not rejected, since Wcal > Wc (Table H1 provides the critical values of Wc considering that P(Wcal < Wc) = α), which allows us to conclude, with a 99% confidence level, that the sample is drawn from a population with a normal distribution.

If we use P-value instead of the statistic’s critical value, Steps 5 and 6 will be:

  • Step 5: According to Table H1 in the Appendix, for a sample size n = 24, the probability associated to Wcal = 0.978 is between 0.50 and 0.90 (a probability of 0.90 is associated to Wcal = 0.981).
  • Step 6: Decision: since P > 0.01, we do not reject H0.
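The same test can be sketched in Python. Note that scipy.stats.shapiro implements Royston's approximation of the S-W test, so the statistic may differ from the table-based hand calculation in the last decimals:

```python
import numpy as np
from scipy import stats

# Production of aircraft, Table 9.E.3 (n = 24)
x = np.array([28, 32, 46, 24, 22, 18, 20, 34, 30, 24, 31, 29,
              15, 19, 23, 25, 28, 30, 32, 36, 39, 16, 23, 36])

w_cal, p_value = stats.shapiro(x)
print(w_cal, p_value)  # W ≈ 0.978; P > 0.01, so H0 is not rejected
```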

9.3.3 Shapiro-Francia Test

This test is based on Shapiro and Francia (1972). According to Sarkadi (1975), the Shapiro-Wilk (S-W) and Shapiro-Francia tests (S-F) have the same format, being different only when it comes to defining the coefficients. Moreover, calculating the S-F test is much simpler and it can be considered a simplified version of the S-W test. Despite its simplicity, it is as robust as the Shapiro-Wilk test, making it a substitute for the S-W.

The Shapiro-Francia test can be applied to samples with 5 ≤ n ≤ 5000 observations, and it is similar to the Shapiro-Wilk test for large samples.

Analogous to the S-W test, the S-F test assumes the following hypotheses:

  • H0: the sample comes from a population with distribution N(μ, σ)
  • H1: the sample does not come from a population with distribution N(μ, σ)

The calculation of the Shapiro-Francia statistic (Wcal′) is given by:

W′cal = [Σ(i=1 to n) mi · X(i)]² / [Σ(i=1 to n) mi² · Σ(i=1 to n) (Xi − X̄)²], for i = 1, …, n  (9.10)

where:

  • X(i) are the sample statistics of order i, that is, the ith ordered observation, so, X(1) ≤ X(2) ≤ … ≤ X(n);
  • mi is the approximate expected value of the ith observation (Z-score). The values of mi are estimated by:

mi = Φ⁻¹(i / (n + 1))  (9.11)

where Φ⁻¹ is the inverse of the cumulative distribution function of the standard normal distribution (mean zero and standard deviation 1). These values can be obtained from Table E in the Appendix.

Small values of Wcal′ indicate that the distribution of the variable being studied is not normal. The critical values of the Shapiro-Francia statistic (Wc′) are shown in Table H1 in the Appendix. Different from most tables, this table provides the critical values of Wc′ considering that P(Wcal′ < Wc′) = α (for a left-tailed test). In order for the null hypothesis H0 to be rejected, the value of the Wcal′ statistic must be in the critical region, that is, Wcal′ < Wc′. Otherwise, we do not reject H0.

P-value (the probability associated to Wcal′ statistic calculated from the sample) can also be seen in Table H1. In this case, we reject H0 if P ≤ α.

Example 9.3

Using the Shapiro-Francia Test

Table 9.E.6 shows all the data regarding a company’s daily production of bicycles in the last 60 months. Check and see if the data come from a population with a normal distribution, considering α = 5%.

Table 9.E.6

Production of Bicycles in the Last 60 Months
85 70 74 49 67 88 80 91 57 63 66 60
72 81 73 80 55 54 93 77 80 64 60 63
67 54 59 78 73 84 91 57 59 64 68 67
70 76 78 75 80 81 70 77 65 63 59 60
61 74 76 81 79 78 60 68 76 71 72 84

Solution

  • Step 1: The normality of the data can be verified through the Shapiro-Francia test.
  • Step 2: The S-F test hypotheses for this example are:

H0: the production of bicycles in the population follows normal distribution N(μ, σ)

H1: the production of bicycles in the population does not follow normal distribution N(μ, σ)

  • Step 3: The significance level to be considered is 5%.
  • Step 4: The procedure to calculate the S-F statistic for the data in Table 9.E.6 is shown in Table 9.E.7.

Table 9.E.7

Procedure to Calculate the Shapiro-Francia Statistic
i    X(i)  i/(n + 1)  mi        mi X(i)     mi²       (X(i) − X̄)²
1    49    0.0164     −2.1347   −104.5995   4.5569    481.8025
2    54    0.0328     −1.8413   −99.4316    3.3905    287.3025
3    54    0.0492     −1.6529   −89.2541    2.7319    287.3025
4    55    0.0656     −1.5096   −83.0276    2.2789    254.4025
5    57    0.0820     −1.3920   −79.3417    1.9376    194.6025
6    57    0.0984     −1.2909   −73.5841    1.6665    194.6025
7    59    0.1148     −1.2016   −70.8960    1.4439    142.8025
8    59    0.1311     −1.1210   −66.1380    1.2566    142.8025
⋮
60   93    0.9836     2.1347    198.5256    4.5569    486.2025
Sum                              574.6704    53.1904   6278.8500

Therefore, W′cal = (574.6704)² / (53.1904 × 6278.8500) = 0.989

  • Step 5: According to Table H1 in the Appendix, for n = 60 and α = 5%, the critical value of the Shapiro-Francia statistic is Wc′ = 0.9625.
  • Step 6: Decision: the null hypothesis is not rejected because Wcal′ > Wc′ (Table H1 provides the critical values of Wc′ considering that P(Wcal′ < Wc′) = α), which allows us to conclude, with a 95% confidence level, that the sample is drawn from a population that follows a normal distribution.

If we used P-value instead of the statistic’s critical value, Steps 5 and 6 would be:

  • Step 5: According to Table H1 in the Appendix, for a sample size n = 60, the probability associated to Wcal′ = 0.989 is greater than 0.10 (P-value).
  • Step 6: Decision: since P > 0.05, we do not reject H0.
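The Shapiro-Francia statistic is not built into scipy, but Expressions (9.10) and (9.11) are straightforward to implement directly. A sketch in Python (the function name shapiro_francia is our own):

```python
import numpy as np
from scipy.stats import norm

def shapiro_francia(sample):
    """Shapiro-Francia W' statistic, per Expressions (9.10) and (9.11)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    # Expression (9.11): approximate expected normal order statistics
    m = norm.ppf(np.arange(1, n + 1) / (n + 1))
    # Expression (9.10): squared correlation-type ratio
    return (m @ x) ** 2 / ((m @ m) * ((x - x.mean()) ** 2).sum())

# Production of bicycles, Table 9.E.6 (n = 60)
data = [85, 70, 74, 49, 67, 88, 80, 91, 57, 63, 66, 60,
        72, 81, 73, 80, 55, 54, 93, 77, 80, 64, 60, 63,
        67, 54, 59, 78, 73, 84, 91, 57, 59, 64, 68, 67,
        70, 76, 78, 75, 80, 81, 70, 77, 65, 63, 59, 60,
        61, 74, 76, 81, 79, 78, 60, 68, 76, 71, 72, 84]
print(shapiro_francia(data))  # ≈ 0.989, as in Example 9.3
```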

9.3.4 Solving Tests for Normality by Using SPSS Software

The Kolmogorov-Smirnov and Shapiro-Wilk tests for normality can be solved by using IBM SPSS Statistics Software. The Shapiro-Francia test, on the other hand, will be elaborated through the Stata software, as we will see in the next section.

Based on the procedure that will be described, SPSS shows the results of the K-S and the S-W tests for the sample selected. The use of the images in this section has been authorized by the International Business Machines Corporation©.

Let’s consider the data presented in Example 9.1 that are available in the file Production_FarmingEquipment.sav. Let´s open the file and select Analyze → Descriptive Statistics → Explore …, as shown in Fig. 9.4.

Fig. 9.4 Procedure for elaborating a univariate normality test on SPSS for Example 9.1.

From the Explore dialog box, we must select the variable we are interested in on the Dependent List, as shown in Fig. 9.5. Let's click on Plots … (the Explore: Plots dialog box will open) and select the option Normality plots with tests (Fig. 9.6). Finally, let's click on Continue and on OK.

Fig. 9.5 Selecting the variable of interest.
Fig. 9.6 Selecting the normality test on SPSS.

The results of the Kolmogorov-Smirnov and Shapiro-Wilk tests for normality for the data in Example 9.1 are shown in Fig. 9.7.

Fig. 9.7 Results of the tests for normality for Example 9.1 on SPSS.

According to Fig. 9.7, the result of the K-S statistic was 0.118, similar to the value calculated in Example 9.1. Since the sample has more than 30 elements, we should only use the K-S test to verify the normality of the data (the S-W test was applied to Example 9.2). Nevertheless, SPSS also makes the result of the S-W statistic available for the sample selected.

As presented in the introduction of this chapter, SPSS calculates the P-value that corresponds to the lowest significance level observed that would lead to the rejection of the null hypothesis. For the K-S and S-W tests the P-value corresponds to the lowest value of P from which Dcal > Dc and Wcal < Wc. As shown in Fig. 9.7, the value of P for the K-S test was of 0.200 (this probability can also be obtained from Table G in the Appendix, as shown in Example 9.1). Since P > 0.05, we do not reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the data distribution is normal. The S-W test also allows us to conclude that the data distribution follows a normal distribution.

Applying the same procedure to verify the normality of the data in Example 9.2 (the data are available in the file Production_Aircraft.sav), we get the results shown in Fig. 9.8.

Fig. 9.8 Results of the tests for normality for Example 9.2 on SPSS.

Analogous to Example 9.2, the result of the S-W test was 0.978. The K-S test was not applied to this example due to the sample size (n < 30). The P-value of the S-W test is 0.857 (in Example 9.2, we saw that this probability would be between 0.50 and 0.90 and closer to 0.90) and, since P > 0.01, the null hypothesis is not rejected, which allows us to conclude that the data distribution in the population follows a normal distribution. We will use this test when estimating regression models in Chapter 13.

For this example, we can also conclude from the K-S test that the data distribution follows a normal distribution.

9.3.5 Solving Tests for Normality by Using Stata

The Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia tests for normality can be solved by using Stata Statistical Software. The Kolmogorov-Smirnov test will be applied to Example 9.1, the Shapiro-Wilk test to Example 9.2, and the Shapiro-Francia test to Example 9.3. The use of the images in this section has been authorized by StataCorp LP©.

9.3.5.1 Kolmogorov-Smirnov Test on the Stata Software

The data presented in Example 9.1 are available in the file Production_FarmingEquipment.dta. Let’s open this file and verify that the name of the variable being studied is production.

To elaborate the Kolmogorov-Smirnov test on Stata, we must specify the mean and the standard deviation of the variable that interests us in the test syntax, so the command summarize (or simply sum) must be typed first, followed by the respective variable:

sum production

and we get Fig. 9.9. Therefore, we can see that the mean is 42.63889 and the standard deviation is 7.099911.

Fig. 9.9 Descriptive statistics of the variable production.

The Kolmogorov-Smirnov test is given by the following command:

ksmirnov production = normal((production-42.63889)/7.099911)

The result of the test can be seen in Fig. 9.10. We can see that the value of the statistic is similar to the one calculated in Example 9.1 and by SPSS software. Since P > 0.05, we conclude that the data distribution is normal.

Fig. 9.10 Results of the Kolmogorov-Smirnov test on Stata.

9.3.5.2 Shapiro-Wilk Test on the Stata Software

The data presented in Example 9.2 are available in the file Production_Aircraft.dta. To elaborate the Shapiro-Wilk test on Stata, the syntax of the command is:

swilk variables⁎

where the term variables⁎ should be substituted for the list of variables being considered. For the data in Example 9.2, we have a single variable called production, so, the command to be typed is:

swilk production

The result of the Shapiro-Wilk test can be seen in Fig. 9.11. Since P > 0.05, we can conclude that the sample comes from a population with a normal distribution.

Fig. 9.11 Results of the Shapiro-Wilk test for Example 9.2 on Stata.

9.3.5.3 Shapiro-Francia Test on the Stata Software

The data presented in Example 9.3 are available in the file Production_Bicycles.dta. To elaborate the Shapiro-Francia test on Stata, the syntax of the command is:

sfrancia variables⁎

where the term variables⁎ should be substituted for the list of variables being considered. For the data in Example 9.3, we have a single variable called production, so, the command to be typed is:

sfrancia production

The result of the Shapiro-Francia test can be seen in Fig. 9.12. We can see that the value is similar to the one calculated in Example 9.3 (W ′ = 0.989). Since P > 0.05, we conclude that the sample comes from a population with a normal distribution.

Fig. 9.12 Results of the Shapiro-Francia test for Example 9.3 on Stata.

We will use this test when estimating regression models in Chapter 13.

9.4 Tests for the Homogeneity of Variances

One of the conditions to apply a parametric test to compare k population means is that the population variances, estimated from k representative samples, be homogeneous or equal. The most common tests to verify variance homogeneity are Bartlett’s χ2 (1937), Cochran’s C (1947a,b), Hartley’s Fmax (1950), and Levene’s F (1960) tests.

In the null hypothesis of variance homogeneity tests, the variances of k populations are homogeneous. In the alternative hypothesis, at least one population variance is different from the others. That is:

H0: σ1² = σ2² = … = σk²
H1: there is at least one pair (i, j) such that σi² ≠ σj², i, j = 1, …, k  (9.12)

9.4.1 Bartlett’s χ2 Test

The original test proposed to verify variance homogeneity among groups is Bartlett’s χ2 test (1937). This test is very sensitive to normality deviations, and Levene’s test is an alternative in this case.

Bartlett’s statistic is calculated from q:

q = (N − k) · ln(Sp²) − Σ(i=1 to k) (ni − 1) · ln(Si²)  (9.13)

where:

  • ni, i = 1, …, k, is the size of each sample i, where Σ(i=1 to k) ni = N;
  • Si², i = 1, …, k, is the variance in each sample i;

and

Sp² = [Σ(i=1 to k) (ni − 1) · Si²] / (N − k)  (9.14)

A correction factor c is applied to q statistic, with the following expression:

c = 1 + [1 / (3(k − 1))] · [Σ(i=1 to k) 1/(ni − 1) − 1/(N − k)]  (9.15)

where Bartlett’s statistic (Bcal) approximately follows a chi-square distribution with k − 1 degrees of freedom:

Bcal = q / c ~ χ²(k − 1)  (9.16)

From the previous expressions, we can see that the higher the difference between the variances, the higher the value of B. On the other hand, if all the sample variances are equal, its value will be zero. To confirm if the null hypothesis of variance homogeneity will be rejected or not, the value calculated must be compared to the statistic’s critical value (χc2), which is available in Table D in the Appendix.

This table provides the critical values of χc2 considering that P(χcal2 > χc2) = α (for a right-tailed test). Therefore, we reject the null hypothesis if Bcal > χc2. On the other hand, if Bcal ≤ χc2, we do not reject H0.

P-value (the probability associated to χcal2 statistic) can also be obtained from Table D. In this case, we reject H0 if P ≤ α.

Example 9.4

Applying Bartlett’s χ2 Test

A chain of supermarkets wishes to study the number of customers they serve every day in order to make strategic operational decisions. Table 9.E.8 shows the data of three stores throughout two weeks. Check if the variances between the groups are homogeneous. Consider α = 5%.

Table 9.E.8

Number of Customers Served Per Day and Per Store
                    Store 1   Store 2    Store 3
Day 1               620       710        924
Day 2               630       780        695
Day 3               610       810        854
Day 4               650       755        802
Day 5               585       699        931
Day 6               590       680        924
Day 7               630       710        847
Day 8               644       850        800
Day 9               595       844        769
Day 10              603       730        863
Day 11              570       645        901
Day 12              605       688        888
Day 13              622       718        757
Day 14              578       702        712
Standard deviation  24.4059   62.2466    78.9144
Variance            595.6484  3874.6429  6227.4780

Solution

If we apply the Kolmogorov-Smirnov or the Shapiro-Wilk test for normality to the data in Table 9.E.8, we will verify that their distribution shows adherence to normality at a 5% significance level, so Bartlett's χ2 test can be applied to compare the homogeneity of the variances between the groups.

  • Step 1: Since the main goal is to compare the equality of the variances between the groups, we can use Bartlett’s χ2 test.
  • Step 2: Bartlett’s χ2 test hypotheses for this example are:

H0: the population variances of all three groups are homogeneous

H1: the population variance of at least one group is different from the others

  • Step 3: The significance level to be considered is 5%.
  • Step 4: The complete calculation of Bartlett’s χ2 statistic is shown. First, we calculate the value of Sp2, according to Expression (9.14):

S_p^2 = \frac{13 \times (595.65 + 3874.64 + 6227.48)}{42 - 3} = 3565.92

Thus, we can calculate q through Expression (9.13):

q = 39 \ln(3565.92) - 13 \left[ \ln(595.65) + \ln(3874.64) + \ln(6227.48) \right] = 14.94

The correction factor c for q statistic is calculated from Expression (9.15):

c = 1 + \frac{1}{3(3-1)} \left( 3 \cdot \frac{1}{13} - \frac{1}{42 - 3} \right) = 1.0256

Finally, we calculate Bcal:

B_{cal} = \frac{q}{c} = \frac{14.94}{1.0256} = 14.567

  • Step 5: According to Table D in the Appendix, for ν = 3 − 1 degrees of freedom and α = 5%, the critical value of Bartlett’s χ2 test is χc2 = 5.991.
  • Step 6: Decision: since the value calculated lies in the critical region (Bcal > χc2), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of at least one group is different from the others.

If we use P-value instead of the statistic’s critical value, Steps 5 and 6 will be:

  • Step 5: According to Table D in the Appendix, for ν = 2 degrees of freedom, the probability associated to χcal2 = 14.567 is less than 0.005 (a probability of 0.005 is associated to χcal2 = 10.597).
  • Step 6: Decision: since P < 0.05, we reject H0.
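The same test can be cross-checked computationally. The sketch below is a minimal illustration (assuming Python with the scipy package, which implements Bartlett’s test), applied to the data in Table 9.E.8; the statistic may differ slightly from the hand calculation, which carries rounded intermediate values.

```python
from scipy import stats

# Number of customers served per day in each store (Table 9.E.8)
store1 = [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578]
store2 = [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702]
store3 = [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712]

# H0: the population variances of the three groups are homogeneous
b_stat, p_value = stats.bartlett(store1, store2, store3)
```

Since b_stat exceeds the critical value χc2 = 5.991 (and p_value < 0.05), the conclusion matches Step 6: H0 is rejected.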

9.4.2 Cochran’s C Test

Cochran’s C test (Cochran, 1947a,b) compares the group with the highest variance to the others. The test requires that the data follow a normal distribution.

Cochran’s C statistic is given by:

C_{cal} = \frac{S_{max}^2}{\sum_{i=1}^{k} S_i^2}    (9.17)

where:

Smax2 is the highest variance in the sample;

Si2 is the variance in sample i, i = 1, …, k.

According to Expression (9.17), if all the variances are equal, the value of the Ccal statistic is 1/k. The greater the difference between Smax2 and the other variances, the closer the value of Ccal gets to 1. To determine whether the null hypothesis is rejected, the calculated value must be compared to the critical value of Cochran’s statistic (Cc), which is available in Table M in the Appendix.

The values of Cc vary depending on the number of groups (k), the number of degrees of freedom ν = max(ni − 1), and the value of α. Table M provides the critical values of Cc considering that P(Ccal > Cc) = α (for a right-tailed test). Thus, we reject H0 if Ccal > Cc. Otherwise, we do not reject H0.

Example 9.5

Applying Cochran’s C Test

Use Cochran’s C test for the data in Example 9.4. The main objective here is to compare the group with the highest variability in relation to the others.

Solution

  • Step 1: Since the objective is to compare the group with the highest variance (group 3—see Table 9.E.8) in relation to the others, Cochran’s C test is the most recommended.
  • Step 2: Cochran’s C test hypotheses for this example are:

H0: the population variance of group 3 is equal to the others

H1: the population variance of group 3 is different from the others

  • Step 3: The significance level to be considered is 5%.
  • Step 4: From Table 9.E.8, we can see that Smax2 = 6227.48. Therefore, the calculation of Cochran’s C statistic is given by:

C_{cal} = \frac{S_{max}^2}{\sum_{i=1}^{k} S_i^2} = \frac{6227.48}{595.65 + 3874.64 + 6227.48} = 0.582

  • Step 5: According to Table M in the Appendix, for k = 3, ν = 13, and α = 5%, the critical value of Cochran’s C statistic is Cc = 0.575.
  • Step 6: Decision: since the value calculated lies in the critical region (Ccal > Cc), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of group 3 is different from the others.
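Cochran’s C statistic is simple enough to compute directly from Expression (9.17). The short sketch below (plain Python, using the sample variances from Table 9.E.8) reproduces the hand calculation.

```python
# Sample variances of the three stores (Table 9.E.8)
variances = [595.6484, 3874.6429, 6227.4780]

# Cochran's C (Expression (9.17)): largest variance over the sum of variances
c_cal = max(variances) / sum(variances)
print(round(c_cal, 3))  # 0.582
```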

9.4.3 Hartley’s Fmax Test

Hartley’s Fmax test (Hartley, 1950) uses a statistic that is the ratio between the highest group variance (Smax2) and the lowest group variance (Smin2):

F_{max,cal} = \frac{S_{max}^2}{S_{min}^2}    (9.18)

The test assumes that the number of observations per group is equal (n1 = n2 = … = nk = n). If all the variances are equal, the value of Fmax will be 1. The greater the difference between Smax2 and Smin2, the higher the value of Fmax. To determine whether the null hypothesis of variance homogeneity is rejected, the calculated value must be compared to the statistic’s critical value (Fmax,c), which is available in Table N in the Appendix. The critical values vary depending on the number of groups (k), the number of degrees of freedom ν = n − 1, and the value of α; Table N provides the critical values of Fmax,c considering that P(Fmax,cal > Fmax,c) = α (for a right-tailed test). Therefore, we reject the null hypothesis H0 of variance homogeneity if Fmax,cal > Fmax,c. Otherwise, we do not reject H0.

The P-value (the probability associated with the Fmax,cal statistic) can also be obtained from Table N in the Appendix. In this case, we reject H0 if P ≤ α.

Example 9.6

Applying Hartley’s Fmax Test

Use Hartley’s Fmax test for the data in Example 9.4. The goal here is to compare the group with the highest variability to the group with the lowest variability.

Solution

  • Step 1: Since the main objective is to compare the group with the highest variance (group 3—see Table 9.E.8) to the group with the lowest variance (group 1), Hartley’s Fmax test is the most recommended.
  • Step 2: Hartley’s Fmax test hypotheses for this example are:

H0: the population variance of group 3 is the same as group 1

H1: the population variance of group 3 is different from group 1

  • Step 3: The significance level to be considered is 5%.
  • Step 4: From Table 9.E.8, we can see that Smin2 = 595.65 and Smax2 = 6227.48. Therefore, the calculation of Hartley’s Fmax statistic is given by:

F_{max,cal} = \frac{S_{max}^2}{S_{min}^2} = \frac{6227.48}{595.65} = 10.45

  • Step 5: According to Table N in the Appendix, for k = 3, ν = 13, and α = 5%, the critical value of the test is Fmax,c = 3.953.
  • Step 6: Decision: since the value calculated lies in the critical region (Fmax,cal > Fmax,c), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of group 3 is different from the population variance of group 1.

If we use P-value instead of the statistic’s critical value, Steps 5 and 6 will be:

  • Step 5: According to Table N in the Appendix, the probability associated to Fmax,cal = 10.45, for k = 3 and ν = 13, is less than 0.01.
  • Step 6: Decision: since P < 0.05, we reject H0.
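Hartley’s statistic can likewise be computed directly from Expression (9.18), as in this plain-Python sketch using the sample variances from Table 9.E.8:

```python
# Sample variances of the three stores (Table 9.E.8)
variances = [595.6484, 3874.6429, 6227.4780]

# Hartley's Fmax (Expression (9.18)): largest variance over smallest variance
f_max = max(variances) / min(variances)
print(round(f_max, 2))  # 10.45
```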

9.4.4 Levene’s F-Test

The advantage of Levene’s F-test over the other homogeneity of variance tests is that it is less sensitive to deviations from normality, being considered a more robust test.

Levene’s statistic is given by Expression (9.19) and it follows an F-distribution, approximately, with ν1 = k − 1 and ν2 = N − k degrees of freedom, for a significance level α:

F_{cal} = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^{k} n_i (\bar{Z}_i - \bar{Z})^2}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (Z_{ij} - \bar{Z}_i)^2} \sim F_{k-1,\,N-k}    (9.19)

where:

  • ni is the size of each one of the k samples (i = 1, …, k);
  • N is the size of the global sample (N = n1 + n2 + ⋯ + nk);
  • Zij = |Xij − X̄i|, i = 1, …, k and j = 1, …, ni;
  • Xij is observation j in sample i;
  • X̄i is the mean of sample i;
  • Z̄i is the mean of Zij in sample i;
  • Z̄ is the mean of Zij in the global sample.

An expansion of Levene’s test can be found in Brown and Forsythe (1974).

From the F-distribution table (Table A in the Appendix), we can determine the critical values of Levene’s statistic (Fc = Fk − 1,N − k,α). Table A provides the critical values of Fc considering that P(Fcal > Fc) = α (right-tailed table). In order for the null hypothesis H0 to be rejected, the value of the statistic must be in the critical region, that is, Fcal > Fc. If Fcal ≤ Fc, we do not reject H0.

The P-value (the probability associated with the Fcal statistic) can also be obtained from Table A. In this case, we reject H0 if P ≤ α.

Example 9.7

Applying Levene’s Test

Elaborate Levene’s test for the data in Example 9.4.

Solution

  • Step 1: Levene’s test can be applied to check variance homogeneity between the groups, and it is more robust than the other tests.
  • Step 2: Levene’s test hypotheses for this example are:

H0: the population variances of all three groups are homogeneous

H1: the population variance of at least one group is different from the others

  • Step 3: The significance level to be considered is 5%.
  • Step 4: The calculation of the Fcal statistic, according to Expression (9.19), is shown.

Table 9.E.9

Calculating the Fcal Statistic
i    X1j    Z1j = |X1j − X̄1|    Z1j − Z̄1    (Z1j − Z̄1)²
1    620    10.571    − 9.429     88.898
1    630    20.571      0.571      0.327
1    610     0.571    − 19.429   377.469
1    650    40.571     20.571    423.184
1    585    24.429      4.429     19.612
1    590    19.429    − 0.571      0.327
1    630    20.571      0.571      0.327
1    644    34.571     14.571    212.327
1    595    14.429    − 5.571     31.041
1    603     6.429    − 13.571   184.184
1    570    39.429     19.429    377.469
1    605     4.429    − 15.571   242.469
1    622    12.571    − 7.429     55.184
1    578    31.429     11.429    130.612
X̄1 = 609.429    Z̄1 = 20    Sum = 2143.429

i    X2j    Z2j = |X2j − X̄2|    Z2j − Z̄2    (Z2j − Z̄2)²
2    710     27.214    − 23.204    538.429
2    780     42.786    − 7.633      58.257
2    810     72.786     22.367     500.298
2    755     17.786    − 32.633   1064.890
2    699     38.214    − 12.204    148.940
2    680     57.214      6.796      46.185
2    710     27.214    − 23.204    538.429
2    850    112.786     62.367    3889.686
2    844    106.786     56.367    3177.278
2    730      7.214    − 43.204   1866.593
2    645     92.214     41.796    1746.899
2    688     49.214    − 1.204       1.450
2    718     19.214    − 31.204    973.695
2    702     35.214    − 15.204    231.164
X̄2 = 737.214    Z̄2 = 50.418    Sum = 14,782.192

i    X3j    Z3j = |X3j − X̄3|    Z3j − Z̄3    (Z3j − Z̄3)²
3    924     90.643     24.194     585.344
3    695    138.357     71.908    5170.784
3    854     20.643    − 45.806   2098.201
3    802     31.357    − 35.092   1231.437
3    931     97.643     31.194     973.058
3    924     90.643     24.194     585.344
3    847     13.643    − 52.806   2788.487
3    800     33.357    − 33.092   1095.070
3    769     64.357    − 2.092       4.376
3    863     29.643    − 36.806   1354.691
3    901     67.643      1.194       1.425
3    888     54.643    − 11.806    139.385
3    757     76.357      9.908      98.172
3    712    121.357     54.908    3014.906
X̄3 = 833.36    Z̄3 = 66.449    Sum = 19,140.678


Therefore, the calculation of Fcal is carried out as follows:

F_{cal} = \frac{(42-3)}{(3-1)} \cdot \frac{14(20 - 45.62)^2 + 14(50.418 - 45.62)^2 + 14(66.449 - 45.62)^2}{2143.429 + 14,782.192 + 19,140.678} = 8.427

  • Step 5: According to Table A in the Appendix, for ν1 = 2, ν2 = 39, and α = 5%, the critical value of the test is Fc = 3.24.
  • Step 6: Decision: since the value calculated lies in the critical region (Fcal > Fc), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of at least one group is different from the others.

If we use P-value instead of the statistic’s critical value, Steps 5 and 6 will be:

  • Step 5: According to Table A in the Appendix, for ν1 = 2 and ν2 = 39, the probability associated to Fcal = 8.427 is less than 0.01 (P-value).
  • Step 6: Decision: since P < 0.05, we reject H0.
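Levene’s test is also available in common libraries. The sketch below is a minimal cross-check (assuming Python with scipy); note that scipy’s default, center='median', gives the Brown-Forsythe variant, so center='mean' must be passed to reproduce the original statistic of Expression (9.19).

```python
from scipy import stats

# Number of customers served per day in each store (Table 9.E.8)
store1 = [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578]
store2 = [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702]
store3 = [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712]

# center='mean' reproduces the original Levene statistic (Expression (9.19))
f_stat, p_value = stats.levene(store1, store2, store3, center='mean')
```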

9.4.5 Solving Levene’s Test by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©. To test the variance homogeneity between the groups, SPSS uses Levene’s test. The data presented in Example 9.4 are available in the file CustomerServices_Store.sav. In order to elaborate the test, we must click on Analyze → Descriptive Statistics → Explore …, as shown in Fig. 9.13.

Fig. 9.13 Procedure for elaborating Levene’s test on SPSS.

Let’s include the variable Customer_services in the list of dependent variables (Dependent List) and the variable Store in the factor list (Factor List), as shown in Fig. 9.14.

Fig. 9.14 Selecting the variables to elaborate Levene’s test on SPSS.

Next, we must click on Plots … and select the option Untransformed in Spread vs Level with Levene Test, as shown in Fig. 9.15.

Fig. 9.15 Continuation of the procedure to elaborate Levene’s test on SPSS.

Finally, let’s click on Continue and on OK. The result of Levene’s test can also be obtained through the ANOVA test, by clicking on Analyze → Compare Means → One-Way ANOVA …. In Options …, we must select the option Homogeneity of variance test (Fig. 9.16).

Fig. 9.16 Results of Levene’s test for Example 9.4 on SPSS.

The value of Levene’s statistic is 8.427, exactly the same as the one calculated previously. Since the significance level observed is 0.001, a value lower than 0.05, the test shows the rejection of the null hypothesis, which allows us to conclude, with a 95% confidence level, that the population variances are not homogeneous.

9.4.6 Solving Levene’s Test by Using the Stata Software

The use of the images in this section has been authorized by StataCorp LP©.

Levene’s statistical test for equality of variances is calculated on Stata by using the command robvar (robust-test for equality of variances), which has the following syntax:

robvar variable⁎, by(groups⁎)

in which the term variable⁎ should be substituted for the quantitative variable studied and the term groups⁎ by the categorical variable that represents them.

Let’s open the file CustomerServices_Store.dta that contains the data of Example 9.7. The three groups are represented by the variable store and the number of customers served by the variable services. Therefore, the command to be typed is:

robvar services, by(store)

The result of the test can be seen in Fig. 9.17. We can verify that the value of the statistic (8.427) is similar to the one calculated in Example 9.7 and to the one generated on SPSS, as well as the calculation of the probability associated to the statistic (0.001). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the variances are not homogeneous.

Fig. 9.17 Results of Levene’s test for Example 9.7 on Stata.

9.5 Hypotheses Tests Regarding a Population Mean (μ) From One Random Sample

The main goal is to test if a population mean assumes a certain value or not.

9.5.1 Z Test When the Population Standard Deviation (σ) Is Known and the Distribution Is Normal

This test is applied when a random sample of size n is obtained from a population with a normal distribution, whose mean (μ) is unknown and whose standard deviation (σ) is known. If the distribution of the population is not known, it is necessary to work with large samples (n > 30), because the central limit theorem guarantees that, as the sample size grows, the sampling distribution of the sample mean gets closer and closer to a normal distribution.

For a bilateral test, the hypotheses are:

  • H0: the sample comes from a population with a certain mean (μ = μ0)
  • H1: it challenges the null hypothesis (μ ≠ μ0)

The test statistic used here refers to the sample mean (X̄). In order for the sample mean to be compared to the tabulated value, it must be standardized, so:

Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} \sim N(0,1), \quad \text{where } \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}    (9.20)

The critical values of the zc statistic are shown in Table E in the Appendix. This table provides the critical values of zc considering that P(Zcal > zc) = α (for a right-tailed test). For a bilateral test, we must consider P(Zcal > zc) = α/2, since P(Zcal < − zc) + P(Zcal > zc) = α. The null hypothesis H0 of a bilateral test is rejected if the value of the Zcal statistic lies in the critical region, that is, if Zcal < − zc or Zcal > zc. Otherwise, we do not reject H0.

The unilateral probabilities associated with the Zcal statistic (P1) can also be obtained from Table E. For a unilateral test, we consider that P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.

Example 9.8

Applying the z Test to One Sample

A cereal manufacturer states that the average quantity of food fiber in each portion of its product is at least 4.2 g, with a standard deviation of 1 g. A health care agency wishes to verify whether this statement is true and collects a random sample of 42 portions, in which the average quantity of food fiber is 3.9 g. With a significance level of 5%, is there evidence to reject the manufacturer’s statement?

Solution

  • Step 1: The suitable test for a population mean with a known σ, considering a single sample of size n > 30 (normal distribution), is the z test.
  • Step 2: For this example, the z test hypotheses are:

H0: μ ≥ 4.2 g (statement made by the manufacturer)

H1: μ < 4.2 g

which corresponds to a left-tailed test.

  • Step 3: The significance level to be considered is 5%.

  • Step 4: The calculation of the Zcal statistic, according to Expression (9.20), is:

Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{3.9 - 4.2}{1/\sqrt{42}} = -1.94

  • Step 5: According to Table E in the Appendix, for a left-tailed test with α = 5%, the critical value of the test is zc = − 1.645.
  • Step 6: Decision: since the value calculated lies in the critical region (zcal < − 1.645), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the manufacturer’s average quantity of food fiber is less than 4.2 g.

If, instead of comparing the value calculated to the critical value of the standard normal distribution, we use the calculation of P-value, Steps 5 and 6 will be:

  • Step 5: According to Table E in the Appendix, for a left-tailed test, the probability associated to zcal = − 1.94 is 0.0262 (P-value).
  • Step 6: Decision: since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the manufacturer’s average quantity of food fiber is less than 4.2 g.
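The z statistic and its one-tailed P-value can also be obtained computationally. The sketch below (assuming Python with scipy for the normal CDF) uses the summary data of Example 9.8; the P-value differs slightly from the table value 0.0262 because the table rounds z to −1.94.

```python
from math import sqrt
from scipy.stats import norm

# Summary data from Example 9.8 (left-tailed z test)
x_bar, mu0, sigma, n = 3.9, 4.2, 1.0, 42

z_cal = (x_bar - mu0) / (sigma / sqrt(n))
p_value = norm.cdf(z_cal)  # left-tail probability
print(round(z_cal, 2))  # -1.94
```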

9.5.2 Student’s t-Test When the Population Standard Deviation (σ) Is Not Known

Student’s t-test for one sample is applied when we do not know the population standard deviation (σ), so, its value is estimated from the sample standard deviation (S). However, to substitute σ for S in Expression (9.20), the distribution of the variable will no longer be normal; it will become a Student’s t-distribution with n − 1 degrees of freedom.

Analogous to the z test, Student’s t-test for one sample assumes the following hypotheses for a bilateral test:

  • H0: μ = μ0
  • H1: μ ≠ μ0

And the calculation of the statistic becomes:

T_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1}    (9.21)

The value calculated must be compared to the value in Student’s t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < − tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.18.

Fig. 9.18 Nonrejection region (NR) and critical region (CR) of Student’s t-distribution for a bilateral test.

Therefore, for a bilateral test, the null hypothesis is rejected if Tcal < − tc or Tcal > tc. If − tc ≤ Tcal ≤ tc, we do not reject H0.

The unilateral probabilities associated with the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.

Example 9.9

Applying Student’s t-Test to One Sample

The average processing time of a task using a certain machine has been 18 min. New concepts have been implemented in order to reduce the average processing time. Hence, after a certain period, a sample with 25 elements was collected, showing an average time of 16.808 min and a standard deviation of 2.733 min. Check whether this result represents an improvement in the average processing time. Consider α = 1%.

Solution

  • Step 1: The suitable test for a population mean with an unknown σ is Student’s t-test.
  • Step 2: For this example, Student’s t-test hypotheses are:

H0: μ = 18

H1: μ < 18

which corresponds to a left-tailed test.

  • Step 3: The significance level to be considered is 1%.

  • Step 4: The calculation of the Tcal statistic, according to Expression (9.21), is:

T_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{16.808 - 18}{2.733/\sqrt{25}} = -2.18

  • Step 5: According to Table B in the Appendix, for a left-tailed test with 24 degrees of freedom and α = 1%, the critical value of the test is tc = − 2.492.
  • Step 6: Decision: since the value calculated is not in the critical region (Tcal > − 2.492), the null hypothesis is not rejected, which allows us to conclude, with a 99% confidence level, that there was no improvement in the average processing time.

If, instead of comparing the value calculated to the critical value of Student’s t-distribution, we use the calculation of P-value, Steps 5 and 6 will be:

  • Step 5: According to Table B in the Appendix, for a left-tailed test with 24 degrees of freedom, the probability associated to Tcal = − 2.18 is between 0.01 and 0.025 (P-value).
  • Step 6: Decision: since P > 0.01, we do not reject the null hypothesis.
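The same calculation from summary statistics can be sketched in Python (scipy assumed for the Student’s t CDF):

```python
from math import sqrt
from scipy.stats import t

# Summary data from Example 9.9 (left-tailed test, alpha = 1%)
x_bar, mu0, s, n = 16.808, 18.0, 2.733, 25

t_cal = (x_bar - mu0) / (s / sqrt(n))
p_value = t.cdf(t_cal, df=n - 1)  # left-tail probability
```

The P-value falls between 0.01 and 0.025, as read from Table B, so H0 is not rejected at α = 1%.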

9.5.3 Solving Student’s t-Test for a Single Sample by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

If we wish to test the mean of a single sample, SPSS makes Student’s t-test available. The data in Example 9.9 are available in the file T_test_One_Sample.sav. The procedure to apply the test from Example 9.9 will be described. Initially, let’s select Analyze → Compare Means → One-Sample T Test …, as shown in Fig. 9.19.

Fig. 9.19 Procedure for elaborating the t-test from one sample on SPSS.

We must select the variable Time and specify the value 18 that will be tested in Test Value, as shown in Fig. 9.20.

Fig. 9.20 Selecting the variable and specifying the value to be tested.

Now, we must click on Options … to define the desired confidence level (Fig. 9.21).

Fig. 9.21 Options—defining the confidence level.

Finally, let’s click on Continue and on OK. The results of the test are shown in Fig. 9.22.

Fig. 9.22 Results of the t-test for one sample for Example 9.9 on SPSS.

This figure shows the result of the t-test (similar to the value calculated in Example 9.9) and the associated probability (P-value) for a bilateral test. For a unilateral test, the associated probability is 0.0195 (we saw in Example 9.9 that this probability would be between 0.01 and 0.025). Since 0.0195 > 0.01, we do not reject the null hypothesis, which allows us to conclude, with a 99% confidence level, that there was no improvement in the average processing time.

9.5.4 Solving Student’s t-Test for a Single Sample by Using Stata Software

The use of the images in this section has been authorized by StataCorp LP©.

Student’s t-test is elaborated on Stata by using the command ttest. For one population mean, the test syntax is:

ttest variable⁎ == #

where the term variable⁎ should be substituted for the name of the variable considered in the analysis and # for the value of the population mean to be tested.

The data in Example 9.9 are available in the file T_test_One_Sample.dta. In this case, the variable being analyzed is called time and the goal is to verify if the average processing time is still 18 min, so, the command to be typed is:

ttest time == 18

The result of the test can be seen in Fig. 9.23. We can see that the calculated value of the statistic (− 2.180) is similar to the one calculated in Example 9.9 and also generated on SPSS, as well as the associated probability for a left-tailed test (0.0196). Since P > 0.01, we do not reject the null hypothesis, which allows us to conclude, with a 99% confidence level, that there was no improvement in the processing time.

Fig. 9.23 Results of the t-test for one sample for Example 9.9 on Stata.

9.6 Student’s t-Test to Compare Two Population Means From Two Independent Random Samples

The t-test for two independent samples is applied to compare the means of two random samples (X1i, i = 1, …, n1; X2j, j = 1, …, n2) drawn independently from two populations. In this test, the population variances are unknown.

For a bilateral test, the null hypothesis of the test states that the population means are the same. If the population means are different, the null hypothesis is rejected, so:

  • H0: μ1 = μ2
  • H1: μ1 ≠ μ2

The calculation of the T statistic depends on the comparison of the population variances between the groups.

Case 1: σ12 ≠ σ22

Considering that the population variances are different, the calculation of the T statistic is given by:

T_{cal} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}    (9.22)

with the following degrees of freedom:

\nu = \frac{\left( \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} \right)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}    (9.23)

Case 2: σ12 = σ22

When the population variances are homogeneous, the T statistic is calculated as:

T_{cal} = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}    (9.24)

where:

S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}    (9.25)

and Tcal follows Student’s t-distribution with ν = n1 + n2 − 2 degrees of freedom.

The value calculated must be compared to the value in Student’s t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < − tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.24.

Fig. 9.24 Nonrejection region (NR) and critical region (CR) of Student’s t-distribution for a bilateral test.

Therefore, for a bilateral test, if the value of the statistic lies in the critical region, that is, if Tcal < − tc or Tcal > tc, the test allows us to reject the null hypothesis. On the other hand, if − tc ≤ Tcal ≤ tc, we do not reject H0.

The unilateral probabilities associated with the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
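Expressions (9.23) and (9.25) are easy to implement directly. The helper functions below are an illustrative sketch (plain Python; the names welch_df and pooled_sd are ours, not the book’s):

```python
from math import sqrt

def welch_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom for unequal variances, Expression (9.23)."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def pooled_sd(s1_sq, n1, s2_sq, n2):
    """Pooled standard deviation for equal variances, Expression (9.25)."""
    return sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
```

With equal variances and equal sample sizes, welch_df reduces to n1 + n2 − 2, matching the degrees of freedom of Case 2.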

Example 9.10

Applying Student’s t-Test to Two Independent Samples

A quality engineer believes that the average time to manufacture a certain plastic product may depend on the raw materials used, which come from two different suppliers. A sample with 30 observations from each supplier is collected for a test and the results are shown in Tables 9.E.10 and 9.E.11. For a significance level α = 5%, check if there is any difference between the means.

Solution

  • Step 1: The suitable test to compare two population means with an unknown σ is Student’s t-test for two independent samples.
  • Step 2: For this example, Student’s t-test hypotheses are:

H0: μ1 = μ2

H1: μ1 ≠ μ2

  • Step 3: The significance level to be considered is 5%.
  • Step 4: For the data in Tables 9.E.10 and 9.E.11, we calculate X̄1 = 24.277, X̄2 = 27.530, S12 = 1.810, and S22 = 1.559. Considering that the population variances are homogeneous, according to the solution generated on SPSS, let’s use Expressions (9.24) and (9.25) to calculate the Tcal statistic, as follows:

    Table 9.E.10

    Manufacturing Time Using Raw Materials From Supplier 1
    22.8  23.4  26.2  24.3  22.0  24.8  26.7  25.1  23.1  22.8
    25.6  25.1  24.3  24.2  22.8  23.2  24.7  26.5  24.5  23.6
    23.9  22.8  25.4  26.7  22.9  23.5  23.8  24.6  26.3  22.7


    Table 9.E.11

    Manufacturing Time Using Raw Materials From Supplier 2
    26.8  29.3  28.4  25.6  29.4  27.2  27.6  26.8  25.4  28.6
    29.7  27.2  27.9  28.4  26.0  26.8  27.5  28.5  27.3  29.1
    29.2  25.7  28.4  28.6  27.9  27.4  26.7  26.8  25.6  26.1


S_p = \sqrt{\frac{29 \times 1.810 + 29 \times 1.559}{30 + 30 - 2}} = 1.298

T_{cal} = \frac{24.277 - 27.530}{1.298 \sqrt{\frac{1}{30} + \frac{1}{30}}} = -9.708

with ν = 30 + 30 – 2 = 58 degrees of freedom.

  • Step 5: The critical region of the bilateral test, considering ν = 58 degrees of freedom and α = 5%, can be defined from Student’s t-distribution table (Table B in the Appendix), as shown in Fig. 9.25.
    Fig. 9.25 Critical region of Example 9.10.

For a bilateral test, each one of the tails corresponds to half of significance level α.

  • Step 6: Decision: since the value calculated lies in the critical region, that is, Tcal < − 2.002, we must reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the population means are different.

If, instead of comparing the value calculated to the critical value of Student’s t-distribution, we use the calculation of P-value, Steps 5 and 6 will be:

  • Step 5: According to Table B in the Appendix, for a right-tailed test with ν = 58 degrees of freedom, the probability P1 associated with |Tcal| = 9.708 is less than 0.0005. For a bilateral test, this probability must be doubled (P = 2P1).
  • Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
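Example 9.10 can be cross-checked computationally (assuming Python with scipy; the data are those of Tables 9.E.10 and 9.E.11):

```python
from scipy import stats

# Manufacturing times for suppliers 1 and 2 (Tables 9.E.10 and 9.E.11)
supplier1 = [22.8, 23.4, 26.2, 24.3, 22.0, 24.8, 26.7, 25.1, 23.1, 22.8,
             25.6, 25.1, 24.3, 24.2, 22.8, 23.2, 24.7, 26.5, 24.5, 23.6,
             23.9, 22.8, 25.4, 26.7, 22.9, 23.5, 23.8, 24.6, 26.3, 22.7]
supplier2 = [26.8, 29.3, 28.4, 25.6, 29.4, 27.2, 27.6, 26.8, 25.4, 28.6,
             29.7, 27.2, 27.9, 28.4, 26.0, 26.8, 27.5, 28.5, 27.3, 29.1,
             29.2, 25.7, 28.4, 28.6, 27.9, 27.4, 26.7, 26.8, 25.6, 26.1]

# equal_var=True gives the pooled-variance test (Case 2);
# equal_var=False would give Welch's test (Case 1)
t_stat, p_value = stats.ttest_ind(supplier1, supplier2, equal_var=True)
```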

9.6.1 Solving Student’s t-Test From Two Independent Samples by Using SPSS Software

The data in Example 9.10 are available in the file T_test_Two_Independent_Samples.sav. The procedure for solving Student’s t-test to compare two population means from two independent random samples on SPSS is described. The use of the images in this section has been authorized by the International Business Machines Corporation©.

We must click on Analyze → Compare Means → Independent-Samples T Test …, as shown in Fig. 9.26.

Fig. 9.26 Procedure for elaborating the t-test from two independent samples on SPSS.

Let’s include the variable Time in Test Variable(s) and the variable Supplier in Grouping Variable. Next, let’s click on Define Groups … to define the groups (categories) of the variable Supplier, as shown in Fig. 9.27.

Fig. 9.27 Selecting the variables and defining the groups.

If the confidence level desired by the researcher is different from 95%, the button Options … must be selected to change it. Finally, let’s click on OK. The results of the test are shown in Fig. 9.28.

Fig. 9.28 Results of the t-test for two independent samples for Example 9.10 on SPSS.

The value of the t statistic for the test is − 9.708 and the associated bilateral probability is 0.000 (P < 0.05), which leads us to reject the null hypothesis, and allows us to conclude, with a 95% confidence level, that the population means are different. We can notice that Fig. 9.28 also shows the result of Levene’s test. Since the significance level observed is 0.694, value greater than 0.05, we can also conclude, with a 95% confidence level, that the variances are homogeneous.

9.6.2 Solving Student’s t-Test From Two Independent Samples by Using Stata Software

The use of the images in this section has been authorized by StataCorp LP©.

The t-test to compare the means of two independent groups on Stata is elaborated by using the following syntax:

ttest variable⁎, by(groups⁎)

where the term variable⁎ must be substituted for the quantitative variable being analyzed, and the term groups⁎ for the categorical variable that represents them.

The data in Example 9.10 are available in the file T_test_Two_Independent_Samples.dta. The variable supplier shows the groups of suppliers. The values for each group of suppliers are specified in the variable time. Thus, we must type the following command:

ttest time, by(supplier)

The result of the test can be seen in Fig. 9.29. We can see that the calculated value of the statistic (− 9.708) is similar to the one calculated in Example 9.10 and also generated on SPSS, as well as the associated probability for a bilateral test (0.000). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population means are different.

Fig. 9.29 Results of the t-test for two independent samples for Example 9.10 on Stata.

9.7 Student’s t-Test to Compare Two Population Means From Two Paired Random Samples

This test is applied to check whether the means of two paired or related samples, obtained from the same population (before and after) with a normal distribution, are significantly different or not. Besides the normality of the data of each sample, the test requires the homogeneity of the variances between the groups.

Different from the t-test for two independent samples, first, we must calculate the difference between each pair of values in position i (di = Xbefore,i − Xafter,i, i = 1, …, n) and, after that, test the null hypothesis that the mean of the differences in the population is zero.

For a bilateral test, we have:

  • H0: μd = 0, where μd = μbefore − μafter
  • H1: μd ≠ 0

The Tcal statistic for the test is given by:

\[ T_{cal} = \frac{\bar{d} - \mu_d}{S_d / \sqrt{n}} \sim t_{\nu = n-1} \quad (9.26) \]

where:

\[ \bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} \quad (9.27) \]

and

\[ S_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n-1}} \quad (9.28) \]

The value calculated must be compared to the value in Student’s t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < − tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.30.

Fig. 9.30
Fig. 9.30 Nonrejection region (NR) and critical region (CR) of Student’s t-distribution for a bilateral test.

Therefore, for a bilateral test, the null hypothesis is rejected if Tcal < − tc or Tcal > tc. If − tc ≤ Tcal ≤ tc, we do not reject H0.

The unilateral probabilities associated to Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
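Eqs. (9.26)–(9.28) translate directly into code. The sketch below is a minimal Python implementation (the function name is ours) of the Tcal statistic under H0: μd = 0, verified on a toy pair of samples whose differences are 1, 2, 3.

```python
import math
from statistics import mean, stdev  # stdev() uses the n - 1 denominator, matching Eq. (9.28)

def t_paired(before, after):
    """T statistic of Eq. (9.26) for two paired samples, with nu = n - 1 degrees of freedom."""
    d = [b - a for b, a in zip(before, after)]   # d_i = X_before,i - X_after,i
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))      # Eqs. (9.26)-(9.28) with mu_d = 0
    return t, n - 1

# Toy check: differences are [1, 2, 3], so d-bar = 2 and S_d = 1
t, df = t_paired([4, 6, 8], [3, 4, 5])
print(round(t, 4), df)  # 3.4641 (= 2 / (1 / sqrt(3))), with 2 degrees of freedom
```

The returned t is then compared with the critical value tc from Table B with ν = n − 1 degrees of freedom, as described above.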

Example 9.11

Applying Student’s t-Test to Two Paired Samples

A group of 10 machine operators, responsible for carrying out a certain task, is trained to perform the same task more efficiently. To verify if there is a reduction in the time taken to perform the task, we measured the time spent by each operator, before and after the training course. Test the hypothesis that the population means of both paired samples are similar, that is, that there is no reduction in time taken to perform the task after the training course. Consider α = 5%.

Table 9.E.12

Time Spent Per Operator Before the Training Course
3.2   3.6   3.4   3.8   3.4   3.5   3.7   3.2   3.5   3.9

Table 9.E.13

Time Spent Per Operator After the Training Course
3.0   3.3   3.5   3.6   3.4   3.3   3.4   3.0   3.2   3.6

Solution

  • Step 1: In this case, the most suitable test is Student’s t-test for two paired samples.

Since the test requires the normality of the data in each sample and the homogeneity of the variances between the groups, the K-S or S-W tests, as well as Levene’s test, must be applied for this verification. As we will see in the solution of this example on SPSS, all of these assumptions are validated.

  • Step 2: For this example, Student’s t-test hypotheses are:

H0: μd = 0

H1: μd ≠ 0

  • Step 3: The significance level to be considered is 5%.
  • Step 4: In order to calculate the Tcal statistic, first, we must calculate di:

\[ \bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} = \frac{0.2 + 0.3 + \cdots + 0.3}{10} = 0.19 \]

\[ S_d = \sqrt{\frac{(0.2 - 0.19)^2 + (0.3 - 0.19)^2 + \cdots + (0.3 - 0.19)^2}{9}} = 0.137 \]

\[ T_{cal} = \frac{\bar{d}}{S_d / \sqrt{n}} = \frac{0.19}{0.137 / \sqrt{10}} = 4.385 \]

  • Step 5: The critical region of the bilateral test can be defined from Student’s t-distribution table (Table B in the Appendix), considering ν = 9 degrees of freedom and α = 5%, as shown in Fig. 9.31.
    Fig. 9.31
    Fig. 9.31 Critical region of Example 9.11.

Table 9.E.14

Calculating di
Xbefore,i   3.2   3.6   3.4    3.8   3.4   3.5   3.7   3.2   3.5   3.9
Xafter,i    3.0   3.3   3.5    3.6   3.4   3.3   3.4   3.0   3.2   3.6
di          0.2   0.3   −0.1   0.2   0     0.2   0.3   0.2   0.3   0.3

For a bilateral test, each tail corresponds to half of significance level α.

  • Step 6: Decision: since the value calculated lies in the critical region (Tcal > 2.262), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that there is a significant difference between the times spent by the operators before and after the training course.

If, instead of comparing the value calculated to the critical value of Student’s t-distribution, we used the calculation of P-value, Steps 5 and 6 will be:

  • Step 5: According to Table B in the Appendix, for a right-tailed test with ν = 9 degrees of freedom, the P1 probability associated to Tcal = 4.385 is between 0.0005 and 0.001. For a bilateral test, this probability must be doubled (P = 2P1), so, 0.001 < P < 0.002.
  • Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
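Assuming SciPy is available, the hand computation of Example 9.11 can be checked with `scipy.stats.ttest_rel`, SciPy’s paired t-test:

```python
from scipy import stats

# Times spent per operator (Tables 9.E.12 and 9.E.13)
before = [3.2, 3.6, 3.4, 3.8, 3.4, 3.5, 3.7, 3.2, 3.5, 3.9]
after  = [3.0, 3.3, 3.5, 3.6, 3.4, 3.3, 3.4, 3.0, 3.2, 3.6]

# Paired t-test: H0 is that the population mean of the differences is zero
t, p = stats.ttest_rel(before, after)
print(round(t, 3), round(p, 4))  # t = 4.385, bilateral P = 0.0018
```

Since the bilateral P-value is below 0.05, the null hypothesis is rejected, in agreement with Steps 5 and 6 above.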

9.7.1 Solving Student’s t-Test From Two Paired Samples by Using SPSS Software

First, we must test the normality of the data in each sample, as well as the variance homogeneity between the groups. Using the same procedures described in Sections 9.3.3 and 9.4.5 (the data must be placed in a table the same way as in Section 9.4.5), we obtain Figs. 9.32 and 9.33.

Fig. 9.32
Fig. 9.32 Results of the normality tests on SPSS.
Fig. 9.33
Fig. 9.33 Results of Levene’s test on SPSS.

Based on Fig. 9.32, we conclude that there is normality of the data for each sample. From Fig. 9.33, we can conclude that the variances between the samples are homogeneous.

The use of the images in this section has been authorized by the International Business Machines Corporation©. To solve Student’s t-test for two paired samples on SPSS, we must open the file T_test_Two_Paired_Samples.sav. Then, we have to click on Analyze → Compare Means → Paired-Samples T Test …, as shown in Fig. 9.34.

Fig. 9.34
Fig. 9.34 Procedure for elaborating the t-test from two paired samples on SPSS.

We must select the variable Before and move it to Variable1 and the variable After to Variable2, as shown in Fig. 9.35.

Fig. 9.35
Fig. 9.35 Selecting the variables that will be paired.

If the desired confidence level is different from 95%, we must click on Options … to change it. Finally, let’s click on OK. The results of the test are shown in Fig. 9.36.

Fig. 9.36
Fig. 9.36 Results of the t-test for two paired samples.

The value of the t-test is 4.385 and the significance level observed for a bilateral test is 0.002, a value less than 0.05, which leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that there is a significant difference between the times spent by the operators before and after the training course.

9.7.2 Solving Student’s t-Test From Two Paired Samples by Using Stata Software

The t-test to compare the means of two paired groups will be solved on Stata for the data in Example 9.11. The use of the images in this section has been authorized by StataCorp LP©.

Therefore, let’s open the file T_test_Two_Paired_Samples.dta. The paired variables are called before and after. In this case, we must type the following command:

ttest before == after

The result of the test can be seen in Fig. 9.37. We can see that the calculated value of the statistic (4.385) is the same as the one calculated in Example 9.11 and on SPSS, as is the probability associated to the statistic for a bilateral test (0.0018). Since P < 0.05, we reject the null hypothesis that the times spent by the operators before and after the training course are the same, with a 95% confidence level.

Fig. 9.37
Fig. 9.37 Results of Student’s t-test for two paired samples for Example 9.11 on Stata.

9.8 ANOVA to Compare the Means of More Than Two Populations

ANOVA is a test used to compare the means of three or more populations through the analysis of sample variances. The test is based on a sample obtained from each population, and it aims to determine whether the differences between the sample means suggest significant differences between the population means, or whether such differences are merely a result of the inherent variability of the samples.

ANOVA’s assumptions are:

  1. The samples must be independent of each other;
  2. The data in the populations must follow a normal distribution;
  3. The population variances must be homogeneous.

9.8.1 One-Way ANOVA

One-way ANOVA is an extension of Student’s t-test for two population means, allowing the researcher to compare three or more population means.

The null hypothesis of the test states that the population means are the same. If there is at least one group with a mean that is different from the others, the null hypothesis is rejected.

As stated in Fávero et al. (2009), the one-way ANOVA allows the researcher to verify the effect of a qualitative explanatory variable (factor) on a quantitative dependent variable. Each group includes the observations of the dependent variable in one category of the factor.

Assuming that independent samples are obtained from k populations (k ≥ 3) and that the means of these populations can be represented by μ1, μ2, …, μk, the analysis of variance tests the following hypotheses:

\[ H_0: \mu_1 = \mu_2 = \cdots = \mu_k \qquad H_1: \exists\,(i,j)\; \mu_i \neq \mu_j,\; i \neq j \quad (9.29) \]

According to Maroco (2014), in general, the observations for this type of problem can be represented according to Table 9.2.

Table 9.2

Observations of the One-Way ANOVA

Samples or Groups
1         2         …     k
Y11       Y12       …     Y1k
Y21       Y22       …     Y2k
⋮         ⋮               ⋮
Yn1,1     Yn2,2     …     Ynk,k

where Yij represents observation i of sample or group j (i = 1, …, nj; j = 1, …, k) and nj is the size of sample or group j. The size of the global sample is $N = \sum_{j=1}^{k} n_j$. Pestana and Gageiro (2008) present the following model:

\[ Y_{ij} = \mu_i + \varepsilon_{ij} \quad (9.30) \]

\[ Y_{ij} = \mu + (\mu_i - \mu) + \varepsilon_{ij} \quad (9.31) \]

\[ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij} \quad (9.32) \]

where:

  • μ is the global mean of the population;
  • μi is the mean of sample or group i;
  • αi is the effect of sample or group i;
  • ɛij is the random error.

Therefore, ANOVA assumes that each group comes from a population with a normal distribution, mean μi, and a homogeneous variance, that is, Yij ~ N(μi,σ), resulting in the hypothesis that the errors (residuals) have a normal distribution with a mean equal to zero and a constant variance, that is, ɛij ~ N(0,σ), besides being independent (Fávero et al., 2009).

The technique’s hypotheses are tested from the calculation of the group variances, and that is where the name ANOVA comes from. The technique involves the calculation of the variations between the groups ($\bar{Y}_i - \bar{Y}$) and within each group ($Y_{ij} - \bar{Y}_i$). The residual sum of squares within groups (RSS) is calculated by:

\[ RSS = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2 \quad (9.33) \]

The residual sum of squares between groups, or the sum of squares of the factor (SSF), is given by:

\[ SSF = \sum_{i=1}^{k} n_i (\bar{Y}_i - \bar{Y})^2 \quad (9.34) \]

Therefore, the total sum is:

\[ TSS = RSS + SSF = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y})^2 \quad (9.35) \]

According to Fávero et al. (2009) and Maroco (2014), the ANOVA statistic is given by the division between the variance of the factor (SSF divided by k − 1 degrees of freedom) and the variance of the residuals (RSS divided by N − k degrees of freedom):

\[ F_{cal} = \frac{SSF/(k-1)}{RSS/(N-k)} = \frac{MSF}{MSR} \quad (9.36) \]

where:

  • MSF represents the mean square between groups (estimate of the variance of the factor);
  • MSR represents the mean square within groups (estimate of the variance of the residuals).

Table 9.3 summarizes the calculations of the one-way ANOVA.

Table 9.3

Calculating the One-Way ANOVA
Source of Variation   Sum of Squares                                                    Degrees of Freedom   Mean Squares            F
Between the groups    $SSF = \sum_{i=1}^{k} n_i (\bar{Y}_i - \bar{Y})^2$                k − 1                $MSF = SSF/(k-1)$       $F = MSF/MSR$
Within the groups     $RSS = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2$    N − k                $MSR = RSS/(N-k)$
Total                 $TSS = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y})^2$      N − 1

Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro; Maroco, J., 2014. Análise estatística com o SPSS Statistics, sixth ed. Edições Sílabo, Lisboa.

The value of F can be null or positive, but never negative; hence the F-distribution is asymmetrical to the right, and ANOVA is always a right-tailed test.

The calculated value (Fcal) must be compared to the value in the F-distribution table (Table A in the Appendix). This table provides the critical values of Fc = Fk − 1,N − k,α where P(Fcal > Fc) = α (right-tailed test). Therefore, the one-way ANOVA’s null hypothesis is rejected if Fcal > Fc; otherwise (Fcal ≤ Fc), we do not reject H0.
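The quantities in Table 9.3 can be computed with plain Python. The sketch below (the function name is ours) returns Fcal and its two degrees of freedom for any number of groups, and is checked on three toy groups with identical spread:

```python
from statistics import mean

def f_one_way(groups):
    """F statistic of the one-way ANOVA (Table 9.3): F = MSF / MSR."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand = mean(y for g in groups for y in g)                  # global mean Y-bar
    ssf = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)  # between groups, Eq. (9.34)
    rss = sum((y - mean(g)) ** 2 for g in groups for y in g)    # within groups, Eq. (9.33)
    return (ssf / (k - 1)) / (rss / (N - k)), k - 1, N - k

# Toy check: group means 2, 3, 4; SSF = 6, RSS = 6, so F = (6/2)/(6/6) = 3
F, df1, df2 = f_one_way([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(F, df1, df2)  # 3.0 2 6
```

The returned F is then compared with the critical value Fc = Fk − 1,N − k,α from Table A, exactly as in the decision rule above.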

We will use these concepts when we study the estimation of regression models in Chapter 13.

Example 9.12

Applying the One-Way ANOVA Test

A sample with 32 products is collected to analyze the quality of the honey supplied by three different suppliers. One of the ways to test the quality of the honey is finding out how much sucrose it contains, which usually varies between 0.25% and 6.5%. Table 9.E.15 shows the percentage of sucrose in the sample collected from each supplier. Check if there are differences in this quality indicator among the three suppliers, considering a 5% significance level.

Table 9.E.15

Percentage of Sucrose for the Three Suppliers
Supplier 1 (n1 = 12)   Supplier 2 (n2 = 10)   Supplier 3 (n3 = 10)
0.33                   1.54                   1.47
0.79                   1.11                   1.69
1.24                   0.97                   1.55
1.75                   2.57                   2.04
0.94                   2.94                   2.67
2.42                   3.44                   3.07
1.97                   3.02                   3.33
0.87                   3.55                   4.01
0.33                   2.04                   1.52
0.79                   1.67                   2.03
1.24
3.12
Ȳ1 = 1.316             Ȳ2 = 2.285             Ȳ3 = 2.338
S1 = 0.850             S2 = 0.948             S3 = 0.886

Solution

  • Step 1: In this case, the most suitable test is the one-way ANOVA.

First, we must verify the assumptions of normality for each group and of variance homogeneity between the groups through the Kolmogorov-Smirnov, Shapiro-Wilk, and Levene tests. Figs. 9.38 and 9.39 show the results obtained by using SPSS software.

Fig. 9.38
Fig. 9.38 Results of the tests for normality on SPSS.
Fig. 9.39
Fig. 9.39 Results of Levene’s test on SPSS.

Since the significance level observed in the tests for normality for each group and in the variance homogeneity test between the groups is greater than 5%, we can conclude that each one of the groups shows data with a normal distribution and that the variances between the groups are homogeneous, with a 95% confidence level. Since the assumptions of the one-way ANOVA were met, the technique can be applied.

  • Step 2: For this example, ANOVA’s null hypothesis states that there are no differences in the amount of sucrose coming from the three suppliers. If there is at least one supplier with a population mean that is different from the others, the null hypothesis will be rejected. Thus, we have:

H0: μ1 = μ2 = μ3

H1: ∃(i,j) μi ≠ μj, i ≠ j

  • Step 3: The significance level to be considered is 5%.
  • Step 4: The calculation of the Fcal statistic is specified here.

For this example, there are k = 3 groups and the global sample size is N = 32. The global sample mean is $\bar{Y} = 1.938$.

The sum of squares between groups (SSF) is:

\[ SSF = 12(1.316 - 1.938)^2 + 10(2.285 - 1.938)^2 + 10(2.338 - 1.938)^2 = 7.449 \]

Therefore, the mean square between groups (MSF) is:

\[ MSF = \frac{SSF}{k-1} = \frac{7.449}{2} = 3.725 \]

The calculation of the sum of squares within groups (RSS) is shown in Table 9.E.16.

Table 9.E.16

Calculation of the Sum of Squares Within Groups (RSS)

Supplier   Sucrose   Yij − Ȳi   (Yij − Ȳi)²
1          0.33      −0.986     0.972
1          0.79      −0.526     0.277
1          1.24      −0.076     0.006
1          1.75      0.434      0.189
1          0.94      −0.376     0.141
1          2.42      1.104      1.219
1          1.97      0.654      0.428
1          0.87      −0.446     0.199
1          0.33      −0.986     0.972
1          0.79      −0.526     0.277
1          1.24      −0.076     0.006
1          3.12      1.804      3.255
2          1.54      −0.745     0.555
2          1.11      −1.175     1.381
2          0.97      −1.315     1.729
2          2.57      0.285      0.081
2          2.94      0.655      0.429
2          3.44      1.155      1.334
2          3.02      0.735      0.540
2          3.55      1.265      1.600
2          2.04      −0.245     0.060
2          1.67      −0.615     0.378
3          1.47      −0.868     0.753
3          1.69      −0.648     0.420
3          1.55      −0.788     0.621
3          2.04      −0.298     0.089
3          2.67      0.332      0.110
3          3.07      0.732      0.536
3          3.33      0.992      0.984
3          4.01      1.672      2.796
3          1.52      −0.818     0.669
3          2.03      −0.308     0.095
RSS                             23.100

Therefore, the mean square within groups is:

\[ MSR = \frac{RSS}{N-k} = \frac{23.100}{29} = 0.797 \]

Thus, the value of the Fcal statistic is:

\[ F_{cal} = \frac{MSF}{MSR} = \frac{3.725}{0.797} = 4.676 \]

  • Step 5: According to Table A in the Appendix, the critical value of the statistic is Fc = F2, 29, 5% = 3.33.
  • Step 6: Decision: since the value calculated lies in the critical region (Fcal > Fc), we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is at least one supplier with a population mean that is different from the others.

If, instead of comparing the value calculated to the critical value of Snedecor’s F-distribution, we use the calculation of P-value, Steps 5 and 6 will be:

  • Step 5: According to Table A in the Appendix, for ν1 = 2 degrees of freedom in the numerator and ν2 = 29 degrees of freedom in the denominator, the probability associated to Fcal = 4.676 is between 0.01 and 0.025 (P-value).
  • Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
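Assuming SciPy is available, `scipy.stats.f_oneway` reproduces the hand calculation of Example 9.12:

```python
from scipy import stats

# Percentage of sucrose for the three suppliers (Table 9.E.15)
supplier1 = [0.33, 0.79, 1.24, 1.75, 0.94, 2.42, 1.97, 0.87, 0.33, 0.79, 1.24, 3.12]
supplier2 = [1.54, 1.11, 0.97, 2.57, 2.94, 3.44, 3.02, 3.55, 2.04, 1.67]
supplier3 = [1.47, 1.69, 1.55, 2.04, 2.67, 3.07, 3.33, 4.01, 1.52, 2.03]

# One-way ANOVA: H0 is that the three population means are equal
F, p = stats.f_oneway(supplier1, supplier2, supplier3)
print(round(F, 3), round(p, 3))  # F = 4.676, P ≈ 0.017
```

Since P < 0.05, the null hypothesis is rejected, in agreement with the analytical solution and with the SPSS and Stata outputs that follow.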

9.8.1.1 Solving the One-Way ANOVA Test by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©. The data in Example 9.12 are available in the file One_Way_ANOVA.sav. First of all, let’s click on Analyze → Compare Means → One-Way ANOVA …, as shown in Fig. 9.40.

Fig. 9.40
Fig. 9.40 Procedure for the one-way ANOVA.

Let's include the variable Sucrose in the list of dependent variables (Dependent List) and the variable Supplier in the box Factor, according to Fig. 9.41.

Fig. 9.41
Fig. 9.41 Selecting the variables.

After that, we must click on Options … and select the option Homogeneity of variance test (Levene’s test for variance homogeneity). Finally, let’s click on Continue and on OK to obtain the result of Levene’s test, besides the ANOVA table. Since the One-Way ANOVA dialog does not provide a normality test, it must be obtained by applying the same procedure described in Section 9.3.3.

According to Fig. 9.42, we can verify that each one of the groups has data that follow a normal distribution. Moreover, through Fig. 9.43, we can conclude that the variances between the groups are homogeneous.

Fig. 9.42
Fig. 9.42 Results of the tests for normality for Example 9.12 on SPSS.
Fig. 9.43
Fig. 9.43 Results of Levene’s test for Example 9.12 on SPSS.

From the ANOVA table (Fig. 9.44), we can see that the value of the F-test is 4.676 and the respective P-value is 0.017 (we saw in Example 9.12 that this value would be between 0.01 and 0.025), a value less than 0.05. This leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that at least one of the population means is different from the others (there are differences in the percentage of sucrose in the honey of the three suppliers).

Fig. 9.44
Fig. 9.44 Results of the one-way ANOVA for Example 9.12 on SPSS.

9.8.1.2 Solving the One-Way ANOVA Test by Using Stata Software

The use of the images in this section has been authorized by StataCorp LP©.

The one-way ANOVA on Stata is generated from the following syntax:

anova variabley⁎ factor⁎

in which the term variabley⁎ should be substituted for the quantitative dependent variable and the term factor⁎ for the qualitative explanatory variable.

The data in Example 9.12 are available in the file One_Way_Anova.dta. The quantitative dependent variable is called sucrose and the factor is represented by the variable supplier. Thus, we must type the following command:

anova sucrose supplier

The result of the test can be seen in Fig. 9.45. We can see that the calculated value of the statistic (4.68) is the same as the one calculated in Example 9.12 and generated on SPSS, as is the probability associated to the value of the statistic (0.017). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that at least one of the population means is different from the others.

Fig. 9.45
Fig. 9.45 Results of the one-way ANOVA on Stata.