Chapter 10

Nonparametric Tests

Abstract

This chapter studies the main types of nonparametric tests and identifies the situations in which each should be applied. Nonparametric tests are an alternative to parametric ones when the assumptions of the latter are violated or when the variables are qualitative. The main differences between parametric and nonparametric tests are presented in this chapter, as well as their respective advantages and disadvantages. The assumptions inherent to nonparametric hypothesis tests are also listed here. As a result, it is possible to identify when to use each one of the nonparametric tests. Each test is solved analytically and via IBM SPSS Statistics Software® and Stata Statistical Software®, and the results obtained are interpreted.

Keywords

Nonparametric tests; Binomial test; Chi-square test; Sign test; McNemar test; Wilcoxon test; Mann-Whitney U test; Cochran’s Q Test; Friedman’s test; Kruskal-Wallis test

Mathematics has wonderful strength that is capable of making us understand many mysteries of our faith.

Saint Jerome

10.1 Introduction

As studied in the previous chapter, hypothesis tests are divided into parametric and nonparametric. Applied to quantitative data, parametric tests formulate hypotheses about population parameters, such as the population mean (μ), the population standard deviation (σ), the population variance (σ²), and the population proportion (p).

Parametric tests require strong assumptions regarding the data distribution. For example, in many cases we must assume that the samples are collected from populations whose data follow a normal distribution. Moreover, for tests that compare two paired population means or k population means (k ≥ 3), the population variances must be homogeneous.

Conversely, nonparametric tests formulate hypotheses about qualitative characteristics of the population, so they can be applied to qualitative data measured on nominal or ordinal scales. Since their assumptions regarding the data distribution are fewer and weaker than those of parametric tests, they are also known as distribution-free tests.

Nonparametric tests are an alternative to parametric ones when the assumptions of the latter are violated. Given that they require fewer assumptions, they are simpler and easier to apply, but less robust when compared to parametric tests.

In short, the main advantages of nonparametric tests are:

  (a) They can be applied in a wide variety of situations, because they do not require strict premises concerning the population, as parametric methods do. Notably, nonparametric methods do not require the populations to have a normal distribution.
  (b) Differently from parametric methods, nonparametric methods can be applied to qualitative data, in nominal and ordinal scales.
  (c) They are easy to apply because they require simpler calculations when compared to parametric methods.

The main disadvantages are:

  (a) Quantitative data must be transformed into qualitative data for the application of nonparametric tests, which entails a considerable loss of information.
  (b) Since nonparametric tests are less efficient than parametric tests, we need stronger evidence (a larger sample or one with greater differences) to reject the null hypothesis.

Thus, since parametric tests are more powerful than nonparametric ones, that is, they have a higher probability of rejecting the null hypothesis when it is really false, they should be chosen whenever all of their assumptions are met. Nonparametric tests, in turn, are an alternative when those assumptions are violated or when the variables are qualitative.

Nonparametric tests are classified according to the variables' level of measurement and to sample size. For a single sample, we will study the binomial, chi-square (χ2), and sign tests. The binomial test is applied to binary variables; the χ2 test can be applied to nominal as well as ordinal variables, while the sign test is applied only to ordinal variables.

In the case of two paired samples, the main tests are the McNemar test, the sign test, and the Wilcoxon test. The McNemar test is applied to qualitative variables that assume only two categories (binary), while the sign test and the Wilcoxon test are applied to ordinal variables.

Considering two independent samples, we can highlight the χ2 test and the Mann-Whitney U test. The χ2 test can be applied to nominal or ordinal variables, while the Mann-Whitney U test only considers ordinal variables.

For k paired samples (k ≥ 3), we have Cochran’s Q test that considers binary variables and Friedman’s test that considers ordinal variables.

Finally, in the case of more than two independent samples, we will study the χ2 test for nominal or ordinal variables and the Kruskal-Wallis test for ordinal variables.

Table 10.1 shows this classification.

Table 10.1

Classification of Nonparametric Statistical Tests
Dimension                  Level of Measurement    Nonparametric Test
One sample                 Binary                  Binomial
                           Nominal or ordinal      χ2
                           Ordinal                 Sign test
Two paired samples         Binary                  McNemar test
                           Ordinal                 Sign test
                           Ordinal                 Wilcoxon test
Two independent samples    Nominal or ordinal      χ2
                           Ordinal                 Mann-Whitney U
K paired samples           Binary                  Cochran's Q
                           Ordinal                 Friedman's test
K independent samples      Nominal or ordinal      χ2
                           Ordinal                 Kruskal-Wallis test


Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro.

Nonparametric tests whose level of measurement is ordinal can also be applied to quantitative variables; in these cases, however, they should be used only when the assumptions of the corresponding parametric tests are violated.

10.2 Tests for One Sample

In this case, a random sample is taken from the population and we test the hypothesis that the sample data have a certain characteristic or distribution. Among the nonparametric statistical tests for a single sample, we can highlight the binomial test, the χ2 test, and the sign test. The binomial test is applied to binary data, the χ2 test to nominal or ordinal data, while the sign test is applied to ordinal data.

10.2.1 Binomial Test

The binomial test is applied to an independent sample in which the variable of interest (X) is binary (a dummy) or dichotomous, that is, it has only two possible outcomes: success or failure. By convention, we call the result X = 1 a success and the result X = 0 a failure. The probability of success for a given observation is represented by p and the probability of failure by q, that is:

P(X = 1) = p and P(X = 0) = q = 1 − p

For a bilateral test, we must consider the following hypotheses:

  • H0: p = p0
  • H1: p ≠ p0

According to Siegel and Castellan (2006), the number of successes (Y), that is, the number of results of type [X = 1] in a sequence of N observations, is:

Y = \sum_{i=1}^{N} X_{i}

For the authors, in a sample of size N, the probability of obtaining k objects in a category and N − k objects in the other category is given by:

P(Y = k) = \binom{N}{k} p^{k} q^{N-k}, \quad k = 0, 1, \ldots, N  (10.1)

where:

  • p: probability of success;
  • q: probability of failure;

and

\binom{N}{k} = \frac{N!}{k!\,(N-k)!}

Table F1 in the Appendix provides the probability P[Y = k] for several values of N, k, and p.

However, when we test hypotheses, we must use the probability of obtaining values that are greater than or equal to the value observed:

P(Y \geq k) = \sum_{i=k}^{N} \binom{N}{i} p^{i} q^{N-i}  (10.2)

Or the probability of obtaining values that are less than or equal to the value observed:

P(Y \leq k) = \sum_{i=0}^{k} \binom{N}{i} p^{i} q^{N-i}  (10.3)

According to Siegel and Castellan (2006), when p = q = ½, instead of calculating the probabilities based on the expressions presented, it is more convenient to use Table F2 in the Appendix. This table provides the unilateral probabilities, under the null hypothesis H0: p = 1/2, of obtaining values that are as extreme as or more extreme than k, where k is the lowest of the frequencies observed (P(Y ≤ k)). Due to the symmetry of a binomial distribution, when p = ½, we have P(Y ≥ k) = P(Y ≤ N − k). A unilateral test is used when we predict, in advance, which of both categories must contain the smallest number of cases. For a bilateral test (when the estimate simply refers to the fact that both frequencies will differ), we just need to double the values from Table F2 in the Appendix.

This final value is called the P-value which, as discussed in Chapter 9, corresponds to the probability (unilateral or bilateral) associated to the value observed in the sample. The P-value indicates the lowest significance level at which the null hypothesis would be rejected. Thus, we reject H0 if P ≤ α.

In the case of large samples (N > 25), the sampling distribution of the variable Y approaches a normal distribution, so the probability can be calculated through the following statistic:

Z_{cal} = \frac{N\hat{p} - Np \pm 0.5}{\sqrt{Npq}}  (10.4)

where \hat{p} refers to the sample estimate of the proportion of successes used to test H0. The term ±0.5 is a continuity correction: we use +0.5 when N\hat{p} < Np and −0.5 when N\hat{p} > Np.

The value of Zcal calculated by using Expression (10.4) must be compared to the critical value of the standard normal distribution (see Table E in the Appendix). This table provides the critical values of zc where P(Zcal > zc) = α (for a right-tailed unilateral test). For a bilateral test, we have P(Zcal < − zc) = α/2 = P(Zcal > zc).

Therefore, for a right-tailed unilateral test, the null hypothesis is rejected if Zcal > zc. Now, for a bilateral test, we reject H0 if Zcal < − zc or Zcal > zc.
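
The calculations above can also be cross-checked computationally. The sketch below is our illustration, not part of the original text; it assumes Python with SciPy installed and reproduces the exact tail probabilities of Expressions (10.2) and (10.3) and the approximate statistic of Expression (10.4):

import math
from scipy import stats

def binomial_test_summary(k, N, p0=0.5):
    # Exact tail probabilities, Expressions (10.2) and (10.3)
    lower = stats.binom.cdf(k, N, p0)      # P(Y <= k)
    upper = stats.binom.sf(k - 1, N, p0)   # P(Y >= k)
    # Large-sample statistic with continuity correction, Expression (10.4)
    p_hat = k / N
    correction = -0.5 if N * p_hat > N * p0 else 0.5
    z = (N * p_hat - N * p0 + correction) / math.sqrt(N * p0 * (1 - p0))
    return lower, upper, z

print(binomial_test_summary(7, 18))  # P(Y <= 7) for N = 18 is about 0.240

For instance, binomial_test_summary(7, 18) returns P(Y ≤ 7) ≈ 0.240, the unilateral probability used in Example 10.1 below.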

Example 10.1

Applying the Binomial Test to Small Samples

A group of 18 students took an intensive English course and were submitted to two different learning methods. At the end of the course, each student chose his/her favorite teaching method, as shown in Table 10.E.1. We believe there are no differences between both teaching methods. Test the null hypothesis with a significance level of 5%.

Table 10.E.1

Frequencies Obtained After Students Made Their Choice
Events       Method 1   Method 2   Total
Frequency    11         7          18
Proportion   0.611      0.389      1.0


Solution

Before we start the general procedure to construct the hypothesis test, we will define a few parameters in order to facilitate understanding.

Coding method 1 as X = 1 and method 2 as X = 0, the probability of choosing method 1 is represented by P[X = 1] = p and that of method 2 by P[X = 0] = q. The number of successes (Y = k) corresponds to the total number of results of type X = 1, so k = 11.

  • Step 1: The most suitable test in this case is the binomial test because the data are categorized into two classes.
  • Step 2: The null hypothesis states that there are no differences in the probabilities of choosing between both methods:

H0: p = q = ½

H1: p ≠ q

  • Step 3: The significance level to be considered is 5%.
  • Step 4: We have N = 18, k = 11, p = ½, and q = ½. Due to the symmetry of the binomial distribution, when p = ½, P(Y ≥ k) = P(Y ≤ N − k), that is, P(Y ≥ 11) = P(Y ≤ 7). So, let’s calculate P(Y ≤ 7) by using Expression (10.3) and show how this probability can be obtained directly from Table F2 in the Appendix.

The probability of a maximum of seven students choosing method 2 is given by:

P(Y \leq 7) = P(Y = 0) + P(Y = 1) + \cdots + P(Y = 7)

P(Y = 0) = \frac{18!}{0!\,18!} \left(\frac{1}{2}\right)^{0} \left(\frac{1}{2}\right)^{18} = 3.815 \times 10^{-6}

P(Y = 1) = \frac{18!}{1!\,17!} \left(\frac{1}{2}\right)^{1} \left(\frac{1}{2}\right)^{17} = 6.866 \times 10^{-5}

⋮

P(Y = 7) = \frac{18!}{7!\,11!} \left(\frac{1}{2}\right)^{7} \left(\frac{1}{2}\right)^{11} = 0.121

Therefore:

P(Y \leq 7) = 3.815 \times 10^{-6} + \cdots + 0.121 = 0.240

Since p = ½, probability P(Y ≤ 7) could be obtained directly from Table F2 in the Appendix. For N = 18 and k = 7 (the lowest frequency observed), the associated unilateral probability is P1 = 0.240.

Since it is a bilateral test, this value must be doubled (P = 2P1), so, the associated bilateral probability is P = 0.480.

Note: In the general procedure of hypotheses tests, Step 4 corresponds to the calculation of the statistic based on the sample. On the other hand, Step 5 determines the probability associated to the value of the statistic obtained from Step 4. In the case of the binomial test, Step 4 calculates the probability associated to the occurrence in the sample directly.

  • Step 5: Decision: since the associated probability is greater than α (P = 0.480 > 0.05), we do not reject H0, which allows us to conclude, with a 95% confidence level, that there are no differences in the probabilities of choosing method 1 or 2.
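
As a cross-check of Example 10.1 (our addition, not part of the book's procedure), the same bilateral probability can be obtained with SciPy, assuming a version (1.7 or later) that provides scipy.stats.binomtest:

from scipy import stats

# Exact bilateral binomial test: k = 11 choices of method 1 out of N = 18, p0 = 0.5
result = stats.binomtest(11, n=18, p=0.5, alternative='two-sided')
print(result.pvalue)  # approximately 0.481, i.e., 2 x 0.240 up to rounding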

Example 10.2

Applying the Binomial Test to Large Samples

Redo the previous example considering the following results:

Table 10.E.2

Frequencies Obtained After Students Made Their Choice
Events       Method 1   Method 2   Total
Frequency    18         12         30
Proportion   0.6        0.4        1.0


Solution

  • Step 1: Let’s apply the binomial test.
  • Step 2: The null hypothesis states that there are no differences between the probabilities of choosing both methods, that is:

H0: p = q = ½

H1: p ≠ q

  • Step 3: The significance level to be considered is 5%.
  • Step 4: Since N > 25, we can consider that the sampling distribution of Y approaches a normal distribution, so the probability can be calculated from the Z statistic:

Z_{cal} = \frac{N\hat{p} - Np - 0.5}{\sqrt{Npq}} = \frac{30 \times 0.6 - 30 \times 0.5 - 0.5}{\sqrt{30 \times 0.5 \times 0.5}} = 0.913

  • Step 5: The critical region of a standard normal distribution (Table E in the Appendix), for a bilateral test in which α = 5%, is shown in Fig. 10.1.
    Fig. 10.1
    Fig. 10.1 Critical region of Example 10.2.

For a bilateral test, each one of the tails corresponds to half of significance level α.

  • Step 6: Decision: since the value calculated is not in the critical region, that is, − 1.96 ≤ Zcal ≤ 1.96, the null hypothesis is not rejected, which allows us to conclude, with a 95% confidence level, that there are no differences in the probabilities of choosing between the methods (p = q = ½).

If we used P-value instead of the critical value of the statistic, Steps 5 and 6 would be:

  • Step 5: According to Table E in the Appendix, the unilateral probability associated to the statistic Zcal = 0.913 is P1 = 0.1806. For a bilateral test, this probability must be doubled (P-value = 0.3612).
  • Step 6: Decision: since P > 0.05, we do not reject H0.
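
Both steps can be replicated in Python (a sketch we add here, assuming SciPy is available); the last line also computes the exact bilateral probability for comparison:

import math
from scipy import stats

N, k, p0 = 30, 18, 0.5
z = (k - N * p0 - 0.5) / math.sqrt(N * p0 * (1 - p0))  # Expression (10.4)
print(round(z, 3))                           # 0.913
print(round(2 * stats.norm.sf(z), 4))        # bilateral P-value, approximately 0.361
print(stats.binomtest(k, n=N, p=p0).pvalue)  # exact test gives approximately 0.362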

10.2.1.1 Solving the Binomial Test Using SPSS Software

Example 10.1 will be solved using IBM SPSS Statistics Software®. The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data are available in the file Binomial_Test.sav. The procedure for solving the binomial test using SPSS is described. Let’s select Analyze → Nonparametric Tests → Legacy Dialogs → Binomial … (Fig. 10.2).

Fig. 10.2
Fig. 10.2 Procedure for applying the binomial test on SPSS.

First, let’s insert variable Method into the Test Variable List. In Test Proportion, we must define p = 0.50, since the probability of success and failure is the same (Fig. 10.3).

Fig. 10.3
Fig. 10.3 Selecting the variable and the proportion for the binomial test.

Finally, let’s click on OK. The results can be seen in Fig. 10.4.

Fig. 10.4
Fig. 10.4 Results of the binomial test.

The associated probability for a bilateral test is P = 0.481, similar to the value calculated in Example 10.1. Since P > α (0.481 > 0.05), we do not reject H0, which allows us to conclude, with a 95% confidence level, that p = q = ½.

10.2.1.2 Solving the Binomial Test Using Stata Software

Example 10.1 will also be solved using Stata Statistical Software®. The use of the images presented in this section has been authorized by Stata Corp LP©. The data are available in the file Binomial_Test.dta.

The syntax of the binomial test on Stata is:

bitest variable⁎ = #p

where the term variable⁎ must be replaced by the variable considered in the analysis and #p by the probability of success specified in the null hypothesis.

In Example 10.1, our studied variable is method and, through the null hypothesis, there are no differences in the choice between both methods, so, the command to be typed is:

bitest method = 0.5

The result of the binomial test is shown in Fig. 10.5. We can see that the associated probability for a bilateral test is P = 0.481, similar to the value calculated in Example 10.1, and also obtained via SPSS software. Since P > 0.05, we do not reject H0, which allows us to conclude, with a 95% confidence level, that p = q = ½.

Fig. 10.5
Fig. 10.5 Results of the binomial test for Example 10.1 on Stata.

10.2.2 Chi-Square Test (χ2) for One Sample

The χ2 test presented in this section is an extension of the binomial test and is applied to a single sample in which the variable being studied assumes two or more categories. The variables can be nominal or ordinal. The test compares the frequencies observed to the frequencies expected in each category.

The χ2 test assumes the following hypotheses:

  • H0: there is no significant difference between the frequencies observed and the ones expected
  • H1: there is a significant difference between the frequencies observed and the ones expected

The statistic for the test, analogous to Expression (4.1) in Chapter 4, is given by:

\chi_{cal}^{2} = \sum_{i=1}^{k} \frac{(O_{i} - E_{i})^{2}}{E_{i}}  (10.5)

where:

  • Oi: the number of observations in the ith category;
  • Ei: the expected frequency of observations in the ith category under H0;
  • k: the number of categories.

The values of χ2cal approximately follow a χ2 distribution with ν = k − 1 degrees of freedom. The critical values of the chi-square statistic (χ2c) can be found in Table D in the Appendix, which provides the values of χ2c such that P(χ2cal > χ2c) = α (for a right-tailed unilateral test). In order for the null hypothesis H0 to be rejected, the value of the χ2cal statistic must be in the critical region (CR), that is, χ2cal > χ2c. Otherwise, we do not reject H0 (Fig. 10.6).

Fig. 10.6
Fig. 10.6 χ2 distribution, highlighting critical region (CR) and nonrejection of H0 (NR) region.

The P-value (the probability associated to the value of the χ2cal statistic calculated from the sample) can also be obtained from Table D. In this case, we reject H0 if P ≤ α.

Example 10.3

Applying the χ2 Test to One Sample

A candy store would like to find out if the number of chocolate candies sold daily varies depending on the day of the week. In order to do that, a sample was collected throughout 1 week, chosen randomly, and the results can be seen in Table 10.E.3. Test the hypothesis that sales do not depend on the day of the week. Assume that α = 5%.

Table 10.E.3

Frequencies Observed Versus Frequencies Expected
Events                  Sunday   Monday   Tuesday   Wednesday   Thursday   Friday   Saturday
Frequencies observed    35       24       27        32          25         36       31
Frequencies expected    30       30       30        30          30         30       30


Solution

  • Step 1: The most suitable test to compare the frequencies observed to the ones expected from one sample with more than two categories is the χ2 for a single sample.
  • Step 2: Through the null hypothesis, there are no significant differences between the sales observed and the ones expected for each day of the week. On the other hand, through the alternative hypothesis, there is a difference in at least one day of the week:

H0: Oi = Ei

H1: Oi ≠ Ei

  • Step 3: The significance level to be considered is 5%.
  • Step 4: The value of the statistic is given by:

    \chi_{cal}^{2} = \sum_{i=1}^{k} \frac{(O_{i} - E_{i})^{2}}{E_{i}} = \frac{(35 - 30)^{2}}{30} + \frac{(24 - 30)^{2}}{30} + \cdots + \frac{(31 - 30)^{2}}{30} = 4.533

  • Step 5: The critical region of the χ2 test, considering α = 5% and ν = 6 degrees of freedom, is shown in Fig. 10.7.
    Fig. 10.7
    Fig. 10.7 Critical Region of Example 10.3.
  • Step 6: Decision: since the value calculated is not in the critical region, that is, χ2cal < 12.592, the null hypothesis is not rejected, which allows us to conclude, with a 95% confidence level, that the number of chocolate candies sold daily does not vary depending on the day of the week.

If we use P-value instead of the critical value of the statistic, Steps 5 and 6 of the construction of the hypotheses tests will be:

  • Step 5: According to Table D in the Appendix, for ν = 6 degrees of freedom, the probability associated to the statistic χ2cal = 4.533 (P-value) is between 0.1 and 0.9.
  • Step 6: Decision: since P > 0.05, we do not reject the null hypothesis.
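
The same result can be reproduced with SciPy (our illustration; scipy.stats.chisquare uses equal expected frequencies by default, which matches this example):

from scipy import stats

observed = [35, 24, 27, 32, 25, 36, 31]        # Table 10.E.3
stat, p_value = stats.chisquare(observed)      # expected = 30 for each day
print(round(stat, 3), round(p_value, 3))       # 4.533 and 0.605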

10.2.2.1 Solving the χ2 Test for One Sample Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data in Example 10.3 are available in the file Chi-Square_One_Sample.sav. The procedure for applying the χ2 test on SPSS is described. First, let’s click on Analyze → Nonparametric Tests → Legacy Dialogs → Chi-Square …, as shown in Fig. 10.8.

Fig. 10.8
Fig. 10.8 Procedure for elaborating the χ2 test on SPSS.

After that, we should insert the variable Day_week into the Test Variable List. The variable being studied has seven categories. The options Get from data and Use specified range (Lower = 1 and Upper = 7) in Expected Range generate the same results. The frequencies expected for the seven categories are exactly the same. Thus, we must select the option All categories equal in Expected Values, as shown in Fig. 10.9.

Fig. 10.9
Fig. 10.9 Selecting the variable and the procedure to elaborate the χ2 test.

Finally, let’s click on OK to obtain the results of the χ2 test, as shown in Fig. 10.10.

Fig. 10.10
Fig. 10.10 Results of the χ2 test for Example 10.3 on SPSS.

Therefore, the value of the χ2 statistic is 4.533, similar to the value calculated in Example 10.3. Since the P-value = 0.605 > 0.05 (in Example 10.3, we saw that 0.1 < P < 0.9), we do not reject H0, which allows us to conclude, with a 95% confidence level, that the sales do not depend on the day of the week.

10.2.2.2 Solving the χ2 Test for One Sample Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

The data in Example 10.3 are available in the file Chi-Square_One_Sample.dta. The variable being studied is day_week.

The χ2 test for one sample on Stata can be obtained from the command csgof (chi-square goodness of fit), which allows us to compare the distribution of frequencies observed to the ones expected of a certain categorical variable with more than two categories.

In order for this command to be used, first, we must type:

findit csgof

and install it through the link csgof from http://www.ats.ucla.edu/stat/stata/ado/analysis.

After doing this, we can type the following command:

csgof day_week

The result is shown in Fig. 10.11. We can see that the result of the test is similar to the one calculated in Example 10.3 and on SPSS, as well as to the probability associated to the statistic.

Fig. 10.11
Fig. 10.11 Results of the χ2 test for Example 10.3 on Stata.

10.2.3 Sign Test for One Sample

The sign test is an alternative to the t-test for a single random sample when the data distribution of the population does not follow a normal distribution. The only assumption required by the sign test is that the distribution of the variable be continuous.

The sign test is based on the population median (μ). The probability of obtaining a sample value that is less than the median and the probability of obtaining a sample value that is greater than the median are the same (p = ½). The null hypothesis of the test is that μ is equal to a certain value specified by the investigator (μ0). For a bilateral test, we have:

  • H0: μ = μ0
  • H1: μ ≠ μ0

The quantitative data are converted into signs, (+) or (−): values greater than the median (μ0) are represented by (+) and values less than μ0 by (−). Data with values equal to μ0 are excluded from the sample. Thus, the sign test is applied to ordinal data and offers little power to the researcher, since this conversion results in a considerable loss of information regarding the original data.

  • Small samples

Let’s establish that N is the number of positive and negative signs (sample size disregarding any ties) and k is the number of signs that corresponds to the lowest frequency.

For small samples (N ≤ 25), we will use the binomial test with p = ½ to calculate P(Y ≤ k). This probability can be obtained directly from Table F2 in the Appendix.

  • Large samples

When N > 25, the binomial distribution approaches a normal distribution. The value of Z is given by:

Z = \frac{(X \pm 0.5) - N/2}{0.5\sqrt{N}} \sim N(0, 1)  (10.6)

where X corresponds to the lowest or highest frequency. If X represents the lowest frequency, we must calculate X + 0.5. On the other hand, if X represents the highest frequency, we must calculate X − 0.5.

Example 10.4

Applying the Sign Test to a Single Sample

We estimate that the median retirement age in a certain Brazilian city is 65. One random sample with 20 retirees was drawn from the population and the results can be seen in Table 10.E.4. Test the null hypothesis that μ = 65, at the significance level of 10%.

Table 10.E.4

Retirement Age
59   62   66   37   60   64   66   70   72   61
64   66   68   72   78   93   79   65   67   59

Solution

  • Step 1: Since the data do not follow a normal distribution, the most suitable test for testing the population median is the sign test.
  • Step 2: The hypotheses of the test are:

H0: μ = 65

H1: μ ≠ 65

  • Step 3: The significance level to be considered is 10%.
  • Step 4: Let’s calculate P(Y ≤ k).

To facilitate our understanding, let’s sort the data in Table 10.E.4 in ascending order.

Table 10.E.5

Data From Table 10.E.4 Sorted in Ascending Order
37   59   59   60   61   62   64   64   65   66
66   66   67   68   70   72   72   78   79   93

Excluding the value 65 (a tie), the number of (−) signs is 8, the number of (+) signs is 11, and N = 19.

From Table F2 in the Appendix, for N = 19, k = 8, and p = ½, the associated unilateral probability is P1 = 0.324. Since we are using a bilateral test, this value must be doubled, so, the associated bilateral probability is 0.648 (P-value).

  • Step 5: Decision: since P > α (0.648 > 0.10), we do not reject H0, a fact that allows us to conclude, with a 90% confidence level, that μ = 65.
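
Since the sign test for one sample reduces to a binomial test with p = ½ on the nonzero signs, Example 10.4 can be checked with a short Python sketch of ours (assuming SciPy ≥ 1.7 for binomtest):

from scipy import stats

ages = [59, 62, 66, 37, 60, 64, 66, 70, 72, 61,
        64, 66, 68, 72, 78, 93, 79, 65, 67, 59]
median0 = 65
neg = sum(a < median0 for a in ages)   # 8 values below the median tested
pos = sum(a > median0 for a in ages)   # 11 values above; ties are dropped
result = stats.binomtest(min(neg, pos), n=neg + pos, p=0.5)
print(round(result.pvalue, 3))         # approximately 0.648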

10.2.3.1 Solving the Sign Test for One Sample Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

SPSS makes the sign test available only for two related samples (2 Related Samples). Thus, in order for us to use the test for a single sample, we must generate a new variable with n values (sample size including ties), all of them equal to μ0. The data in Example 10.4 are available in the file Sign_Test_One_Sample.sav.

The procedure for applying the sign test on SPSS is shown. First of all, we must click on Analyze → Nonparametric Tests → Legacy Dialogs → 2 Related Samples …, as shown in Fig. 10.12.

Fig. 10.12
Fig. 10.12 Procedure for elaborating the sign test on SPSS.

After that, we must insert variable 1 (Age_pop) and variable 2 (Age_sample) into Test Pairs. Let’s select the option regarding the sign test (Sign) in Test Type, as shown in Fig. 10.13.

Fig. 10.13
Fig. 10.13 Selecting the variables and the sign test.

Next, let’s click on OK to obtain the results of the sign test, as shown in Figs. 10.14 and 10.15.

Fig. 10.14
Fig. 10.14 Frequencies observed.
Fig. 10.15
Fig. 10.15 Sign test for Example 10.4 on SPSS.

Fig. 10.14 shows the frequencies of negative and positive signs, the total number of ties, and the total frequency.

Fig. 10.15 shows the associated probability for a bilateral test, which is similar to the value found in Example 10.4. Since P = 0.648 > 0.10, we do not reject the null hypothesis, which allows us to conclude, with a 90% confidence level, that the median retirement age is 65.

10.2.3.2 Solving the Sign Test for One Sample Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

Unlike SPSS, Stata makes the sign test for one sample directly available. On Stata, the sign test for a single sample as well as for two paired samples is obtained through the command signtest.

The syntax of the test for one sample is:

signtest variable⁎ = #

where the term variable⁎ must be replaced by the variable considered in the analysis and # by the value of the population median to be tested.

The data in Example 10.4 are available in the file Sign_Test_One_Sample.dta. The variable analyzed is age and the main objective is to verify if the median retirement age is 65. The command to be typed is:

signtest age = 65

The result of the test is shown in Fig. 10.16. Analogous to the results presented in Example 10.4 and also generated on SPSS, the number of positive signs is 11, the number of negative signs is 8, and the associated probability for a bilateral test is 0.648. Since P > 0.10, we do not reject the null hypothesis, which allows us to conclude, with a 90% confidence level, that the median retirement age is 65.

Fig. 10.16
Fig. 10.16 Results of the sign test for Example 10.4 on Stata.

10.3 Tests for Two Paired Samples

These tests compare two related samples; the most common examples analyze a situation before and after a certain event. We will study the following tests: the McNemar test for binary variables, and the sign test and the Wilcoxon test for ordinal variables.

10.3.1 McNemar Test

The McNemar test is applied to assess the significance of changes in two related samples with qualitative or categorical variables that assume only two categories (binary variables). The main goal of the test is to verify if there are any significant changes before and after the occurrence of a certain event. In order to do that, let’s use a 2 × 2 contingency table, as shown in Table 10.2.

Table 10.2

2 × 2 Contingency Table
                After
Before          +         −
+               A         B
−               C         D


According to Siegel and Castellan (2006), the + and − signs are used to represent the possible changes in the answers before and after. The frequencies of each occurrence are represented in their respective cells in Table 10.2.

For example, if there are changes from the first answer (+) to the second answer (−), the result will be written in the right upper cell, so, B represents the total number of observations that presented changes in their behavior from (+) to (−).

Analogously, if there are changes from the first answer (−) to the second answer (+), the result will be written in the left lower cell, so, C represents the total number of observations that presented changes in their behavior from (−) to (+).

On the other hand, while A represents the total number of observations that remained with the same answer (+) before and after, D represents the total number of observations with the same answer (−) in both periods.

Thus, the total number of individuals that change their answer can be represented by B + C.

Under the null hypothesis of the test, changes in the two directions are equally likely, that is:

  • H0: P(B → C) = P(C → B)
  • H1: P(B → C) ≠ P(C → B)

According to Siegel and Castellan (2006), McNemar statistic is calculated based on the chi-square (χ2) statistic presented in Expression (10.5), that is:

\chi_{cal}^{2} = \sum_{i=1}^{2} \frac{(O_{i} - E_{i})^{2}}{E_{i}} = \frac{\left[B - (B + C)/2\right]^{2}}{(B + C)/2} + \frac{\left[C - (B + C)/2\right]^{2}}{(B + C)/2} = \frac{(B - C)^{2}}{B + C} \sim \chi_{1}^{2}  (10.7)

According to the same authors, a correction factor must be used in order for a continuous χ2 distribution to become more similar to a discrete χ2 distribution, so:

\chi_{cal}^{2} = \frac{(|B - C| - 1)^{2}}{B + C}, \quad \text{with 1 degree of freedom}  (10.8)

The value calculated must be compared to the critical value of the χ2 distribution (Table D in the Appendix). This table provides the critical values of χ2c such that P(χ2cal > χ2c) = α (for a right-tailed unilateral test). If the value of the statistic is in the critical region, that is, if χ2cal > χ2c, we reject H0. Otherwise, we do not reject H0.

The probability associated to the χ2cal statistic (P-value) can also be obtained from Table D. In this case, the null hypothesis is rejected if P ≤ α. Otherwise, we do not reject H0.

Example 10.5

Applying the McNemar Test

A bill of law proposing the end of full retirement pensions for federal civil servants was being analyzed by the Senate. Aiming at verifying if this measure would bring any changes in the number of people taking public exams, an interview with 60 workers was carried out, before and after the reform, so that they could express their preference in working for a private or a public organization. The results can be seen in Table 10.E.6. Test the hypothesis that there were no significant changes in the workers’ answers before and after the social security reform. Assume that α = 5%.

Table 10.E.6

Contingency Table
Before the Reform     After the Reform
                      Private      Public
Private               22           3
Public                21           14


Solution

  • Step 1: McNemar is the most suitable test for evaluating the significance of before and after type changes in two related samples, applied to nominal or categorical variables.
  • Step 2: Through the null hypothesis, the reform would not be efficient in changing people’s preferences towards the private sector. In other words, among the workers who changed their preferences, the probability of them changing their preference from private to public organizations after the reform is the same as the probability of them changing from public to private organizations. That is:

H0: P(Private → Public) = P(Public → Private)

H1: P(Private → Public) ≠ P(Public → Private)

  • Step 3: The significance level to be considered is 5%.
  • Step 4: The value of the statistic, according to Expression (10.7), is:

\chi_{cal}^{2} = \frac{(B - C)^{2}}{B + C} = \frac{(3 - 21)^{2}}{3 + 21} = 13.5, \quad \text{with } \nu = 1

If we use the correction factor, the value of the statistic from Expression (10.8) becomes:

\chi_{cal}^{2} = \frac{(|B - C| - 1)^{2}}{B + C} = \frac{(|3 - 21| - 1)^{2}}{3 + 21} = 12.042, \quad \text{with } \nu = 1

  • Step 5: The critical value of the chi-square statistic (χ2c) obtained from Table D in the Appendix, considering α = 5% and ν = 1 degree of freedom, is 3.841.
  • Step 6: Decision: since the value calculated is in the critical region, that is, χ2cal > 3.841, we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there were significant changes in the choice of working at a private or a public organization after the social security reform.

If we use P-value instead of the critical value of the statistic, Steps 5 and 6 will be:

  • Step 5: According to Table D in the Appendix, for ν = 1 degree of freedom, the probability associated to the statistic χ2cal = 12.042 or 13.5 (P-value) is less than 0.005 (a probability of 0.005 corresponds to χ2c = 7.879).
  • Step 6: Decision: since P < 0.05, we must reject H0.
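
Expressions (10.7) and (10.8) are simple enough to evaluate directly; the sketch below is our addition (it assumes SciPy for the χ2 tail probability) and reproduces the statistics and P-values of Example 10.5:

from scipy import stats

B, C = 3, 21  # discordant frequencies from Table 10.E.6
chi2_uncorrected = (B - C) ** 2 / (B + C)         # Expression (10.7): 13.5
chi2_corrected = (abs(B - C) - 1) ** 2 / (B + C)  # Expression (10.8): 12.042
print(stats.chi2.sf(chi2_uncorrected, df=1))      # approximately 0.0002
print(stats.chi2.sf(chi2_corrected, df=1))        # approximately 0.0005

Both P-values are far below 0.05, in line with the decision to reject H0.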

10.3.1.1 Solving the McNemar Test Using SPSS Software

Example 10.5 will be solved using SPSS software. The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data are available in the file McNemar_Test.sav. The procedure for applying the McNemar test on SPSS is presented. Let’s click on Analyze → Nonparametric Tests → Legacy Dialogs → 2 Related Samples …, as shown in Fig. 10.17.

Fig. 10.17
Fig. 10.17 Procedure for elaborating the McNemar test on SPSS.

After that, we should insert variable 1 (Before) and variable 2 (After) into Test Pairs. Let’s select the McNemar test option in Test Type, as shown in Fig. 10.18.

Fig. 10.18
Fig. 10.18 Selecting the variables and McNemar test.

Finally, we must click on OK to obtain Figs. 10.19 and 10.20. Fig. 10.19 shows the frequencies observed before and after the reform (Contingency Table). The result of the McNemar test is shown in Fig. 10.20.

Fig. 10.19
Fig. 10.19 Frequencies observed.
Fig. 10.20
Fig. 10.20 McNemar Test for Example 10.5 on SPSS.

According to Fig. 10.20, the significance level observed in the McNemar test is 0.000, a value lower than 0.05, so the null hypothesis is rejected. Hence, we may conclude, with a 95% confidence level, that there was a significant change in choosing to work at a public or a private organization after the social security reform.

10.3.1.2 Solving the McNemar Test Using Stata Software

Example 10.5 will also be solved using Stata software. The use of the images presented in this section has been authorized by Stata Corp LP©. The data are available in the file McNemar_Test.dta.

The McNemar test can be calculated on Stata by using the command mcc followed by the paired variables. In our example, the paired variables are called before and after, so, the command to be typed is:

mcc before after

The result of the McNemar test is shown in Fig. 10.21. We can see that the value of the statistic is 13.5, similar to the value calculated by Expression (10.7), without the correction factor. The significance level observed from the test is 0.000, lower than 5%, which allows us to conclude, with a 95% confidence level, that there was a significant change before and after the reform.

Fig. 10.21
Fig. 10.21 Results of the McNemar test for Example 10.5 on Stata.

The result of the McNemar test could have also been obtained by using the command mcci 14 21 3 22.

10.3.2 Sign Test for Two Paired Samples

The sign test can also be applied to two paired samples. In this case, the sign is given by the difference between the pairs, that is, if the difference results in a positive number, each pair of values is replaced by a (+) sign. On the other hand, if the result of the difference is negative, each pair of values is replaced by a (−) sign. In case of a tie, the data will be excluded from the sample.

Analogous to the sign test for a single sample, the sign test presented in this section is also an alternative to the t-test for comparing two related samples when the data distribution is not normal. In this case, the quantitative data are transformed into ordinal data. Thus, the sign test is much less powerful than the t-test, because it only uses the difference sign between the pairs as information.

Through the null hypothesis, the population median of the differences (μd) is zero. Therefore, for a bilateral test, we have:

  • H0: μd = 0
  • H1: μd ≠ 0

In other words, we test the hypothesis that there are no differences between both samples (the samples come from populations with the same median and the same continuous distribution), that is, that the number of (+) signs is the same as the number of (−) signs.

The same procedure presented in Section 10.2.3 for a single sample will be used in order to calculate the sign statistic in the case of two paired samples.

  • Small samples

We say that N is the number of positive and negative signs (sample size disregarding the ties) and k is the number of signs that corresponds to the lowest frequency. If N ≤ 25, we will use the binomial test with p = ½ to calculate P(Y ≤ k). This probability can be obtained directly from Table F2 in the Appendix.

  • Large samples

When N > 25, the binomial distribution approaches a normal distribution, and the value of Z is given by Expression (10.6):

Z = \frac{(X \pm 0.5) - N/2}{0.5\sqrt{N}} \sim N(0, 1)

where X corresponds to the lowest or highest frequency. If X represents the lowest frequency, we must use X + 0.5. On the other hand, if X represents the highest frequency, we must use X − 0.5.

Example 10.6

Applying the Sign Test to Two Paired Samples

A group of 30 workers was submitted to a training course aimed at improving their productivity. The results, in terms of the average number of parts produced per hour per employee before and after the training, are shown in Table 10.E.7. Test the null hypothesis that there were no changes in productivity before and after the training course. Assume that α = 5%.

Table 10.E.7

Productivity Before and After the Training Course
Before   After   Difference Sign
36       40      +
39       41      +
27       29      +
41       45      +
40       39      −
44       42      −
38       39      +
42       40      −
40       42      +
43       45      +
37       35      −
41       40      −
38       38      0
45       43      −
40       40      0
39       42      +
38       41      +
39       39      0
41       40      −
36       38      +
38       36      −
40       38      −
36       35      −
40       42      +
40       41      +
38       40      +
37       39      +
40       42      +
38       36      −
40       40      0

Solution

  • Step 1: Since the data do not follow a normal distribution, the sign test can be an alternative to the t-test for two paired samples.
  • Step 2: The null hypothesis assumes that there is no difference in productivity before and after the training course, that is:

H0: μd = 0

H1: μd ≠ 0

  • Step 3: The significance level to be considered is 5%.
  • Step 4: Since N > 25, the binomial distribution approaches a normal distribution. Here N = 26 (30 observations minus 4 ties) and X = 11, the lowest frequency (the number of (−) signs), so:

Z = \frac{(X + 0.5) - N/2}{0.5\sqrt{N}} = \frac{(11 + 0.5) - 13}{0.5\sqrt{26}} = -0.588

  • Step 5: By using the standard normal distribution table (Table E in the Appendix), we must determine the critical region (CR) for a bilateral test, as shown in Fig. 10.22.
    Fig. 10.22
    Fig. 10.22 Critical region of Example 10.6.
  • Step 6: Decision: since the value calculated is not in the critical region, that is, − 1.96 ≤ Zcal ≤ 1.96, the null hypothesis is not rejected, which allows us to conclude, with a 95% confidence level, that there is no difference in productivity before and after the training course.

If instead of comparing the value calculated to the critical value of the standard normal distribution, we use the calculation of P-value, Steps 5 and 6 will be:

  • Step 5: According to Table E in the Appendix, the unilateral probability associated to statistic Zcal = − 0.59 is P1 = 0.278. For a bilateral test, this probability must be doubled (P-value = 0.556).
  • Step 6: Decision: since P > 0.05, we do not reject the null hypothesis.
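
A quick computational check of Steps 4 and 5 (our sketch, assuming SciPy is available):

import math
from scipy import stats

pos, neg = 15, 11                  # signs from Table 10.E.7; 4 ties were dropped
N, X = pos + neg, min(pos, neg)    # N = 26, X = 11 (lowest frequency)
z = ((X + 0.5) - N / 2) / (0.5 * math.sqrt(N))  # Expression (10.6)
print(round(z, 3))                      # -0.588
print(round(2 * stats.norm.cdf(z), 3))  # bilateral P-value, approximately 0.557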

10.3.2.1 Solving the Sign Test for Two Paired Samples Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data in Example 10.6 can be found in the file Sign_Test_Two_Paired_Samples.sav. The procedure for applying the sign test to two paired samples on SPSS is shown. We have to click on Analyze → Nonparametric Tests → Legacy Dialogs → 2 Related Samples …, as shown in Fig. 10.23.

Fig. 10.23
Fig. 10.23 Procedure for elaborating the sign test on SPSS.

After that, let’s insert variable 1 (Before) and variable 2 (After) into Test Pairs. Let’s also select the option regarding the sign test (Sign) in Test Type, as shown in Fig. 10.24.

Fig. 10.24
Fig. 10.24 Selecting the variables and the sign test.

Finally, let’s click on OK to obtain the results of the sign test for two paired samples (Figs. 10.25 and 10.26).

Fig. 10.25
Fig. 10.25 Frequencies observed.
Fig. 10.26
Fig. 10.26 Sign test (two paired samples) for Example 10.6 on SPSS.

Fig. 10.25 shows the frequencies of negative and positive signs, the total number of ties, and the total frequency.

Fig. 10.26 shows the result of the z test, besides the associated P probability for a bilateral test, values that are similar to the ones calculated in Example 10.6. Since P = 0.556 > 0.05, the null hypothesis is not rejected, which allows us to conclude, with a 95% confidence level, that there is no difference in productivity before and after the training course.

10.3.2.2 Solving the Sign Test for Two Paired Samples Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

The data in Example 10.6 are also available on Stata, in the file Sign_Test_Two_Paired_Samples.dta. The paired variables are before and after.

As discussed in Section 10.2.3.2 for a single sample, the sign test on Stata is carried out from the command signtest. In the case of two paired samples, we must use the same command. However, it must be followed by the names of the paired variables, with the equal sign between them, since the objective is to test the equality of the respective medians. Thus, the command to be typed for our example is:

signtest after = before

The result of the test is shown in Fig. 10.27 and includes the number of positive signs (15), the number of negative signs (11), as well as the probability associated to the statistic for a bilateral test (P = 0.557). These values are similar to the ones calculated in Example 10.6 and also generated on SPSS. Since P > 0.05, we do not reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is no difference in productivity before and after the training course.

Fig. 10.27
Fig. 10.27 Results of the sign test (two paired samples) for Example 10.6 on Stata.

10.3.3 Wilcoxon Test

Analogous to the sign test for two paired samples, the Wilcoxon test is an alternative to the t-test when the data distribution does not follow a normal distribution.

The Wilcoxon test is an extension of the sign test; however, it is more powerful. Besides the information about the direction of the differences for each pair, the Wilcoxon test considers the magnitude of the difference within the pairs (Fávero et al., 2009). The logical foundations and the method used in the Wilcoxon test are described, based on Siegel and Castellan (2006).

Let's assume that di is the difference between the values of each pair of data. First, we place all the di's in ascending order of absolute value (that is, disregarding the sign) and assign ranks in this order: rank 1 goes to the lowest | di |, rank 2 to the second lowest, and so on. Finally, the sign of each difference di is attached to its rank. The sum of the positive ranks is represented by Sp and the sum of the negative ranks by Sn.

Occasionally, the values for a certain pair of data are the same (di = 0). In this case, they are excluded from the sample. It is the same procedure used in the sign test, so, the value of N represents the sample size disregarding these ties.

Another type of tie may happen, in which two or more differences have the same absolute value. In this case, the same rank is attributed to all of them, corresponding to the mean of the ranks that would have been attributed if the differences had been distinct. For example, suppose that three pairs of data show the differences − 1, 1, and 1. Rank 2, the average of ranks 1, 2, and 3, is attributed to each pair. The next value in order receives rank 4, since ranks 1, 2, and 3 have already been used.

The null hypothesis assumes that the median of the differences in the population (μd) is zero, that is, the populations do not differ in location. For a bilateral test, we have:

  • H0: μd = 0
  • H1: μd ≠ 0

In other words, we must test the hypothesis that there are no differences between both samples (the samples come from populations with the same median and the same continuous distribution), that is, the sum of the positive ranks (Sp) is the same as the sum of the negative ranks (Sn).

  • Small samples

If N ≤ 15, Table I in the Appendix shows the unilateral probabilities associated to the several critical values of Sc (P(Sp > Sc) = α). For a bilateral test, this value must be doubled. If the probability obtained (P-value) is less than or equal to α, we must reject H0.

  • Large samples

As N grows, the Wilcoxon distribution approaches a standard normal distribution. Thus, for N > 15, we must calculate the value of the variable z that, according to Siegel and Castellan (2006), Fávero et al. (2009), and Maroco (2014), is:

Z_{cal} = \frac{\min(S_{p}, S_{n}) - \frac{N(N+1)}{4}}{\sqrt{\frac{N(N+1)(2N+1)}{24} - \frac{\sum_{j=1}^{g} t_{j}^{3} - \sum_{j=1}^{g} t_{j}}{48}}}  (10.9)

where:

  • \left(\sum_{j=1}^{g} t_{j}^{3} - \sum_{j=1}^{g} t_{j}\right)/48 is a correction factor used whenever there are ties;
  • g: the number of groups of tied ranks;
  • tj: the number of tied observations in group j.

The value calculated must be compared to the critical value of the standard normal distribution (Table E in the Appendix). This table provides the critical values of zc where P(Zcal > zc) = α (for a right-tailed unilateral test). For a bilateral test, we have P(Zcal < − zc) = P(Zcal > zc) = α/2. The null hypothesis H0 of a bilateral test is rejected if the value of the Zcal statistic is in the critical region, that is, if Zcal < − zc or Zcal > zc. Otherwise, we do not reject H0.

The unilateral probabilities associated to statistic Zcal (P1) can also be obtained from Table E. For a unilateral test, we consider P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Thus, for both tests, we reject H0 if P ≤ α.
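
Expression (10.9), including the tie-correction factor, can be implemented in a few lines of Python. The helper below is our illustration, not the book's code (scipy.stats.rankdata assigns the midranks described above); applied to the data of Example 10.7 next, it reproduces z ≈ − 2.925:

import math
from collections import Counter
from scipy.stats import rankdata, norm

def wilcoxon_z(before, after):
    # Differences d_i = after - before; pairs with d = 0 are excluded
    d = [aft - bef for bef, aft in zip(before, after) if aft != bef]
    ranks = rankdata([abs(x) for x in d])            # midranks for tied |d|
    sp = sum(r for r, x in zip(ranks, d) if x > 0)   # sum of positive ranks
    sn = sum(r for r, x in zip(ranks, d) if x < 0)   # sum of negative ranks
    n = len(d)
    tie_corr = sum(t ** 3 - t for t in Counter(ranks).values()) / 48
    denom = math.sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_corr)
    z = (min(sp, sn) - n * (n + 1) / 4) / denom      # Expression (10.9)
    return z, 2 * norm.cdf(z)                        # bilateral P-value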

Example 10.7

Applying the Wilcoxon Test

A group of 18 students from the 12th grade took an English proficiency exam, without ever having taken an extracurricular course. The same group of students was submitted to an intensive English course for 6 months and, at the end, they took the proficiency exam again. The results can be seen in Table 10.E.8. Test the hypothesis that there was no improvement before and after the course. Assume that α = 5%.

Table 10.E.8

Students' Grades Before and After the Intensive Course
Before   After
56       60
65       62
70       74
78       79
47       53
52       59
64       65
70       75
72       75
78       88
80       78
26       26
55       63
60       59
71       71
66       75
60       71
17       24

Solution

  • Step 1: Since the data do not follow a normal distribution, the Wilcoxon test can be applied, because it is more powerful than the sign test for two paired samples.
  • Step 2: Through the null hypothesis, there is no difference in the students’ performance before and after the course, that is:

H0: μd = 0

H1: μd ≠ 0

  • Step 3: The significance level to be considered is 5%.
  • Step 4: Since N > 15, the Wilcoxon distribution approaches a normal distribution. In order to calculate the value of z, we first have to compute each di and the respective ranks, as shown in Table 10.E.9.

Table 10.E.9

Calculation of di and the Respective Ranks
Before   After   di     Rank of |di|
56       60      4      7.5
65       62      −3     −5.5
70       74      4      7.5
78       79      1      2
47       53      6      10
52       59      7      11.5
64       65      1      2
70       75      5      9
72       75      3      5.5
78       88      10     15
80       78      −2     −4
26       26      0      —
55       63      8      13
60       59      −1     −2
71       71      0      —
66       75      9      14
60       71      11     16
17       24      7      11.5

Since there are two pairs of data with equal values (di = 0), they are excluded from the sample, so N = 16. The sum of the positive ranks is Sp = 2 + ⋯ + 16 = 124.5, and the sum of the negative ranks is Sn = 2 + 4 + 5.5 = 11.5.

Thus, we can calculate the value of z by using Expression (10.9). The groups of tied ranks have sizes 3, 2, 2, and 2, so \sum t_{j}^{3} = 51 and \sum t_{j} = 9:

Z_{cal} = \frac{\min(S_{p}, S_{n}) - \frac{N(N+1)}{4}}{\sqrt{\frac{N(N+1)(2N+1)}{24} - \frac{\sum t_{j}^{3} - \sum t_{j}}{48}}} = \frac{11.5 - \frac{16 \times 17}{4}}{\sqrt{\frac{16 \times 17 \times 33}{24} - \frac{51 - 9}{48}}} = -2.925

  • Step 5: By using the standard normal distribution table (Table E in the Appendix), we determine the critical region (CR) for the bilateral test, as shown in Fig. 10.28.
    Fig. 10.28
    Fig. 10.28 Critical region of Example 10.7.
  • Step 6: Decision: since the value calculated is in the critical region, that is, Zcal < − 1.96, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that there is a difference in the students’ performance before and after the course.

If instead of comparing the value calculated to the critical value of the standard normal distribution, we use the calculation of the P-value, Steps 5 and 6 will be:

  • Step 5: According to Table E in the Appendix, the unilateral probability associated to the statistic Zcal = − 2.925 is P1 = 0.0017. For a bilateral test, this probability must be doubled (P-value = 0.0034).
  • Step 6: Decision: since P < 0.05, we must reject the null hypothesis.
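
SciPy's built-in routine gives the same answer (a cross-check we add here; depending on the SciPy version, it may warn about ties and fall back to the normal approximation, which is exactly the large-sample case above):

from scipy import stats

before = [56, 65, 70, 78, 47, 52, 64, 70, 72, 78, 80, 26, 55, 60, 71, 66, 60, 17]
after = [60, 62, 74, 79, 53, 59, 65, 75, 75, 88, 78, 26, 63, 59, 71, 75, 71, 24]
res = stats.wilcoxon(after, before, correction=False)  # zero differences dropped by default
print(res.statistic, round(res.pvalue, 4))  # 11.5 and approximately 0.0034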

10.3.3.1 Solving the Wilcoxon Test Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data in Example 10.7 are available in the file Wilcoxon_Test.sav. The procedure for applying the Wilcoxon test to two paired samples on SPSS is shown. Let’s click on Analyze → Nonparametric Tests → Legacy Dialogs → 2 Related Samples …, as shown in Fig. 10.29.

Fig. 10.29
Fig. 10.29 Procedure for elaborating the Wilcoxon test on SPSS.

First of all, let’s insert variable 1 (Before) and variable 2 (After) into Test Pairs. Let’s also select the option related to the Wilcoxon test in Test Type, as shown in Fig. 10.30.

Fig. 10.30
Fig. 10.30 Selecting the variables and Wilcoxon test.

Finally, let’s click on OK to obtain the results of the Wilcoxon test for two paired samples (Figs. 10.31 and 10.32).

Fig. 10.31
Fig. 10.31 Ranks.
Fig. 10.32
Fig. 10.32 Wilcoxon test for Example 10.7 on SPSS.

Fig. 10.31 shows the number of negative, positive, and tied ranks, besides the mean and the sum of all positive and negative ranks.

Fig. 10.32 shows the result of the z test, besides the associated P probability for a bilateral test, values similar to the ones found in Example 10.7. Since P = 0.003 < 0.05, we must reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is a difference in the students’ performance before and after the course.

10.3.3.2 Solving the Wilcoxon Test Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

The data in Example 10.7 are available in the file Wilcoxon_Test.dta. The paired variables are called before and after.

The Wilcoxon test on Stata is carried out from the command signrank followed by the name of the paired variables with an equal sign between them. For our example, we must type the following command:

signrank before = after

The result of the test is shown in Fig. 10.33. Since P < 0.05, we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is a difference in the students’ performance before and after the course.

Fig. 10.33
Fig. 10.33 Results of the Wilcoxon test for Example 10.7 on Stata.

10.4 Tests for Two Independent Samples

In these tests, we compare two populations represented by their respective samples. Unlike the tests for two paired samples, here the samples do not need to have the same size. Among the tests for two independent samples, we can highlight the chi-square test (for nominal or ordinal variables) and the Mann-Whitney U test (for ordinal variables).

10.4.1 Chi-Square Test (χ2) for Two Independent Samples

In Section 10.2.2, the χ2 test was applied to a single sample in which the variable being studied was qualitative (nominal or ordinal). Here the test will be applied to two independent samples, from nominal or ordinal qualitative variables. This test has already been studied in Chapter 4 (Section 4.2.2), in order to verify if there is an association between two qualitative variables, and it will be described once again in this section.

The test compares the frequencies observed in each one of the cells of a contingency table to the frequencies expected. The χ2 test for two independent samples assumes the following hypotheses:

  • H0: there is no significant difference between the frequencies observed and the ones expected
  • H1: there is a significant difference between the frequencies observed and the ones expected

Therefore, the χ2 statistic measures the discrepancy between the observed contingency table and the expected contingency table, built under the hypothesis that there is no association between the categories of the two variables being studied. If the distribution of observed frequencies is exactly the same as the distribution of expected frequencies, the χ2 statistic is zero. Thus, a low value of χ2 indicates independence between the variables.

As already presented in Expression (4.1) in Chapter 4, the χ2 statistic for two independent samples is given by:

\chi^{2} = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(O_{ij} - E_{ij})^{2}}{E_{ij}}  (10.10)

where:

  • Oij: the number of observations in the ith category of variable X and in the jth category of variable Y;
  • Eij: the expected frequency of observations in the ith category of variable X and in the jth category of variable Y;
  • I: the number of categories (rows) of variable X;
  • J: the number of categories (columns) of variable Y.

The values of χ2cal approximately follow a χ2 distribution with ν = (I − 1)·(J − 1) degrees of freedom. The critical values of the chi-square statistic (χ2c) can be found in Table D in the Appendix, which provides the values of χ2c such that P(χ2cal > χ2c) = α (for a right-tailed unilateral test). In order for the null hypothesis H0 to be rejected, the value of the χ2cal statistic must be in the critical region, that is, χ2cal > χ2c. Otherwise, we do not reject H0 (Fig. 10.34).

Fig. 10.34
Fig. 10.34 χ2 distribution.

Example 10.8

Applying the χ2 Test to Two Independent Samples

Let’s consider Example 4.1 in Chapter 4 once again, which refers to a study carried out with 200 individuals aiming at analyzing the joint behavior of variable X (Health insurance agency) with variable Y (Level of satisfaction). The contingency table showing the joint distribution of the variables’ absolute frequencies, besides the marginal totals, is presented in Table 10.E.10. Test the hypothesis that there is no association between the categories of both variables, considering α = 5%.

Table 10.E.10

Joint Distribution of the Absolute Frequencies of the Variables Being Studied
Agency          Dissatisfied   Neutral   Satisfied   Total
Total Health    40             16        12          68
Live Life       32             24        16          72
Mena Health     24             32        4           60
Total           96             72        32          200

Solution

  • Step 1: The most suitable test to compare the frequencies observed in each cell of a contingency table to the frequencies expected is the χ2 for two independent samples.
  • Step 2: The null hypothesis states that there is no association between the categories of the variables Agency and Level of satisfaction, that is, the observed and expected frequencies are the same for each pair of variable categories. The alternative hypothesis states that there are differences in at least one pair of categories:

H0: Oij = Eij

H1: Oij ≠ Eij

  • Step 3: The significance level to be considered is 5%.
  • Step 4: In order to calculate the statistic, we must compare the observed values to the expected values. Table 10.E.11 presents the observed values of the distribution with their respective relative frequencies in relation to each row's total. The calculation could also be done in relation to each column's total, leading to the same χ2 statistic.

Table 10.E.11

Values Observed in Each Category With Their Respective Proportions in Relation to the Row’s General Total
Agency         Dissatisfied   Neutral      Satisfied    Total
Total Health   40 (58.8%)     16 (23.5%)   12 (17.6%)    68 (100%)
Live Life      32 (44.4%)     24 (33.3%)   16 (22.2%)    72 (100%)
Mena Health    24 (40.0%)     32 (53.3%)    4 (6.7%)     60 (100%)
Total          96 (48%)       72 (36%)     32 (16%)     200 (100%)

The data in Table 10.E.11 suggest a dependence between the variables. If there were no association between the variables, we would expect each agency to show the marginal proportions, that is, 48% of its total in the Dissatisfied category, 36% in the Neutral category, and 16% in the Satisfied category. The expected values calculated this way can be found in Table 10.E.12. For example, the calculation of the first cell is 0.48 × 68 = 32.6.

Table 10.E.12

Values Expected From Table 10.E.11 Assuming a Nonassociation Between the Variables
Agency         Dissatisfied   Neutral      Satisfied    Total
Total Health   32.6 (48%)     24.5 (36%)   10.9 (16%)    68 (100%)
Live Life      34.6 (48%)     25.9 (36%)   11.5 (16%)    72 (100%)
Mena Health    28.8 (48%)     21.6 (36%)    9.6 (16%)    60 (100%)
Total          96 (48%)       72 (36%)     32 (16%)     200 (100%)

In order to calculate the χ2 statistic, we must apply Expression (10.10) to the data in Tables 10.E.11 and 10.E.12. The calculation of each term (O_ij − E_ij)²/E_ij is presented in Table 10.E.13, jointly with the resulting χ²_cal statistic, obtained as the sum over all categories.

  • Step 5: The critical region (CR) of the χ2 distribution (Table D in the Appendix), considering α = 5% and ν = (I − 1)·(J − 1) = 4 degrees of freedom, is shown in Fig. 10.35.
    Fig. 10.35 Critical region of Example 10.8.
  • Step 6: Decision: since the calculated value is in the critical region, that is, χ²_cal = 15.861 > 9.488 (see Table 10.E.13), we must reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is an association between the variable categories.

Table 10.E.13

Calculation of the χ2 Statistic
Agency         Dissatisfied   Neutral   Satisfied
Total Health       1.66         2.94       0.12
Live Life          0.19         0.14       1.74
Mena Health        0.80         5.01       3.27
Total          χ²_cal = 15.861

If we use P-value instead of the critical value of the statistic, Steps 5 and 6 will be:

  • Step 5: According to Table D, in the Appendix, the probability associated with the statistic χ²_cal = 15.861, for ν = 4 degrees of freedom, is less than 0.005.
  • Step 6: Decision: since P < 0.05, we reject H0.
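Before turning to the SPSS and Stata solutions, readers working in Python can reproduce the whole test with SciPy. The sketch below (an addition to the original solution) applies chi2_contingency to the observed table and should recover χ²_cal = 15.861, ν = 4, and P ≈ 0.003:

import numpy as np
from scipy.stats import chi2_contingency

# Observed frequencies from Table 10.E.10 (rows: agencies; columns: levels)
observed = np.array([[40, 16, 12],
                     [32, 24, 16],
                     [24, 32,  4]])

chi2_stat, p_value, dof, expected = chi2_contingency(observed)
print(round(chi2_stat, 3), dof)      # ~15.861 with 4 degrees of freedom
print(round(p_value, 3))             # ~0.003
print(np.round(expected, 1))         # matches Table 10.E.12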

10.4.1.1 Solving the χ2 Statistic Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data in Example 10.8 are available in the file HealthInsurance.sav. In order to calculate the χ2 statistic for two independent samples, we must click on Analyze → Descriptive Statistics → Crosstabs … Let’s insert variable Agency in Row(s) and variable Satisfaction in Column(s), as shown in Fig. 10.36.

Fig. 10.36 Selecting the variables.

In Statistics …, let’s select option Chi-square, as shown in Fig. 10.37. Then, we must finally click on Continue and OK. The result is shown in Fig. 10.38.

Fig. 10.37 Selecting the χ2 statistic.
Fig. 10.38 Results of the χ2 test for Example 10.8 on SPSS.

From Fig. 10.38, we can see that the value of χ2 is 15.861, the same value calculated in Example 10.8. Since P = 0.003 < 0.05, we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is an association between the variable categories, that is, the observed frequencies differ from the expected frequencies in at least one pair of categories.

10.4.1.2 Solving the χ2 Statistic by Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

As presented in Chapter 4, the calculation of the χ2 statistic on Stata is done by using the command tabulate, or simply tab, followed by the names of the variables being studied, with the option chi2, or simply ch. The syntax of the test is:

tab variable1⁎ variable2⁎, ch

The data in Example 10.8 are also available in the file HealthCareInsurance.dta. The variables being studied are agency and satisfaction. Thus, we must type the following command:

tab agency satisfaction, ch

The results can be seen in Fig. 10.39 and are the same as the ones presented in Example 10.8 and obtained on SPSS.

Fig. 10.39 Results of the χ2 test for Example 10.8 on Stata.

10.4.2 Mann-Whitney U Test

The Mann-Whitney U test is one of the most powerful nonparametric tests, applied to quantitative or qualitative variables in an ordinal scale, and it aims at verifying if two nonpaired or independent samples are drawn from the same population. It is an alternative to Student’s t-test when the normality hypothesis is violated or when the sample is small. In addition, it may be considered a nonparametric version of the t-test for two independent samples.

Since the original data are transformed into ranks (orders), we lose some information, so, the Mann-Whitney U test is not as powerful as the t-test.

Different from the t-test, which verifies the equality of the means of two independent populations with continuous data, the Mann-Whitney U test verifies the equality of the medians. For a bilateral test, the null hypothesis is that the medians of both populations are equal (here μ denotes the population median):

H0: μ1 = μ2

H1: μ1 ≠ μ2

The calculation of the Mann-Whitney U statistic is presented separately for small and for large samples.

  • Small samples

Method:

  1. (a) Let’s consider N1 the size of the sample with the smallest number of observations and N2 the size of the sample with the largest number of observations. We assume that both samples are independent.
  2. (b) In order to apply the Mann-Whitney U test, we must join both samples into a single combined sample that will be formed by N = N1 + N2 elements. However, we must identify the original sample of each observation in the combined sample. The combined sample must be ordered in ascending order and the ranks are attributed to each observation. For example, rank 1 is attributed to the lowest observation and rank N to the highest observation. If there are ties, we attribute the mean of the corresponding ranks.
  3. (c) After that, we must calculate the sum of the ranks for each sample, that is, calculate R1, which corresponds to the sum of the ranks in the sample with the smallest number of observations, and R2, which corresponds to the sum of the ranks in the sample with the largest number of observations.
  4. (d) Thus, we can calculate quantities U1 and U2 as follows:

U_1 = N_1 N_2 + \frac{N_1 (N_1 + 1)}{2} - R_1    (10.11)

U_2 = N_1 N_2 + \frac{N_2 (N_2 + 1)}{2} - R_2    (10.12)

  1. (e) The Mann-Whitney U statistic is given by:

U_{cal} = \min(U_1, U_2)

Table J in the Appendix shows the critical values U_c such that P(U_cal < U_c) = α (for a left-tailed unilateral test), for values of N2 ≤ 20 and significance levels of 0.05, 0.025, 0.01, and 0.005. In order for the null hypothesis H0 of the left-tailed unilateral test to be rejected, the value of the U_cal statistic must be in the critical region, that is, U_cal < U_c. Otherwise, we do not reject H0. For a bilateral test, we must consider P(U_cal < U_c) = α/2, since each tail of the distribution carries α/2.

The unilateral probabilities associated with the U_cal statistic (P1) can also be obtained from Table J. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). In both cases, we reject H0 if P ≤ α.
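Steps (a) through (e) are mechanical enough to script. The sketch below (in Python with NumPy and SciPy, an addition to the text; the function name mann_whitney_u is ours) implements the small-sample procedure, with sample1 taken as the smaller sample:

import numpy as np
from scipy.stats import rankdata

def mann_whitney_u(sample1, sample2):
    """Steps (a)-(e); sample1 is the smaller sample (N1 observations)."""
    n1, n2 = len(sample1), len(sample2)
    # (b) combine the samples and rank them; ties receive the mean rank
    ranks = rankdata(np.concatenate([sample1, sample2]))
    # (c) sums of ranks of each original sample
    r1, r2 = ranks[:n1].sum(), ranks[n1:].sum()
    # (d) Expressions (10.11) and (10.12)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    # (e) the statistic is the smaller of the two quantities
    return min(u1, u2)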

  • Large samples

As the sample size grows (N2 > 20), the sampling distribution of U rapidly approaches the normal distribution, and the test can be carried out through the standardized Z statistic.

The value of the Z statistic, already incorporating a correction for ties, is given by:

Z_{cal} = \frac{U - N_1 N_2 / 2}{\sqrt{\frac{N_1 N_2}{N (N-1)} \left[ \frac{N^3 - N}{12} - \sum_{j=1}^{g} \frac{t_j^3 - t_j}{12} \right]}}    (10.13)

where:

  • \sum_{j=1}^{g} \frac{t_j^3 - t_j}{12} is a correction factor used when there are ties;
  • g: the number of groups with tied ranks;
  • tj: the number of tied observations in group j.

The calculated value must be compared to the critical value of the standard normal distribution (see Table E in the Appendix). This table provides the critical values z_c where P(Z_cal > z_c) = α (for a right-tailed unilateral test). For a bilateral test, we have P(Z_cal < − z_c) = P(Z_cal > z_c) = α/2. Therefore, for a bilateral test, the null hypothesis is rejected if Z_cal < − z_c or Z_cal > z_c.

Unilateral probabilities associated with Z_cal (P = P1) can also be obtained from Table E. For a bilateral test, this probability must be doubled (P = 2P1). Thus, the null hypothesis is rejected if P ≤ α.

Example 10.9

Applying the Mann-Whitney U Test to Small Samples

Aiming to assess the quality of two machines, the diameters of the parts produced (in mm) by each one of them are compared, as shown in Table 10.E.14. Use the most suitable test, at a significance level of 5%, to verify whether both samples come from populations with the same median.

Table 10.E.14

Diameter of Parts Produced in Two Machines
Mach. A   48.50   48.65   48.58   48.55   48.66   48.64   48.50   48.72
Mach. B   48.75   48.64   48.80   48.85   48.78   48.79   49.20

Solution

  • Step 1: By applying the normality test to both samples, we can see that the data from machine B do not follow a normal distribution. So, the most suitable test to compare the medians of two independent populations is the Mann-Whitney U test.
  • Step 2: Through the null hypothesis, the median diameters of the parts in both machines are the same, so:

H0: μA = μB

H1: μA ≠ μB

  • Step 3: The significance level to be considered is 5%.
  • Step 4: Calculation of the U statistic:
  1. (a) N1 = 7 (sample size from machine B)

N2 = 8 (sample size from machine A)

  1. (b) Combined sample and respective ranks (Table 10.E.15):
  1. (c) R1 = 80.5 (sum of the ranks from machine B, the sample with the smallest number of observations); R2 = 39.5 (sum of the ranks from machine A, the sample with the largest number of observations).
  2. (d) Calculation of U1 and U2:

U_1 = N_1 N_2 + \frac{N_1 (N_1 + 1)}{2} - R_1 = 7 \cdot 8 + \frac{7 \cdot 8}{2} - 80.5 = 3.5

U_2 = N_1 N_2 + \frac{N_2 (N_2 + 1)}{2} - R_2 = 7 \cdot 8 + \frac{8 \cdot 9}{2} - 39.5 = 52.5

  1. (e) Calculation of the Mann-Whitney U statistic:

U_{cal} = \min(U_1, U_2) = \min(3.5, 52.5) = 3.5

  • Step 5: According to Table J, in the Appendix, for N1 = 7, N2 = 8, and P(U_cal < U_c) = α/2 = 0.025 (bilateral test), the critical value of the Mann-Whitney U statistic is U_c = 10.
  • Step 6: Decision: since the calculated statistic is in the critical region, that is, U_cal = 3.5 < 10, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the medians of both populations are different.

Table 10.E.15

Combined Data
Data     Machine   Rank
48.50    A          1.5
48.50    A          1.5
48.55    A          3
48.58    A          4
48.64    A          5.5
48.64    B          5.5
48.65    A          7
48.66    A          8
48.72    A          9
48.75    B         10
48.78    B         11
48.79    B         12
48.80    B         13
48.85    B         14
49.20    B         15

If we use P-value instead of the critical value of the statistic, Steps 5 and 6 will be:

  • Step 5: According to Table J, in the Appendix, the unilateral probability P1 associated with U_cal = 3.5, for N1 = 7 and N2 = 8, is less than 0.005. For a bilateral test, this probability must be doubled (P < 0.01).
  • Step 6: Decision: since P < 0.05, we must reject H0.

Example 10.10

Applying the Mann-Whitney U Test to Large Samples

As described previously, as the sample size grows (N2 > 20), the sampling distribution of U approaches the normal distribution. Even though the data in Example 10.9 represent a small sample (N2 = 8), what would the value of Z be in this case, using Expression (10.13)? Interpret the result.

Solution

Z_{cal} = \frac{U - N_1 N_2 / 2}{\sqrt{\frac{N_1 N_2}{N (N-1)} \left[ \frac{N^3 - N}{12} - \sum_{j=1}^{g} \frac{t_j^3 - t_j}{12} \right]}} = \frac{3.5 - \frac{7 \cdot 8}{2}}{\sqrt{\frac{7 \cdot 8}{15 \cdot 14} \left[ \frac{15^3 - 15}{12} - \frac{16 - 4}{12} \right]}} = -2.840

The critical values of the z_c statistic for a bilateral test, at the significance level of 5%, are ± 1.96 (see Table E in the Appendix). Since Z_cal = − 2.840 < − 1.96, the null hypothesis would also be rejected by the Z statistic, which allows us to conclude, with a 95% confidence level, that the population medians are different.

Instead of comparing the calculated value to the critical value, we could obtain the P-value directly from Table E. The unilateral probability associated with the statistic Z_cal = − 2.840 is P1 = 0.0023. For a bilateral test, this probability must be doubled (P-value = 0.0046).
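For a computational cross-check of Examples 10.9 and 10.10 (an addition to the original solutions, assuming SciPy 1.7 or later), scipy.stats.mannwhitneyu can be run on the same data. SciPy reports the U statistic of its first argument, which for machine A equals 3.5, the same as min(U1, U2) here; forcing the asymptotic method without continuity correction should reproduce the P-value associated with Z_cal = − 2.840:

from scipy.stats import mannwhitneyu

machine_a = [48.50, 48.65, 48.58, 48.55, 48.66, 48.64, 48.50, 48.72]
machine_b = [48.75, 48.64, 48.80, 48.85, 48.78, 48.79, 49.20]

# Normal approximation without continuity correction, as in Expression (10.13)
res = mannwhitneyu(machine_a, machine_b, alternative='two-sided',
                   use_continuity=False, method='asymptotic')
print(res.statistic)          # 3.5
print(round(res.pvalue, 4))   # ~0.0045, in line with Example 10.10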

10.4.2.1 Solving the Mann-Whitney Test Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data in Example 10.9 are available in the file Mann-Whitney_Test.sav. Since group 1 is the one with the smallest number of observations, in Data → Define Variable Properties …, we assign value 1 to group B and value 2 to group A for variable Machine.

To run the Mann-Whitney test on SPSS, we must click on Analyze → Nonparametric Tests → Legacy Dialogs → 2 Independent Samples …, as shown in Fig. 10.40.

Fig. 10.40 Procedure to elaborate the Mann-Whitney test on SPSS.

After that, we should insert the variable Diameter in the box Test Variable List and the variable Machine in Grouping Variable, defining the respective groups. Let’s select the option Mann-Whitney U in Test Type, as shown in Fig. 10.41.

Fig. 10.41 Selecting the variables and Mann-Whitney test.

Finally, let’s click on OK to obtain Figs. 10.42 and 10.43. Fig. 10.42 shows the mean and the sum of the ranks for each group, while Fig. 10.43 shows the statistic of the test.

Fig. 10.42 Ranks.
Fig. 10.43 Mann-Whitney test for Example 10.9 on SPSS.

The results in Fig. 10.42 match the ones calculated in Example 10.9. According to Fig. 10.43, the Mann-Whitney U statistic is 3.50, the same value calculated in Example 10.9. The bilateral probability associated with the U statistic is P = 0.002 (we saw in Example 10.9 that this probability is less than 0.01). For the same data, the Z statistic and its associated bilateral probability are Z_cal = − 2.840 and P = 0.005, matching the values calculated in Example 10.10. For both tests, since the associated bilateral probability is less than 0.05, the null hypothesis is rejected, which allows us to conclude that the medians of both populations are different.

10.4.2.2 Solving the Mann-Whitney Test Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

The Mann-Whitney test is run on Stata through the command ranksum (an equality test for unpaired data), by using the following syntax:

ranksum variable⁎, by (groups⁎)

where the term variable⁎ must be replaced by the quantitative variable studied and the term groups⁎ by the categorical variable that represents the groups.

Let’s open the file Mann-Whitney_Test.dta that contains the data from Examples 10.9 and 10.10. Both groups are represented by the variable machine and the quality characteristic by the variable diameter. Thus, the command to be typed is:

ranksum diameter, by (machine)

The results obtained are shown in Fig. 10.44. We can see that the calculated value of the statistic (2.840) corresponds to the value calculated in Example 10.10, for large samples, from Expression (10.13). The probability associated to the statistic for a bilateral test is 0.0045. Since P < 0.05, we must reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the population medians are different.

Fig. 10.44 Results of the Mann-Whitney test for Examples 10.9 and 10.10 on Stata.

10.5 Tests for k Paired Samples

These tests analyze the differences between k (three or more) paired or related samples. According to Siegel and Castellan (2006), the null hypothesis to be tested is that k samples have been drawn from the same population. The main tests for k paired samples are Cochran’s Q test (for binary variables) and Friedman’s test (for ordinal variables).

10.5.1 Cochran’s Q Test

Cochran’s Q test for k paired samples is an extension of the McNemar test for two samples, and it tests the hypothesis that the frequencies or proportions of three or more related groups differ significantly from one another. In the same way as in the McNemar test, the data are binary.

According to Siegel and Castellan (2006), Cochran’s Q test compares the characteristics of several individuals or characteristics of the same individual observed under different conditions. For example, we can analyze if k items differ significantly for N individuals. Or, we may have only one item to analyze and the objective is to compare the answer of N individuals under k different conditions.

Let’s suppose that the study data are organized in a table with N rows and k columns, in which N is the number of cases and k is the number of groups or conditions. Under the null hypothesis of Cochran’s Q test, there are no differences between the frequencies or proportions of success (p) of the k related groups, that is, the proportion of a desired answer (success) is the same in each column. Under the alternative hypothesis, there are differences between at least two groups, so:

  • H0: p1 = p2 = … = pk
  • H1: ∃(i,j) pi ≠ pj, i ≠ j

Cochran’s Q statistic is given by:

Q_{cal} = \frac{k (k-1) \sum_{j=1}^{k} (G_j - \bar{G})^2}{k \sum_{i=1}^{N} L_i - \sum_{i=1}^{N} L_i^2} = \frac{(k-1) \left[ k \sum_{j=1}^{k} G_j^2 - \left( \sum_{j=1}^{k} G_j \right)^2 \right]}{k \sum_{i=1}^{N} L_i - \sum_{i=1}^{N} L_i^2}    (10.14)

which approximately follows a χ2 distribution with k − 1 degrees of freedom, where:

  • Gj: the total number of successes in the jth column;
  • Ḡ: the mean of the Gj;
  • Li: the total number of successes in the ith row.

The calculated value must be compared to the critical value of the χ2 distribution (Table D in the Appendix). This table provides the critical values χ²_c where P(χ²_cal > χ²_c) = α (for a right-tailed unilateral test). If the value of the statistic is in the critical region, that is, if Q_cal > χ²_c, we must reject H0. Otherwise, we do not reject H0.

The probability associated with the calculated value of the statistic (P-value) can also be obtained from Table D. In this case, the null hypothesis is rejected if P ≤ α; otherwise, we do not reject H0.

Example 10.11

Applying Cochran’s Q Test

We are interested in assessing 20 consumers’ level of satisfaction regarding three supermarkets, investigating whether clients are satisfied (score 1) or not (score 0) with the quality, variety, and price of the products at each supermarket. Test the hypothesis that the probability of receiving a good evaluation from clients is the same for all three supermarkets, considering a significance level of 10%. Table 10.E.16 shows the results of the evaluation.

Table 10.E.16

Results of the Evaluation for All Three Supermarkets
Consumer    A    B    C    Li    Li²
1           1    1    1     3     9
2           1    0    1     2     4
3           0    1    1     2     4
4           0    0    0     0     0
5           1    1    0     2     4
6           1    1    1     3     9
7           0    0    1     1     1
8           1    0    1     2     4
9           1    1    1     3     9
10          0    0    1     1     1
11          0    0    0     0     0
12          1    1    0     2     4
13          1    0    1     2     4
14          1    1    1     3     9
15          0    1    1     2     4
16          0    1    1     2     4
17          1    1    1     3     9
18          1    1    1     3     9
19          0    0    1     1     1
20          0    0    1     1     1
Total    G1 = 11   G2 = 11   G3 = 16   ΣLi = 38   ΣLi² = 90

Solution

  • Step 1: The most suitable test to compare proportions of three or more paired groups is Cochran’s Q test.
  • Step 2: Under the null hypothesis, the proportion of successes (score 1) is the same for all three supermarkets. Under the alternative hypothesis, the proportion of satisfied clients differs in at least two supermarkets, so:

H0: p1 = p2 = p3

H1: ∃(i,j) pi ≠ pj, i ≠ j

  • Step 3: The significance level to be considered is 10%.
  • Step 4: The calculation of Cochran’s Q statistic, from Expression (10.14), is given by:

Q_{cal} = \frac{(k-1) \left[ k \sum_{j=1}^{k} G_j^2 - \left( \sum_{j=1}^{k} G_j \right)^2 \right]}{k \sum_{i=1}^{N} L_i - \sum_{i=1}^{N} L_i^2} = \frac{(3-1) \left[ 3\,(11^2 + 11^2 + 16^2) - 38^2 \right]}{3 \cdot 38 - 90} = 4.167

  • Step 5: The critical region (CR) of the χ2 distribution (Table D in the Appendix), considering α = 10% and ν = k − 1 = 2 degrees of freedom, is shown in Fig. 10.45.
    Fig. 10.45 Critical region of Example 10.11.
  • Step 6: Decision: since the calculated value is not in the critical region, that is, Q_cal = 4.167 < 4.605, the null hypothesis is not rejected, which allows us to conclude, with a 90% level of confidence, that the proportion of satisfied clients is the same for all three supermarkets.

If we use P-value instead of the statistic’s critical value, Steps 5 and 6 will be:

  • Step 5: According to Table D, in the Appendix, for ν = 2 degrees of freedom, the probability associated with the statistic Q_cal = 4.167 is greater than 0.10 (P-value > 0.10).
  • Step 6: Decision: since P > 0.10, we should not reject H0.
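Expression (10.14) is also easy to verify numerically. The sketch below (in Python with NumPy and SciPy, an addition to the original solution) recomputes Q and its P-value from the binary matrix of Table 10.E.16 and should return Q ≈ 4.167 and P ≈ 0.125:

import numpy as np
from scipy.stats import chi2

# Binary evaluations from Table 10.E.16 (rows: consumers; columns: A, B, C)
X = np.array([
    [1,1,1],[1,0,1],[0,1,1],[0,0,0],[1,1,0],[1,1,1],[0,0,1],[1,0,1],
    [1,1,1],[0,0,1],[0,0,0],[1,1,0],[1,0,1],[1,1,1],[0,1,1],[0,1,1],
    [1,1,1],[1,1,1],[0,0,1],[0,0,1],
])

k = X.shape[1]
G = X.sum(axis=0)                     # successes per column: 11, 11, 16
L = X.sum(axis=1)                     # successes per row

# Expression (10.14)
Q = (k - 1) * (k * (G**2).sum() - G.sum()**2) / (k * L.sum() - (L**2).sum())
p = chi2.sf(Q, k - 1)                 # Q ~ chi2 with k - 1 degrees of freedom
print(round(Q, 3), round(p, 3))       # ~4.167, ~0.125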

10.5.1.1 Solving Cochran’s Q Test by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data in Example 10.11 are available in the file Cochran_Q_Test.sav. To run Cochran’s Q test on SPSS, first, let’s click on Analyze → Nonparametric Tests → Legacy Dialogs → K Related Samples …, as shown in Fig. 10.46.

Fig. 10.46 Procedure for elaborating Cochran’s Q test on SPSS.

After that, we must insert variables A, B, and C in the box Test Variables, and select option Cochran’s Q in Test Type, as shown in Fig. 10.47.

Fig. 10.47 Selecting the variables and Cochran’s Q test.

Finally, let’s click on OK to obtain the results of the test. Fig. 10.48 shows the frequencies of each group and Fig. 10.49 shows the result of the statistic.

Fig. 10.48 Frequencies.
Fig. 10.49 Cochran’s Q test for Example 10.11 on SPSS.

The value of Cochran’s Q statistic is 4.167, the same value calculated in Example 10.11. The probability associated with the statistic is 0.125 (we saw in Example 10.11 that P > 0.10). Since P > α, the null hypothesis is not rejected, which allows us to conclude, with a 90% level of confidence, that there are no differences in the proportion of satisfied clients among the three supermarkets.

10.5.1.2 Solving Cochran’s Q Test by Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

The data from Example 10.11 are also available in the file Cochran_Q_Test.dta. The command used to run the test is cochran followed by the k paired variables, in our case, the variables a, b, and c, which represent the three supermarkets. So, the command to be typed is:

cochran a b c

The results of Cochran’s Q test on Stata are in Fig. 10.50. We can verify that the statistic and its associated probability are the same as the results calculated in Example 10.11, and also generated on SPSS, which allows us to conclude, with a 90% level of confidence, that the proportion of satisfied clients is the same for all three supermarkets.

Fig. 10.50 Results of Cochran’s Q test for Example 10.11 on Stata.

10.5.2 Friedman’s Test

Friedman’s test is applied to quantitative or qualitative variables in an ordinal scale, and its main objective is to verify whether k paired samples are drawn from the same population. It is an extension of the Wilcoxon test for three or more paired samples. It is also an alternative to the analysis of variance when its hypotheses (normality of data and homogeneity of variances) are violated or when the sample size is too small.

The data are represented in a double-entry table, with N rows and k columns, in which the rows represent the several individuals or corresponding sets of individuals, and the columns represent the different conditions.

Therefore, the null hypothesis of Friedman’s test assumes that the k samples (columns) come from the same population or from populations with the same median (μ). For a bilateral test, we have:

  • H0: μ1 = μ2 = … = μk
  • H1: ∃(i,j) μi ≠ μj, i ≠ j

To apply Friedman’s statistic, we must attribute ranks from 1 to k to the elements of each row. For example, rank 1 is attributed to the lowest observation in the row and rank k to the highest observation. If there are ties, we attribute the mean of the corresponding ranks. Friedman’s statistic is given by:

F_{cal} = \frac{12}{N k (k+1)} \sum_{j=1}^{k} R_j^2 - 3 N (k+1)    (10.15)

where:

  • N: the number of rows;
  • k: the number of columns;
  • Rj: sum of the ranks in column j.

However, according to Siegel and Castellan (2006), whenever there are ties between the ranks within the same group or row, Friedman’s statistic must be corrected to account for the changes in the sampling distribution, as follows:

F_{cal} = \frac{12 \sum_{j=1}^{k} R_j^2 - 3 N^2 k (k+1)^2}{N k (k+1) + \frac{N k - \sum_{i=1}^{N} \sum_{j=1}^{g_i} t_{ij}^3}{k-1}}    (10.16)

where:

  • gi: the number of sets with tied observations in the ith group, including the sets of size 1;
  • tij: size of the jth set of ties in the ith group.

The calculated value must be compared to the critical value of the sampling distribution. When N and k are small (k = 3 and 3 < N < 13, or k = 4 and 2 < N < 8, or k = 5 and 3 < N < 5), we must use Table K in the Appendix, which shows the critical values F_c of Friedman’s statistic, where P(F_cal > F_c) = α (for a right-tailed unilateral test). For high values of N and k, the sampling distribution can be approximated by the χ2 distribution with ν = k − 1 degrees of freedom.

Therefore, if the value of the F_cal statistic is in the critical region, that is, if F_cal > F_c for small N and k, or F_cal > χ²_c for high N and k, we must reject the null hypothesis. Otherwise, we do not reject H0.

Example 10.12

Applying Friedman’s Test

A study was carried out to verify the efficacy of breakfast in weight loss; in order to do that, 15 patients were followed up for 3 months. Data regarding the patients’ weight were collected during three different periods, as shown in Table 10.E.17: before the treatment (BT), after the treatment (AT), and after 3 months of treatment (A3M). Verify whether the treatment produced results, assuming that α = 5%.

Table 10.E.17

Patients’ Weight in Each Period
Patient    BT    AT    A3M
1          65    62    58
2          89    85    80
3          96    95    95
4          90    84    79
5          70    70    66
6          72    65    62
7          87    84    77
8          74    74    69
9          66    64    62
10        135   132   132
11         82    75    71
12         76    73    67
13         94    90    88
14         80    80    77
15         73    70    68

Solution

  • Step 1: Since the data do not follow a normal distribution, Friedman’s test is an alternative to ANOVA to verify if the three paired samples are drawn from the same population.
  • Step 2: Under the null hypothesis, there is no difference among the periods. Under the alternative hypothesis, the treatment produced some effect, so:

H0: μ1 = μ2 = μ3

H1: ∃(i,j) μi ≠ μj, i ≠ j

  • Step 3: The significance level to be considered is 5%.
  • Step 4: In order to calculate Friedman’s statistic, first, we must attribute ranks from 1 to 3 to each element in each row, as shown in Table 10.E.18. If there are ties, we attribute the mean of the corresponding ranks.

Table 10.E.18

Attributing Ranks
Patient              BT     AT     A3M
1                    3      2      1
2                    3      2      1
3                    3      1.5    1.5
4                    3      2      1
5                    2.5    2.5    1
6                    3      2      1
7                    3      2      1
8                    2.5    2.5    1
9                    3      2      1
10                   3      1.5    1.5
11                   3      2      1
12                   3      2      1
13                   3      2      1
14                   2.5    2.5    1
15                   3      2      1
Rj                  43.5   30.5   16
Mean of the ranks    2.900  2.033  1.067

As shown in Table 10.E.18, there is one set of two tied observations for each of the patients 3, 5, 8, 10, and 14. Therefore, the total number of tie sets of size 2 is 5, and the total number of sets of size 1 (untied observations) is 35. Thus:

\sum_{i=1}^{N} \sum_{j=1}^{g_i} t_{ij}^3 = 35 \times 1^3 + 5 \times 2^3 = 75

Since there are ties, the real value of Friedman’s statistic is calculated from Expression (10.16), as follows:

F_{cal} = \frac{12 \sum_{j=1}^{k} R_j^2 - 3 N^2 k (k+1)^2}{N k (k+1) + \frac{N k - \sum_{i=1}^{N} \sum_{j=1}^{g_i} t_{ij}^3}{k-1}} = \frac{12\,(43.5^2 + 30.5^2 + 16^2) - 3 \cdot 15^2 \cdot 3 \cdot 4^2}{15 \cdot 3 \cdot 4 + \frac{15 \cdot 3 - 75}{3-1}} = \frac{4542}{165} = 27.527

If we applied Expression (10.15) without the correction factor, the result of Friedman’s test would be 25.233.

  • Step 5: Since k = 3 and N = 15, let’s use the χ2 distribution. The critical region (CR) of the χ2 distribution (Table D in the Appendix), considering α = 5% and ν = k − 1 = 2 degrees of freedom, is shown in Fig. 10.51.
    Fig. 10.51 Critical region of Example 10.12.
  • Step 6: Decision: since the calculated value is in the critical region, that is, F_cal = 27.527 > 5.991, we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there are differences among the three periods, that is, the treatment produced results.

If we use P-value instead of the critical value of the statistic, Steps 5 and 6 will be:

  • Step 5: According to Table D in the Appendix, for ν = 2 degrees of freedom, the probability associated with the statistic F_cal = 27.527 is less than 0.005 (P-value < 0.005).
  • Step 6: Decision: since P < 0.005 < 0.05, we must reject H0.
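As a computational cross-check (an addition to the original solution), scipy.stats.friedmanchisquare runs this test in Python; SciPy's implementation applies a tie correction, so it should reproduce the corrected value F_cal = 27.527 on the data of Table 10.E.17 (without the correction, the value would be 25.233, as noted above):

from scipy.stats import friedmanchisquare

# Patients' weights in each period (Table 10.E.17)
bt  = [65, 89, 96, 90, 70, 72, 87, 74, 66, 135, 82, 76, 94, 80, 73]
at  = [62, 85, 95, 84, 70, 65, 84, 74, 64, 132, 75, 73, 90, 80, 70]
a3m = [58, 80, 95, 79, 66, 62, 77, 69, 62, 132, 71, 67, 88, 77, 68]

stat, p = friedmanchisquare(bt, at, a3m)
print(round(stat, 3))   # ~27.527, tie-corrected as in Expression (10.16)
print(p)                # far below 0.005, so H0 is rejected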

10.5.2.1 Solving Friedman’s Test by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.

The data from Example 10.12 are available in the file Friedman_Test.sav. To run Friedman’s test on SPSS, let’s first click on Analyze → Nonparametric Tests → Legacy Dialogs → K Related Samples …, as shown in Fig. 10.52.

Fig. 10.52 Procedure for elaborating Friedman’s test on SPSS.

After that, we must insert variables BT, AT, and A3M in the box Test Variables and select the option Friedman in Test Type, as shown in Fig. 10.53.

Fig. 10.53 Selecting the variables and Friedman’s test.

Finally, let’s click on OK to obtain the results of Friedman’s test. Fig. 10.54 shows the means of the ranks, matching the values calculated in Table 10.E.18.

Fig. 10.54 Mean of the ranks.

The value of Friedman’s statistic and the significance level of the test are in Fig. 10.55.

Fig. 10.55 Results of Friedman’s test for Example 10.12 on SPSS.

The value of the statistic is 27.527, the same as the one calculated in Example 10.12. The probability associated with the statistic is 0.000 (we saw in Example 10.12 that this probability is less than 0.005). Since P < 0.05, we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the treatment produced results.

10.5.2.2 Solving Friedman’s Test by Using Stata Software

The use of the images presented in this section has been authorized by Stata Corp LP©.

The data in Example 10.12 are available in the file Friedman_Test.dta. The variables being studied are bt, at, and a3m. Friedman’s test is run on Stata through the command friedman. Since this command must first be installed, we must type:

findit friedman

and install it on the link snp2 from http://www.stata.com/stb/stb3.

Running Friedman’s test on Stata requires the data to be transposed. Before transposing them, we can preserve the original dataset by typing preserve.

After that, let’s type the command xpose that transposes all the variables into observations and all the observations into variables:

xpose, clear

After the command xpose, we can see that the data were transformed into n variables (number of initial observations). Let’s now type the following command:

friedman v1-v15

since the current dataset contains 15 variables after the transposition. Through Fig. 10.56, we can verify that Friedman’s statistic on Stata (25.233) is calculated from Expression (10.15), without the correction factor for ties. The probability associated with the statistic is 0.000 (the null hypothesis is rejected), which allows us to conclude, with a 95% confidence level, that there are differences between the treatments. To restore the original dataset, we must type restore.

Fig. 10.56 Results of Friedman’s test for Example 10.12 on Stata.