In Example 7.4, we compared two methods of teaching reading to “slow learners” by means of a 95% confidence interval. Suppose it is possible to measure the “reading IQs” of the “slow learners” before they are subjected to a teaching method. Eight pairs of “slow learners” with similar reading IQs are found, and one member of each pair is randomly assigned to the standard teaching method while the other is assigned to the new method. The data are given in Table 7.3. Do the data support the hypothesis that the population mean reading test score for “slow learners” taught by the new method is greater than the mean reading test score for those taught by the standard method?
Pair | New Method (1) | Standard Method (2) |
---|---|---|
1 | 77 | 72 |
2 | 74 | 68 |
3 | 82 | 76 |
4 | 73 | 68 |
5 | 87 | 84 |
6 | 69 | 68 |
7 | 66 | 61 |
8 | 80 | 76 |
Data Set: PAIREDSCORES
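We want to test H0: (μ1−μ2)=0 against Ha: (μ1−μ2)>0, where μ1 and μ2 are the population mean reading test scores for “slow learners” taught by the new method and the standard method, respectively.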
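Many researchers mistakenly use the t statistic for two independent samples (Section 7.2) to conduct this test. This invalid analysis is shown on the MINITAB printout of Figure 7.10. The test statistic, t=1.26, is based on n1+n2−2=14 degrees of freedom and falls short of the one-tailed critical value t.05=1.761. From this analysis, we might conclude that there is insufficient evidence (at α=.05) to infer that the new method yields a higher mean test score.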
If you examine the data in Table 7.3 carefully, however, you will find this result difficult to accept. The test score of the new method is larger than the corresponding test score for the standard method for every one of the eight pairs of “slow learners.” This, in itself, seems to provide strong evidence to indicate that μ1 exceeds μ2.
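The t-test is inappropriate because the assumption of independent samples is invalid. We have randomly chosen pairs of test scores; thus, once we have chosen the sample for the new method, we have not independently chosen the sample for the standard method. The dependence between observations within pairs can be seen by examining the pairs of test scores, which tend to rise and fall together as we go from pair to pair. This pattern provides strong visual evidence of a violation of the assumption of independence required for the two-sample t-test of Section 7.2. Note also that the pooled variance computed from Table 7.3 is

$$s_p^2 = \frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} = \frac{(8-1)(48.000)+(8-1)(49.125)}{8+8-2} \approx 48.56$$

Hence, there is a large variation within samples (reflected by the large value of s_p²) in comparison with the relatively small difference between the sample means, x̄1−x̄2=76.000−71.625=4.375. Because s_p² is so large, the t-test of Section 7.2 is unable to detect a difference between μ1 and μ2.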
Pair | New Method | Standard Method | Difference (New Method − Standard Method) |
---|---|---|---|
1 | 77 | 72 | 5 |
2 | 74 | 68 | 6 |
3 | 82 | 76 | 6 |
4 | 73 | 68 | 5 |
5 | 87 | 84 | 3 |
6 | 69 | 68 | 1 |
7 | 66 | 61 | 5 |
8 | 80 | 76 | 4 |
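We now consider a valid method of analyzing the data of Table 7.3. In Table 7.4, we add the column of differences between the test scores of the pairs of “slow learners.” We can regard these differences in test scores as a random sample of differences for all pairs (matched on reading IQ) of “slow learners,” past and present. Then we can use this sample to make inferences about the mean of the population of differences, μd, which is equal to the difference (μ1−μ2). The test therefore becomes H0: μd=0 against Ha: μd>0.
The test statistic is a one-sample t (Section 8.4), since we are now analyzing a single sample of differences for small n. Thus,

$$t = \frac{\bar{x}_d - 0}{s_d/\sqrt{n_d}}$$

where x̄d is the sample mean difference, sd is the sample standard deviation of the differences, and nd is the number of differences (the number of pairs).
Assumptions: The population of differences in test scores is approximately normally distributed. The sample differences are randomly selected from the population differences. [Note: We do not need to make the assumption that σ1²=σ2².]
Rejection region: At significance level α=.05, we will reject H0 if t>t.05, where t.05 is based on (nd−1) degrees of freedom.
Referring to Table II in Appendix B, we find the t-value corresponding to α=.05 and nd−1=8−1=7 df to be t.05=1.895.
Summary statistics for the nd=8 differences, computed from Table 7.4, are x̄d=4.375 and sd=1.685. Substituting these values into the formula for the test statistic, we have

$$t = \frac{4.375}{1.685/\sqrt{8}} = 7.34$$

Because this value of t falls into the rejection region, we conclude (at α=.05) that the population mean test score for “slow learners” taught by the new method exceeds the population mean score for those taught by the standard method.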
Now Work Exercises 7.35a and b
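The contrast between the invalid independent-samples analysis and the paired difference analysis can be checked numerically. The following is a minimal sketch using only the Python standard library and the scores of Table 7.3; the function names `pooled_t` and `paired_t` are illustrative choices, not part of the text.

```python
import math
import statistics

# Reading test scores from Table 7.3 (pairs matched on reading IQ).
new_method = [77, 74, 82, 73, 87, 69, 66, 80]
standard_method = [72, 68, 76, 68, 84, 68, 61, 76]


def pooled_t(sample1, sample2):
    """Two-sample t statistic with pooled variance.

    Shown only for comparison: it assumes independent samples,
    which does not hold for matched pairs.
    """
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)  # sample variance (n - 1 divisor)
    s2_sq = statistics.variance(sample2)
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    std_err = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (statistics.mean(sample1) - statistics.mean(sample2)) / std_err


def paired_t(sample1, sample2, d0=0.0):
    """One-sample t statistic computed on the within-pair differences."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of differences
    return (mean_d - d0) / (sd_d / math.sqrt(len(diffs)))


t_independent = pooled_t(new_method, standard_method)
t_paired = paired_t(new_method, standard_method)
print(f"independent-samples t = {t_independent:.2f} (df = 14)")
print(f"paired difference t   = {t_paired:.2f} (df = 7)")
```

Both statistics have the same numerator, x̄1−x̄2=x̄d=4.375. Only the standard error changes, which is why the paired statistic is so much larger for these data.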
This kind of experiment, in which observations are paired and the differences are analyzed, is called a paired difference experiment. In many cases, a paired difference experiment can provide more information about the difference between population means than an independent samples experiment can. The idea is to compare population means by comparing the differences between pairs of experimental units (objects, people, etc.) that were similar prior to the experiment. The differencing removes sources of variation that tend to inflate σ².
Some other examples for which the paired difference experiment might be appropriate are the following:
Suppose you want to estimate the difference (μ1−μ2)
Suppose a college placement center wants to estimate the difference (μ1−μ2)
Suppose you wish to estimate the difference (μ1−μ2)
Now Work Exercise 7.35
The hypothesis-testing procedures and the method of forming confidence intervals for the difference between two means in a paired difference experiment are summarized in the following boxes for both large and small n:
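Large Sample, Normal (z) Statistic

$$\bar{x}_d \pm z_{\alpha/2}\frac{\sigma_d}{\sqrt{n_d}} \approx \bar{x}_d \pm z_{\alpha/2}\frac{s_d}{\sqrt{n_d}}$$

Small Sample, Student’s t-Statistic

$$\bar{x}_d \pm t_{\alpha/2}\frac{s_d}{\sqrt{n_d}}$$

where tα/2 is based on (nd−1) degrees of freedom.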
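 | One-Tailed Test | One-Tailed Test | Two-Tailed Test |
---|---|---|---|
H0 | μd=D0 | μd=D0 | μd=D0 |
Ha | μd<D0 | μd>D0 | μd≠D0 |

Large Sample, Normal (z) Test Statistic: zc=(x̄d−D0)/(σd/√nd)≈(x̄d−D0)/(sd/√nd)

 | One-Tailed Test | One-Tailed Test | Two-Tailed Test |
---|---|---|---|
Rejection region | zc<−zα | zc>zα | zc<−zα/2 or zc>zα/2 |
p-value | P(z<zc) | P(z>zc) | 2P(z>zc) if zc is positive; 2P(z<zc) if zc is negative |

Small Sample, Student’s t-Test Statistic: tc=(x̄d−D0)/(sd/√nd)

 | One-Tailed Test | One-Tailed Test | Two-Tailed Test |
---|---|---|---|
Rejection region | tc<−tα | tc>tα | tc<−tα/2 or tc>tα/2 |
p-value | P(t<tc) | P(t>tc) | 2P(t>tc) if tc is positive; 2P(t<tc) if tc is negative |

Decision: Reject H0 if α>p-value or, equivalently, if the test statistic falls in the rejection region.
[Note: The symbol for the numerical value assigned to the difference μd under the null hypothesis is D0. In many practical applications, D0=0.]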
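Conditions required for valid large-sample inferences about μd:
A random sample of differences is selected from the target population of differences.
The sample size nd is large (nd≥30), so that, by the Central Limit Theorem, the sampling distribution of x̄d is approximately normal regardless of the shape of the population of differences.
Conditions required for valid small-sample inferences about μd:
A random sample of differences is selected from the target population of differences.
The population of differences has a distribution that is approximately normal.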
An experiment is conducted to compare the starting salaries of male and female college graduates who find jobs. Pairs are formed by choosing a male and a female with the same major and similar grade point averages (GPAs). Suppose a random sample of 10 pairs is formed in this manner and the starting annual salary of each person is recorded. The results are shown in Table 7.5. Compare the mean starting salary μ1 for males with the mean starting salary μ2 for females, using a 95% confidence interval. Interpret the results.
Pair | Male | Female | Difference Male−Female |
---|---|---|---|
1 | $29,300 | $28,800 | $ 500 |
2 | 41,500 | 41,600 | −100 |
3 | 40,400 | 39,800 | 600 |
4 | 38,500 | 38,500 | 0 |
5 | 43,500 | 42,600 | 900 |
6 | 37,800 | 38,000 | −200 |
7 | 69,500 | 69,200 | 300 |
8 | 41,200 | 40,100 | 1,100 |
9 | 38,400 | 38,200 | 200 |
10 | 59,200 | 58,500 | 700 |
Data Set: GRADS
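Since the data on annual salary are collected in pairs of males and females matched on GPA and major, a paired difference experiment is performed. To conduct the analysis, we first compute the differences between the salaries, as shown in Table 7.5. Summary statistics for these n=10 differences are x̄d=400 and sd=434.613.
The 95% confidence interval for μd=(μ1−μ2) for this small sample is

$$\bar{x}_d \pm t_{\alpha/2}\frac{s_d}{\sqrt{n_d}}$$

where tα/2=t.025=2.262 (from Table II, Appendix B) is based on nd−1=9 degrees of freedom. Substituting the values of x̄d and sd, we obtain

$$400 \pm 2.262\left(\frac{434.613}{\sqrt{10}}\right) = 400 \pm 310.88 \approx 400 \pm 311 = (\$89,\ \$711)$$

[Note: This interval is also shown highlighted at the bottom of the SAS printout of Figure 7.13.] Our interpretation is that the true mean difference between the starting salaries of males and females falls between $89 and $711, with 95% confidence. Since the interval falls above 0, we infer that μ1−μ2>0; that is, the mean starting salary for males exceeds the mean starting salary for females.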
Remember that μd=μ1−μ2.
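The interval for the salary data can be reproduced with a short calculation. The sketch below uses only the Python standard library and the data of Table 7.5. The critical value t.025=2.262 is taken from the text rather than computed, and the function name `paired_confidence_interval` is an illustrative choice.

```python
import math
import statistics

# Starting salaries from Table 7.5 (pairs matched on major and GPA).
male = [29300, 41500, 40400, 38500, 43500, 37800, 69500, 41200, 38400, 59200]
female = [28800, 41600, 39800, 38500, 42600, 38000, 69200, 40100, 38200, 58500]

# Critical value quoted in the text: t_.025 with n_d - 1 = 9 df.
T_025_DF9 = 2.262


def paired_confidence_interval(sample1, sample2, t_crit):
    """Confidence interval for mu_d based on the within-pair differences."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    margin = t_crit * sd_d / math.sqrt(len(diffs))
    return mean_d - margin, mean_d + margin


lower, upper = paired_confidence_interval(male, female, T_025_DF9)
print(f"95% CI for mu_d: (${lower:,.0f}, ${upper:,.0f})")
```

Because the critical value is passed in as an argument, the same function gives a 90% or 99% interval once the corresponding table value is supplied.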
Now Work Exercise 7.41
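To measure the amount of information about (μ1−μ2) gained by using a paired difference experiment rather than an independent samples experiment, we can compare the widths of the confidence intervals obtained by the two methods. The 95% paired difference interval for the salary data is ($89, $711). If we analyzed the same data as though they came from an independent samples experiment, the 95% confidence interval would be

$$(\bar{x}_1-\bar{x}_2) \pm t_{.025}\sqrt{s_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}$$

where

$$s_p^2 = \frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}$$

and t.025 is based on n1+n2−2=18 degrees of freedom.
SPSS performed these calculations and obtained the interval (−$10,537.50, $11,337.50).
Notice that the independent samples interval includes 0. Consequently, if we were to use this interval to make an inference about (μ1−μ2), we would be unable to detect a difference between the mean starting salaries of males and females. The independent samples interval is also about 35 times wider than the paired difference interval.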
You may wonder whether a paired difference experiment is always superior to an independent samples experiment. The answer is, most of the time, but not always. We sacrifice half the degrees of freedom in the t-statistic when a paired difference design is used instead of an independent samples design. This is a loss of information, and unless that loss is more than compensated for by the reduction in variability obtained by blocking (pairing), the paired difference experiment will result in a net loss of information about (μ1−μ2).
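For the salary data of Table 7.5, the trade-off between lost degrees of freedom and reduced variability can be made concrete by computing both intervals side by side. This is a minimal sketch using only the Python standard library. The critical values 2.262 (9 df) and 2.101 (18 df) are standard t-table entries entered by hand, and the variable names are illustrative.

```python
import math
import statistics

# Starting salaries from Table 7.5.
male = [29300, 41500, 40400, 38500, 43500, 37800, 69500, 41200, 38400, 59200]
female = [28800, 41600, 39800, 38500, 42600, 38000, 69200, 40100, 38200, 58500]

# t_.025 from a t table: 9 df for the paired design, 18 df for independent samples.
T_025_DF9 = 2.262
T_025_DF18 = 2.101

# Paired design: one sample of n_d = 10 differences.
diffs = [m - f for m, f in zip(male, female)]
paired_half_width = T_025_DF9 * statistics.stdev(diffs) / math.sqrt(len(diffs))

# Independent samples design: pooled variance of the two salary samples.
n1, n2 = len(male), len(female)
sp_sq = ((n1 - 1) * statistics.variance(male)
         + (n2 - 1) * statistics.variance(female)) / (n1 + n2 - 2)
indep_half_width = T_025_DF18 * math.sqrt(sp_sq * (1 / n1 + 1 / n2))

center = statistics.mean(male) - statistics.mean(female)  # equals mean of diffs
indep_lower, indep_upper = center - indep_half_width, center + indep_half_width
width_ratio = indep_half_width / paired_half_width

print(f"paired half-width:      {paired_half_width:,.2f}")
print(f"independent half-width: {indep_half_width:,.2f}")
print(f"ratio of widths:        {width_ratio:.1f}")
```

Here the paired design has the larger critical value (2.262 versus 2.101) because it has fewer degrees of freedom, yet its interval is far narrower. Salaries vary widely from pair to pair but very little within pairs, so the pairing removes most of the variability.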
In a two-group analysis, intentionally pairing observations after the data have been collected in order to produce a desired result is considered unethical statistical practice.
One final note: The pairing of the observations is determined before the experiment is performed (i.e., by the design of the experiment). A paired difference experiment is never obtained by pairing the sample observations after the measurements have been acquired.
Answer: Use the nonparametric Wilcoxon signed rank test for the paired difference design (see optional Section 7.6).
7.29 In a paired difference experiment, when should the observations be paired, before or after the data are collected?
7.30 What are the advantages of using a paired difference experiment over an independent samples design?
7.31 True or False. In a paired difference experiment, x̄d=x̄1−x̄2
7.32 What conditions are required for valid large-sample inferences about μd?
7.33 A paired difference experiment yielded nd
nd=10,α=.05
nd=20,α=.10
nd=5,α=.025
nd=9,α=.01
7.34 A paired difference experiment produced the following data:
Determine the values of t for which the null hypothesis μ1−μ2=0
Conduct the paired difference test described in part a. Draw the appropriate conclusions.
What assumptions are necessary so that the paired difference test will be valid?
Find a 90% confidence interval for the mean difference μd.
Which of the two inferential procedures, the confidence interval of part d or the test of hypothesis of part b, provides more information about the difference between the population means?
L07035 7.35 The data for a random sample of six paired observations are shown in the following table.
Pair | Sample from Population 1 | Sample from Population 2 |
---|---|---|
1 | 7 | 4 |
2 | 3 | 1 |
3 | 9 | 7 |
4 | 6 | 2 |
5 | 4 | 4 |
6 | 8 | 7 |
Calculate the difference between each pair of observations by subtracting observation 2 from observation 1. Use the differences to calculate ‾xd
If μ1
Form a 95% confidence interval for μd.
Test the null hypothesis H0:μd=0
L07036 7.36 The data for a random sample of 10 paired observations are shown in the following table.
Pair | Population 1 | Population 2 |
---|---|---|
1 | 19 | 24 |
2 | 25 | 27 |
3 | 31 | 36 |
4 | 52 | 53 |
5 | 49 | 55 |
6 | 34 | 34 |
7 | 59 | 66 |
8 | 47 | 51 |
9 | 17 | 20 |
10 | 51 | 55 |
If you wish to test whether these data are sufficient to indicate that the mean for population 2 is larger than that for population 1, what are the appropriate null and alternative hypotheses? Define any symbols you use.
Conduct the test from part a, using α=.10.
Find a 90% confidence interval for μd.
What assumptions are necessary to ensure the validity of the preceding analysis?
7.37 A paired difference experiment yielded the following results:
Test H0:μd=10
Report the p-value for the test you conducted in part a. Interpret the p-value.
7.38 Summer weight-loss camp. Camp Jump Start is an 8-week summer camp for overweight and obese adolescents. Counselors develop a weight-management program for each camper that centers on nutrition education and physical activity. In a study published in Pediatrics (Apr. 2010), the body mass index (BMI) was measured for each of 76 campers both at the start and end of camp. Summary statistics on BMI measurements are shown in the table.
Measurement | Mean | Standard Deviation |
---|---|---|
Starting BMI | 34.9 | 6.9 |
Ending BMI | 31.6 | 6.2 |
Paired Differences | 3.3 | 1.5 |
Based on Huelsing, J., Kanafani, N., Mao, J., and White, N. H. “Camp Jump Start: Effects of a residential summer weight-loss camp for older children and adolescents.” Pediatrics, Vol. 125, No. 4, Apr. 2010 (Table 3).
Give the null and alternative hypothesis for determining whether the mean BMI at the end of camp is less than the mean BMI at the start of camp.
How should the data be analyzed, as an independent-samples t-test or as a paired-difference t-test? Explain.
Calculate the test statistic using the formula for an independent-samples t-test. (Note: This is not how the test should be conducted.)
Calculate the test statistic using the formula for a paired-difference t-test.
The p-value of the test, part d, was reported as p<.0001. Interpret this result assuming α=.01.
Do the differences in BMI values need to be normally distributed in order for the inference, part f, to be valid? Explain.
Find a 99% confidence interval for the true mean change in BMI for Camp Jump Start campers. Interpret the result.
7.39 Packaging of a children’s health food. Refer to the Journal of Consumer Behaviour (Vol. 10, 2011) study of packaging of a children’s health food product, Exercise 8.42 (p. 391). Recall that a fictitious brand of a healthy food product (sliced apples) was packaged to appeal to children (a smiling cartoon apple on the front of the package). The researchers compared the appeal of this fictitious brand to a commercially available brand of sliced apples that was not packaged for children. Each of 408 schoolchildren rated both brands on a 5-point “willingness to eat” scale, with 1=“not willing at all” and 5=“very willing.” The fictitious brand had a sample mean score of 3.69, while the commercially available brand had a sample mean score of 3.00. The researchers wanted to compare the population mean score for the fictitious brand, μF, to the population mean score for the commercially available brand, μC. They theorized that μF will be greater than μC.
Specify the null and alternative hypothesis for the test.
Explain how the researchers should analyze the data and why.
The researchers reported a test statistic value of 5.71. Interpret this result. Use α=.05 to draw your conclusion.
Find the approximate p-value of the test.
Could the researchers have tested at α=.01 and arrived at the same conclusion?
7.40 DRILL2 Twinned drill holes. A traditional method of verifying mineralization grades in mining is to drill twinned holes, i.e., the drilling of a new hole, or “twin,” next to an earlier drillhole. The use of twinned drill holes was investigated in Exploration and Mining Geology (Vol. 18, 2009). Geologists use data collected at both holes to estimate the total amount of heavy minerals (THM) present at the drilling site. The data in the next table (based on information provided in the journal article) represent THM percentages for a sample of 15 twinned holes drilled at a diamond mine in Africa. The geologists want to know if there is any evidence of a difference in the true THM means of all original holes and their twin holes drilled at the mine.
Explain why the data should be analyzed as paired differences.
Compute the difference between the “1st hole” and “2nd hole” measurements for each drilling location.
Find the mean and standard deviation of the differences, part b.
Use the summary statistics, part c, to find a 90% confidence interval for the true mean difference (“1st hole” minus “2nd hole”) in THM measurements.
Interpret the interval, part d. Can the geologists conclude that there is no evidence of a difference in the true THM means of all original holes and their twin holes drilled at the mine?
Location | 1st Hole | 2nd Hole |
---|---|---|
1 | 5.5 | 5.7 |
2 | 11.0 | 11.2 |
3 | 5.9 | 6.0 |
4 | 8.2 | 5.6 |
5 | 10.0 | 9.3 |
6 | 7.9 | 7.0 |
7 | 10.1 | 8.4 |
8 | 7.4 | 9.0 |
9 | 7.0 | 6.0 |
10 | 9.2 | 8.1 |
11 | 8.3 | 10.0 |
12 | 8.6 | 8.1 |
13 | 10.5 | 10.4 |
14 | 5.5 | 7.0 |
15 | 10.0 | 11.2 |
MUSEUM 7.41 Healing potential of handling museum objects. Does handling a museum object have a positive impact on a sick patient’s well-being? To answer this question, researchers at the University College London collected data from 32 sessions with hospital patients (Museum & Society, Nov. 2009). Each patient’s health status (measured on a 100-point scale) was recorded both before and after handling museum objects such as archaeological artifacts and brass etchings. The data (simulated) are listed in the accompanying table.
Session | Before | After |
---|---|---|
1 | 52 | 59 |
2 | 42 | 54 |
3 | 46 | 55 |
4 | 42 | 51 |
5 | 43 | 42 |
6 | 30 | 43 |
7 | 63 | 79 |
8 | 56 | 59 |
9 | 46 | 53 |
10 | 55 | 57 |
11 | 43 | 49 |
12 | 73 | 83 |
13 | 63 | 72 |
14 | 40 | 49 |
15 | 50 | 49 |
16 | 50 | 64 |
17 | 65 | 65 |
18 | 52 | 63 |
19 | 39 | 50 |
20 | 59 | 69 |
21 | 49 | 61 |
22 | 59 | 66 |
23 | 57 | 61 |
24 | 56 | 58 |
25 | 47 | 55 |
26 | 61 | 62 |
27 | 65 | 61 |
28 | 36 | 53 |
29 | 50 | 61 |
30 | 40 | 52 |
31 | 65 | 70 |
32 | 59 | 72 |
Explain why the data should be analyzed as paired differences.
Compute the difference between the “before” and “after” measurements for each session.
Find the mean and standard deviation of the differences, part b.
Use the summary statistics, part c, to find a 90% confidence interval for the true mean difference (“before” minus “after”) in health status scale measurements.
Interpret the interval, part d. Does handling a museum object have a positive impact on a sick patient’s well-being?
7.42 Laughter among deaf signers. The Journal of Deaf Studies and Deaf Education (Fall 2006) published an article on vocalized laughter among deaf users of American Sign Language (ASL). In videotaped ASL conversations among deaf participants, 28 laughed at least once. The researchers wanted to know if they laughed more as speakers (while signing) or as audience members (while listening). For each of the 28 deaf participants, the number of laugh episodes as a speaker and the number of laugh episodes as an audience member were determined. One goal of the research was to compare the mean numbers of laugh episodes of speakers and audience members.
Explain why the data should be analyzed as a paired difference experiment.
Identify the study’s target parameter.
The study yielded a sample mean of 3.4 laughter episodes for speakers and a sample mean of 1.3 laughter episodes for audience members. Is this sufficient evidence to conclude that the population means are different? Explain.
A paired difference t-test resulted in t=3.14 and p-value<.01. Interpret the results in the words of the problem.
7.43 The placebo effect and pain. According to research published in Science (Feb. 20, 2004), the mere belief that you are receiving an effective treatment for pain can reduce the pain you actually feel. Researchers tested this placebo effect on 24 volunteers as follows: Each volunteer was put inside a magnetic resonance imaging (MRI) machine for two consecutive sessions. During the first session, electric shocks were applied to their arms and the blood oxygen level–dependent (BOLD) signal (a measure related to neural activity in the brain) was recorded during pain. The second session was identical to the first, except that, prior to applying the electric shocks, the researchers smeared a cream on the volunteer’s arms. The volunteers were informed that the cream would block the pain when, in fact, it was just a regular skin lotion (i.e., a placebo). If the placebo is effective in reducing the pain experience, the BOLD measurements should be higher, on average, in the first MRI session than in the second.
Identify the target parameter for this study.
What type of design was used to collect the data?
Give the null and alternative hypotheses for testing the placebo effect theory.
The differences between the BOLD measurements in the first and second sessions were computed and summarized in the study as follows: nd=24, x̄d=.21, sd=.47. Use this information to calculate the test statistic.
The p-value of the test was reported as p-value=.02. Make the appropriate conclusion at α=.05.
7.44 SHALLOW Settlement of shallow foundations. Structures built on a shallow foundation (e.g., a concrete slab-on-grade foundation) are susceptible to settlement. Consequently, accurate settlement prediction is essential in the design of the foundation. Several methods for predicting settlement of shallow foundations on cohesive soil were compared in Environmental & Engineering Geoscience (Nov. 2012). Settlement data for a sample of 13 structures built on a shallow foundation were collected. The actual settlement values (measured in millimeters) for each structure were compared to settlement predictions made using a formula that accounts for dimension, rigidity, and embedment depth of the foundation. The data are listed in the table.
Structure | Actual | Predicted |
---|---|---|
1 | 11 | 11 |
2 | 11 | 11 |
3 | 10 | 12 |
4 | 8 | 6 |
5 | 11 | 9 |
6 | 9 | 10 |
7 | 9 | 9 |
8 | 39 | 51 |
9 | 23 | 24 |
10 | 269 | 252 |
11 | 4 | 3 |
12 | 82 | 68 |
13 | 250 | 264 |
Source: Ozur, M. “Comparing methods for predicting immediate settlement of shallow foundations on cohesive soils based on hypothetical and real cases.” Environmental & Engineering Geoscience, Vol. 18, No. 4, Nov. 2012 (from Table 4).
What type of design was employed to collect the data?
Use the information in the accompanying SAS printout to construct a 99% confidence interval for the mean difference between actual and predicted settlement value. Give a practical interpretation of the interval.
Explain the meaning of “99% confidence” for this application.
SOLAR 7.45 Solar energy generation along highways. The potential of using solar panels constructed above national highways to generate energy was explored in the International Journal of Energy and Environmental Engineering (Dec. 2013). Two-layer solar panels (with 1 meter separating the panels) were constructed above sections of both east-west and north-south highways in India. The amount of energy (kilowatt hours) supplied to the country’s grid by the solar panels above the two types of highways was determined each month. The data for several randomly selected months are provided in the table. The researchers concluded that the “two-layer solar panel energy generation is more viable for the north-south oriented highways as compared to east-west oriented roadways.” Do you agree?
Month | East-West | North-South |
---|---|---|
February | 8658 | 8921 |
April | 7930 | 8317 |
July | 5120 | 5274 |
September | 6862 | 7148 |
October | 8608 | 8936 |
Source: Sharma, P., and Harinarayana, T. “Solar energy generation potential along national highways.” International Journal of Energy and Environmental Engineering, Vol. 49, No. 1, Dec. 2013 (Table 3).
SKIN 7.46 Estimating well scale deposits. Scale deposits can cause a serious reduction in the flow performance of a well. A study published in the Journal of Petroleum and Gas Engineering (Apr. 2013) compared two methods of estimating the damage from scale deposits (called skin factor). One method of estimating the well skin factor uses a series of Excel spreadsheets, while the second method employs EPS computer software. Skin factor data was obtained from applying both methods to 10 randomly selected oil wells: 5 vertical wells and 5 horizontal wells. The results are supplied in the accompanying table.
Compare the mean skin factor values for the two estimation methods using all 10 sampled wells. Test at α=.05. What do you conclude?
Repeat part a, but analyze the data for the 5 horizontal wells only.
Repeat part a, but analyze the data for the 5 vertical wells only.
Well (Type) | Excel Spreadsheet | EPS Software |
---|---|---|
1 (Horizontal) | 44.48 | 37.77 |
2 (Horizontal) | 18.34 | 13.31 |
3 (Horizontal) | 19.21 | 7.02 |
4 (Horizontal) | 11.70 | 4.77 |
5 (Horizontal) | 9.25 | 1.96 |
6 (Vertical) | 317.40 | 281.74 |
7 (Vertical) | 181.44 | 192.16 |
8 (Vertical) | 154.65 | 140.84 |
9 (Vertical) | 77.43 | 56.86 |
10 (Vertical) | 49.37 | 45.01 |
Source: Rahuma, K. M., et al. “Comparison between spreadsheet and specialized programs in calculating the effect of scale deposition on the well flow performance.” Journal of Petroleum and Gas Engineering, Vol. 4, No. 4, Apr. 2013 (Table 2).
MAWASH 7.47 Acidity of mouthwash. Acid has been found to be a primary cause of dental caries (cavities). It is theorized that oral mouthwashes contribute to the development of caries due to the antiseptic agent oxidizing into acid over time. This theory was tested in the Journal of Dentistry, Oral Medicine and Dental Education (Vol. 3, 2009). Three bottles of mouthwash, each of a different brand, were randomly selected from a drugstore. The pH level (where lower pH levels indicate higher acidity) of each bottle was measured on the date of purchase and after 30 days. The data are shown in the next table. Conduct an analysis to determine if the mean initial pH level of mouthwash differs significantly from the mean pH level after 30 days. Use α=.05 as your level of significance.
Mouthwash Brand | Initial pH | Final pH |
---|---|---|
LMW | 4.56 | 4.27 |
SMW | 6.71 | 6.51 |
RMW | 5.65 | 5.58 |
Based on Chunhye, K. L., and Schmitz, B. C., “Determination of pH, total acid, and total ethanol in oral health products: Oxidation of ethanol and recommendations to mitigate its association with dental caries.” Journal of Dentistry, Oral Medicine and Dental Education, Vol. 3, No. 1, 2009 (Table 1).
7.48 Visual search and memory study. In searching for an item (e.g., a roadside traffic sign, a lost earring, or a tumor in a mammogram), common sense dictates that you will not reexamine items previously rejected. However, researchers at Harvard Medical School found that a visual search has no memory (Nature, Aug. 6, 1998). In their experiment, nine subjects searched for the letter “T” mixed among several letters “L.” Each subject conducted the search under two conditions: random and static. In the random condition, the locations of the letters were changed every 111 milliseconds; in the static condition, the locations of the letters remained unchanged. In each trial, the reaction time in milliseconds (i.e., the amount of time it took the subject to locate the target letter) was recorded.
One goal of the research was to compare the mean reaction times of subjects in the two experimental conditions. Explain why the data should be analyzed as a paired difference experiment.
If a visual search has no memory, then the mean reaction times in the two conditions will not differ. Specify H0 and Ha for testing the “no-memory” theory.
The test statistic was calculated as t=1.52 with p-value=.15. Draw the appropriate conclusion.
DEMENT 7.49 Linking dementia and leisure activities. Does participation in leisure activities in your youth reduce the risk of Alzheimer’s disease and other forms of dementia? To answer this question, a group of university researchers studied a sample of 107 same-sex Swedish pairs of twins (Journal of Gerontology: Psychological Sciences and Social Sciences, Sept. 2003). Each pair of twins was discordant for dementia; that is, one member of each pair was diagnosed with Alzheimer’s disease while the other member (the control) was nondemented for at least five years after the sibling’s onset of dementia. The level of overall leisure activity (measured on an 80-point scale, where higher values indicate higher levels of leisure activity) of each twin of each pair 20 years prior to the onset of dementia was obtained from the Swedish Twin Registry database. The leisure activity scores (simulated on the basis of summary information presented in the journal article) are saved in the DEMENT file. The first five and last five observations are shown in the following table.
Pair | Control | Demented |
---|---|---|
1 | 27 | 13 |
2 | 57 | 57 |
3 | 23 | 31 |
4 | 39 | 46 |
5 | 37 | 37 |
⋮ | ⋮ | ⋮ |
103 | 22 | 14 |
104 | 32 | 23 |
105 | 33 | 29 |
106 | 36 | 37 |
107 | 24 | 1 |
Explain why the data should be analyzed as a paired difference experiment.
Conduct the appropriate analysis, using α=.05. Make an inference about which member of the pair, the demented or control (nondemented) twin, had the largest average level of leisure activity.
7.50 Ethical sensitivity of teachers toward racial intolerance. Many high schools have education programs that encourage teachers to embrace racial tolerance. To gauge the effectiveness of one such program that utilizes two videos of teachers engaging in racial stereotypes of their students, researchers recruited 238 high school professionals (including teachers and counselors) to participate in a study (Journal of Moral Education, Mar. 2010). Teachers watched the first video, then were given a pretest—the Quick-REST Survey—designed to measure ethical sensitivity toward racial intolerance. The teachers next participated in an all-day workshop on cultural competence. At the end of the workshop, the teachers watched the second video and again were given the Quick-REST Survey (the posttest). To determine whether the program was effective, the researchers compared the mean scores on the Quick-REST Survey using a paired-difference t-test. (Note: The higher the score on the Quick-REST Survey, the greater the level of racial tolerance.)
The researchers reported the sample means for the pretest and posttest as 75.85 and 80.35, respectively. Why is it dangerous to gauge the effectiveness of the program based only on these summary statistics?
The paired-difference t-test (posttest minus pretest) was reported as t=4.50 with an associated observed significance level of p-value <.001. Interpret this result.
What assumptions, if any, are necessary for the validity of the inference, part b?
REDLIT 7.51 Impact of red light cameras on car crashes. To combat red-light-running crashes, many states are adopting photo-red enforcement programs. In these programs, red light cameras installed at dangerous intersections photograph the license plates of vehicles that run the red light. How effective are photo-red enforcement programs in reducing red-light-running crash incidents at intersections? The Virginia Department of Transportation (VDOT) conducted a comprehensive study of its newly adopted photo-red enforcement program and published the results in a June 2007 report. In one portion of the study, the VDOT provided crash data both before and after installation of red light cameras at several intersections. The data (measured as the number of crashes caused by red light running per intersection per year) for 13 intersections in Fairfax County, Virginia, are given in the table. Analyze the data for the VDOT. What do you conclude?
Intersection | Before Camera | After Camera |
---|---|---|
1 | 3.60 | 1.36 |
2 | 0.27 | 0 |
3 | 0.29 | 0 |
4 | 4.55 | 1.79 |
5 | 2.60 | 2.04 |
6 | 2.29 | 3.14 |
7 | 2.40 | 2.72 |
8 | 0.73 | 0.24 |
9 | 3.15 | 1.57 |
10 | 3.21 | 0.43 |
11 | 0.88 | 0.28 |
12 | 1.35 | 1.09 |
13 | 7.35 | 4.92 |
Based on Virginia Transportation Research Council, “Research report: The impact of red light cameras (photo-red enforcement) on crashes in Virginia.” June 2007.
WINE40 7.52 Alcoholic fermentation in wines. Determining alcoholic fermentation in wine is critical to the wine-making process. Must/wine density is a good indicator of the fermentation point, since the density value decreases as sugars are converted into alcohol. For decades, winemakers have measured must/wine density with a hydrometer. Although accurate, the hydrometer employs a manual process that is very time consuming. Consequently, large wineries are searching for more rapid methods of density measurement. An alternative method utilizes the hydrostatic balance instrument (similar to the hydrometer, but digital). A winery in Portugal collected must/wine density measurements on white wine samples randomly selected from the fermentation process for a recent harvest. For each sample, the density of the wine at 20°C was measured with both the hydrometer and the hydrostatic balance. The densities for 40 wine samples are saved in the WINE40 file. The first five and last five observations are shown in the accompanying table. The winery will use the alternative method of measuring wine density only if it can be demonstrated that the mean difference between the density measurements of the two methods does not exceed .002. Perform the analysis for the winery. Provide the winery with a written report of your conclusions.
Sample | Hydrometer | Hydrostatic |
---|---|---|
1 | 1.08655 | 1.09103 |
2 | 1.00270 | 1.00272 |
3 | 1.01393 | 1.01274 |
4 | 1.09467 | 1.09634 |
5 | 1.10263 | 1.10518 |
⋮ | ⋮ | ⋮ |
36 | 1.08084 | 1.08097 |
37 | 1.09452 | 1.09431 |
38 | 0.99479 | 0.99498 |
39 | 1.00968 | 1.01063 |
40 | 1.00684 | 1.00526 |
Based on Cooperative Cellar of Borba (Adega Cooperativa de Borba), Portugal.