6.2 Formulating Hypotheses and Setting Up the Rejection Region

In Section 6.1 we learned that the null and alternative hypotheses form the basis for inference using a test of hypothesis. The null and alternative hypotheses may take one of several forms. In the sewer pipe example, we tested the null hypothesis that the population mean strength of the pipe is less than or equal to 2,400 pounds per linear foot against the alternative hypothesis that the mean strength exceeds 2,400—that is, we tested

H0: μ ≤ 2,400 (Pipe does not meet specifications.)
Ha: μ > 2,400 (Pipe meets specifications.)

This is a one-tailed (or one-sided) statistical test because the alternative hypothesis specifies that the population parameter (the population mean μ in this example) is strictly greater than a specified value (2,400 in this example). If the null hypothesis had been H0: μ ≥ 2,400 and the alternative hypothesis had been Ha: μ < 2,400, the test would still be one-sided because the parameter is still specified to be on “one side” of the null hypothesis value. Some statistical investigations seek to show that the population parameter is either larger or smaller than some specified value. Such an alternative hypothesis is called a two-tailed (or two-sided) hypothesis.

While alternative hypotheses are always specified as strict inequalities, such as μ < 2,400, μ > 2,400, or μ ≠ 2,400, null hypotheses are usually specified as equalities, such as μ = 2,400. Even when the null hypothesis is an inequality, such as μ ≤ 2,400, we specify H0: μ = 2,400, reasoning that if sufficient evidence exists to show that Ha: μ > 2,400 is true when tested against H0: μ = 2,400, then surely sufficient evidence exists to reject μ < 2,400 as well. Therefore, the null hypothesis is specified as the value of μ closest to a one-sided alternative hypothesis and as the only value not specified in a two-tailed alternative hypothesis. The steps for selecting the null and alternative hypotheses are summarized in the following box.
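The reasoning behind testing only the boundary value can be sketched for the upper-tailed large-sample z-test. This is an illustrative derivation, assuming the sample mean is approximately normal with standard error σ_x̄; the symbols z_α (the upper-tail cutoff) and Φ (the standard normal cumulative distribution function) are notation introduced here, not taken from the text above.

```latex
\begin{aligned}
P(\text{reject } H_0 \mid \mu)
  &= P\!\left(\frac{\bar{x} - 2400}{\sigma_{\bar{x}}} > z_\alpha \,\middle|\, \mu\right)
   = 1 - \Phi\!\left(z_\alpha - \frac{\mu - 2400}{\sigma_{\bar{x}}}\right) \\
  &\le 1 - \Phi(z_\alpha) = \alpha
  \qquad \text{whenever } \mu \le 2400 .
\end{aligned}
```

Under these assumptions, every value of μ at or below 2,400 gives a Type I error probability of at most α, with the largest value occurring at μ = 2,400. Controlling the error rate at the boundary therefore controls it for the whole inequality.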

Steps for Selecting the Null and Alternative Hypotheses

  1. Select the alternative hypothesis as that which the sampling experiment is intended to establish. The alternative hypothesis will assume one of three forms:

    1. One-tailed, upper-tailed (e.g., Ha: μ > 2,400)

    2. One-tailed, lower-tailed (e.g., Ha: μ < 2,400)

    3. Two-tailed (e.g., Ha: μ ≠ 2,400)

  2. Select the null hypothesis as the status quo, that which will be presumed true unless the sampling experiment conclusively establishes the alternative hypothesis. The null hypothesis will be specified as that parameter value closest to the alternative in one-tailed tests and as the complementary (or only unspecified) value in two-tailed tests.

    (e.g., H0: μ = 2,400)

A one-tailed test of hypothesis is one in which the alternative hypothesis is directional and includes the symbol “<” or “>.”

A two-tailed test of hypothesis is one in which the alternative hypothesis does not specify departure from H0 in a particular direction and is written with the symbol “≠.”
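The selection steps in the box can be mirrored in a few lines of Python. The sketch below is only an illustration: the class name `HypothesisPair`, its fields, and the tail labels "upper", "lower", and "two-sided" are hypothetical choices made for this example, not notation from the text.

```python
from dataclasses import dataclass

# Hypothetical tail labels mapped to the symbol that appears in Ha.
SYMBOLS = {"upper": ">", "lower": "<", "two-sided": "≠"}


@dataclass(frozen=True)
class HypothesisPair:
    parameter: str      # symbol of the population parameter, e.g. "μ" or "p"
    null_value: float   # boundary value that H0 sets the parameter equal to
    tail: str           # "upper", "lower", or "two-sided"

    def __post_init__(self):
        if self.tail not in SYMBOLS:
            raise ValueError(f"unknown tail: {self.tail!r}")

    def null(self) -> str:
        # Step 2: H0 is written as an equality at the boundary value.
        return f"H0: {self.parameter} = {self.null_value:g}"

    def alternative(self) -> str:
        # Step 1: Ha is a strict inequality or "not equal to".
        return f"Ha: {self.parameter} {SYMBOLS[self.tail]} {self.null_value:g}"


# Sewer pipe example: show that mean strength exceeds 2,400.
pipe = HypothesisPair("μ", 2400, "upper")
print(pipe.null())
print(pipe.alternative())
```

The direction is chosen first, from what the experiment is meant to establish, and the null value then follows from it.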

Example 6.1 Formulating H0 and Ha for a Test of a Population Mean—Quality Control

Problem

  1. A metal lathe is checked periodically by quality control inspectors to determine whether it is producing machine bearings with a mean diameter of .5 inch. If the mean diameter of the bearings is larger or smaller than .5 inch, then the process is out of control and must be adjusted. Formulate the null and alternative hypotheses for a test to determine whether the bearing production process is out of control.

Solution

  1. The hypotheses must be stated in terms of a population parameter. Here, we define μ as the true mean diameter (in inches) of all bearings produced by the metal lathe. If either μ>.5 or μ<.5, then the lathe’s production process is out of control. Because the inspectors want to be able to detect either possibility (indicating that the process is in need of adjustment), these values of μ represent the alternative (or research) hypothesis. Alternatively, because μ=.5 represents an in-control process (the status quo), this represents the null hypothesis. Therefore, we want to conduct the two-tailed test:

    H0: μ = .5 (i.e., the process is in control)
    Ha: μ ≠ .5 (i.e., the process is out of control)

Look Back

Here, the alternative hypothesis is not necessarily the hypothesis that the quality control inspectors desire to support. However, they will make adjustments to the metal lathe settings only if there is strong evidence to indicate that the process is out of control. Consequently, μ ≠ .5 must be stated as the alternative hypothesis.

Now Work Exercise 6.11a

Example 6.2 Formulating H0 and Ha for a Test of a Population Proportion—Cigarette Advertisements

Problem

  1. Cigarette advertisements are required by federal law to carry the following statement: “Warning: The surgeon general has determined that cigarette smoking is dangerous to your health.” However, this warning is often located in inconspicuous corners of the advertisements and printed in small type. Suppose the Federal Trade Commission (FTC) claims that 80% of cigarette consumers fail to see the warning. A marketer for a large tobacco firm wants to gather evidence to show that the FTC’s claim is too high, i.e., that fewer than 80% of cigarette consumers fail to see the warning. Specify the null and alternative hypotheses for a test of the FTC’s claim.

Solution

  1. The marketer wants to make an inference about p, the true proportion of all cigarette consumers who fail to see the surgeon general’s warning. In particular, the marketer wants to collect data to show that fewer than 80% of cigarette consumers fail to see the warning, i.e., p<.80. Consequently, p<.80 represents the alternative hypothesis and p=.80 (the claim made by the FTC) represents the null hypothesis. That is, the marketer desires the one-tailed (lower-tailed) test:

    H0: p = .80 (i.e., the FTC’s claim is true)
    Ha: p < .80 (i.e., the FTC’s claim is false)

Look Back

Whenever a claim is made about the value of a particular population parameter and the researcher wants to test the claim, believing that it is false, the claimed value will represent the null hypothesis.

Now Work Exercise 6.15

The rejection region for a two-tailed test differs from that for a one-tailed test. When we are trying to detect departure from the null hypothesis in either direction, we must establish a rejection region in both tails of the sampling distribution of the test statistic. Figures 6.4a and 6.4b show the one-tailed rejection regions for lower- and upper-tailed tests, respectively. The two-tailed rejection region is illustrated in Figure 6.4c. Note that a rejection region is established in each tail of the sampling distribution for a two-tailed test.

Figure 6.4

Rejection regions corresponding to one- and two-tailed tests

The rejection regions corresponding to typical values selected for α are shown in Table 6.2 for one- and two-tailed tests based on large samples. Note that the smaller α you select, the more evidence (the larger the magnitude of z) you will need before you can reject H0.

Table 6.2 Rejection Regions for Common Values of α, Large n

                      Alternative Hypotheses
           Lower-Tailed    Upper-Tailed    Two-Tailed
α = .10    z < −1.28       z > 1.28        z < −1.645 or z > 1.645
α = .05    z < −1.645      z > 1.645       z < −1.96 or z > 1.96
α = .01    z < −2.33       z > 2.33        z < −2.575 or z > 2.575
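The cutoffs in Table 6.2 can be reproduced from the standard normal distribution. The following sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `critical_values` and the tail labels are hypothetical names chosen for this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def critical_values(alpha, tail):
    """Return (lower, upper) z cutoffs; None means no cutoff in that tail."""
    if tail == "lower":
        # All of alpha sits in the lower tail.
        return (STD_NORMAL.inv_cdf(alpha), None)
    if tail == "upper":
        # All of alpha sits in the upper tail.
        return (None, STD_NORMAL.inv_cdf(1 - alpha))
    if tail == "two-sided":
        # alpha is split equally, alpha/2 in each tail.
        cutoff = STD_NORMAL.inv_cdf(1 - alpha / 2)
        return (-cutoff, cutoff)
    raise ValueError(f"unknown tail: {tail!r}")


for alpha in (0.10, 0.05, 0.01):
    lower, _ = critical_values(alpha, "lower")
    _, upper = critical_values(alpha, "upper")
    two_lo, two_hi = critical_values(alpha, "two-sided")
    print(f"alpha={alpha:.2f}: z < {lower:.3f} | z > {upper:.3f} | "
          f"z < {two_lo:.3f} or z > {two_hi:.3f}")
```

The computed values agree with the table up to rounding. For example, the exact two-tailed cutoff for α = .01 is about 2.576, which the table lists as 2.575.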

Example 6.3 Setting Up a Hypothesis Test for μ—Mean Drug Response Time

Problem

  1. The effect of drugs and alcohol on the nervous system has been the subject of considerable research. Suppose a research neurologist is testing the effect of a drug on response time by injecting 100 rats with a unit dose of the drug, subjecting each rat to a neurological stimulus, and recording its response time. The neurologist knows that the mean response time for rats not injected with the drug (the “control” mean) is 1.2 seconds. She wishes to test whether the mean response time for drug-injected rats differs from 1.2 seconds. Set up the test of hypothesis for this experiment, using α=.01.

Solution

  1. The key word mean in the statement of the problem implies that the target parameter is μ, the mean response time for all drug-injected rats. Since the neurologist wishes to detect whether μ differs from the control mean of 1.2 seconds in either direction—that is, μ<1.2 or μ>1.2—we conduct a two-tailed statistical test. Following the procedure for selecting the null and alternative hypotheses, we specify as the alternative hypothesis that the mean differs from 1.2 seconds, since determining whether the drug-injected mean differs from the control mean is the purpose of the experiment. The null hypothesis is the presumption that drug-injected rats have the same mean response time as control rats unless the research indicates otherwise. Thus,

    H0: μ = 1.2 (Mean response time is 1.2 seconds)
    Ha: μ ≠ 1.2 (Mean response time is less than 1.2 or greater than 1.2 seconds)

    The test statistic measures the number of standard deviations between the observed value of x̄ and the null-hypothesized value μ = 1.2:

    Test statistic: z = (x̄ − 1.2) / σ_x̄

    The rejection region must be designated to detect a departure from μ = 1.2 in either direction, so we will reject H0 for values of z that are either too small (negative) or too large (positive). To determine the precise values of z that constitute the rejection region, we first select α, the probability that the test will lead to incorrect rejection of the null hypothesis. Then we divide α equally between the lower and upper tail of the distribution of z, as shown in Figure 6.5. In this example, α = .01, so α/2 = .005 is placed in each tail. The areas in the tails correspond to z = −2.575 and z = 2.575, respectively (from Table 6.2), so

    Rejection region: z < −2.575 or z > 2.575
    (see Figure 6.5)

    Figure 6.5

    Two-tailed rejection region: α=.01

    Assumptions: Since the sample size of the experiment is large enough (n>30), the Central Limit Theorem will apply, and no assumptions need be made about the population of response time measurements. The sampling distribution of the sample mean response of 100 rats will be approximately normal, regardless of the distribution of the individual rats’ response times.
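The setup of this example can be written down before any data are collected, as in the sketch below. It assumes that the standard error σ_x̄ is estimated by s/√n, the usual large-sample approximation. The function names and the sample results fed in at the end are hypothetical, invented only to exercise the code; they are not the neurologist's data.

```python
from math import sqrt
from statistics import NormalDist

MU_0 = 1.2    # null-hypothesized mean response time in seconds (from the example)
ALPHA = 0.01  # significance level (from the example)
N = 100       # number of rats (from the example)

# Two-tailed cutoff: alpha/2 = .005 in each tail, about 2.576.
Z_CUTOFF = NormalDist().inv_cdf(1 - ALPHA / 2)


def z_statistic(x_bar, s, n=N, mu_0=MU_0):
    """z = (x_bar - mu_0) / (s / sqrt(n)); s stands in for sigma (assumption)."""
    return (x_bar - mu_0) / (s / sqrt(n))


def reject_h0(z, cutoff=Z_CUTOFF):
    """Two-tailed rule: reject H0 when z < -cutoff or z > cutoff."""
    return z < -cutoff or z > cutoff


# Hypothetical sample results, used only to show how the rule is applied.
z = z_statistic(x_bar=1.05, s=0.5)
print(f"z = {z:.2f}, reject H0: {reject_h0(z)}")
```

Note that `Z_CUTOFF` and the decision rule depend only on α and the form of Ha, not on the sample, which is why the test can be fixed in advance.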

Look Back

Note that the test is set up before the sampling experiment is conducted. The data are not used to develop the test. Evidently, the neurologist wants to conclude that the mean response time for the drug-injected rats differs from the control mean only when the evidence is very convincing because the value of α has been set quite low at .01. If the experiment results in the rejection of H0, she can confidently conclude that the mean response time of the drug-injected rats differs from the control mean because there is only a .01 probability of a Type I error.

Now Work Exercise 6.13b, c

Once the test is set up, the neurologist of Example 6.3 is ready to perform the sampling experiment and conduct the test. The hypothesis test is performed in Section 6.4.

Statistics in Action Revisited

Identifying the Key Elements of a Hypothesis Test Relevant to the KLEENEX® Survey

In Kimberly-Clark Corporation’s survey of people with colds, each of 250 customers was asked to keep count of his or her use of KLEENEX® tissues in diaries. One goal of the company was to determine how many tissues to package in a cold-care (now “anti-viral”) box of KLEENEX®; consequently, the total number of tissues used was recorded for each person surveyed. Since number of tissues is a quantitative variable, the parameter of interest is either μ, the mean number of tissues used by all customers with colds, or σ², the variance of the number of tissues used.

Recall that recently, the company increased from 60 to 68 the number of tissues it packages in a cold-care box of KLEENEX® tissues. This decision was based on a claim made by marketing experts that the average number of times a person will blow his or her nose during a cold exceeds the previous mean of 60. The key word average implies that the target parameter is μ, and the marketers are claiming that μ>60. In order to test the claim, we set up the following null and alternative hypotheses:

H0: μ = 60
Ha: μ > 60

We’ll conduct this test in the next Statistics in Action Revisited on p. 327.

Exercises 6.1–6.21

Understanding the Principles

  1. 6.1 Which hypothesis, the null or the alternative, is the status quo hypothesis? Which is the research hypothesis?

  2. 6.2 Which element of a test of hypothesis is used to decide whether to reject the null hypothesis in favor of the alternative hypothesis?

  3. 6.3 What is the level of significance of a test of hypothesis?

  4. 6.4 What is the difference between Type I and Type II errors in hypothesis testing? How do α and β relate to Type I and Type II errors?

  5. 6.5 List the four possible results of the combinations of decisions and true states of nature for a test of hypothesis.

  6. 6.6 We (generally) reject the null hypothesis when the test statistic falls into the rejection region, but we do not accept the null hypothesis when the test statistic does not fall into the rejection region. Why?

  7. 6.7 If you test a hypothesis and reject the null hypothesis in favor of the alternative hypothesis, does your test prove that the alternative hypothesis is correct? Explain.

Learning the Mechanics

  1. 6.8 Consider a test of H0:μ=4. In each of the following cases, give the rejection region for the test in terms of the z-statistic:

    1. Ha: μ > 4, α = .05

    2. Ha: μ > 4, α = .10

    3. Ha: μ < 4, α = .05

    4. Ha: μ ≠ 4, α = .05

  2. 6.9 For each of the following rejection regions, sketch the sampling distribution for z and indicate the location of the rejection region.

    1. z>1.96

    2. z>1.645

    3. z>2.575

    4. z < −1.28

    5. z < −1.645 or z > 1.645

    6. z < −2.575 or z > 2.575

    7. For each of the rejection regions specified in parts a–f, what is the probability that a Type I error will be made?

Applet Exercise 6.1

Use the applet entitled Hypotheses Test for a Mean to investigate the frequency of Type I and Type II errors. For this exercise, use n=100 and the normal distribution with mean 50 and standard deviation 10.

  1. Set the null mean equal to 50 and the alternative to not equal. Run the applet one time. How many times was the null hypothesis rejected at level .05? In this case, the null hypothesis is true. Which type of error occurred each time the true null hypothesis was rejected? What is the probability of rejecting a true null hypothesis at level .05? How does the proportion of times the null hypothesis was rejected compare with this probability?

  2. Clear the applet, then set the null mean equal to 47, and keep the alternative at not equal. Run the applet one time. How many times was the null hypothesis not rejected at level .05? In this case, the null hypothesis is false. Which type of error occurred each time the null hypothesis was not rejected? Run the applet several more times without clearing. Based on your results, what can you conclude about the probability of failing to reject the null hypothesis for the given conditions?

Applying the Concepts—Basic

  1. 6.10 Walking to improve health. In a study investigating a link between walking and improved health (Social Science & Medicine, Apr. 2014), researchers reported that adults walked an average of 5.5 days in the past month for the purpose of health or recreation. Specify the null and alternative hypotheses for testing whether the true average number of days in the past month that adults walked for the purpose of health or recreation is lower than 5.5 days.

  2. 6.11 Americans’ favorite sport. The Harris Poll (Dec. 2013) conducted an online survey of American adults to determine their favorite sport. Your friend believes professional (NFL) football is the favorite sport for 40% of American adults. Specify the null and alternative hypotheses for testing this belief. Be sure to identify the parameter of interest.

  3. 6.12 Infants’ listening time. Researchers writing in Analysis of Verbal Behavior (Dec. 2007) reported that the mean listening time of 16-month-old infants exposed to nonmeaningful monosyllabic words (e.g., “giff,” “cham,” “gack”) is 8 seconds. Set up the null and alternative hypotheses for testing the claim.

  4. 6.13 Play Golf America program. In the Play Golf America program, teaching professionals at participating golf clubs provide a free 10-minute lesson to new customers. According to the Professional Golf Association (PGA), golf facilities that participate in the program gain, on average, $2,400 in green fees, lessons, or equipment expenditures. A teaching professional at a golf club believes that the average gain in green fees, lessons, or equipment expenditures for participating golf facilities exceeds $2,400.

    1. In order to support the claim made by the teaching professional, what null and alternative hypotheses should you test?

    2. Suppose you select α=.05. Interpret this value in the words of the problem.

    3. For α=.05, specify the rejection region of a large-sample test.

  5. 6.14 Effectiveness of online courses. The Sloan Survey of Online Learning, “Going the Distance: Online Education in the United States, 2011,” reported that 68% of college presidents believe that their online education courses are as good as or superior to courses that utilize traditional face-to-face instruction.

    1. Give the null hypothesis for testing the claim made by the Sloan Survey.

    2. Give the rejection region for a two-tailed test using α=.01.

  6. 6.15 DNA-reading tool for quick identification of species. A biologist and a zoologist at the University of Florida were the first scientists to test the effectiveness of a high-tech handheld device designed to instantly identify the DNA of an animal species (PLOS Biology, Dec. 2005). They used the DNA-reading device on tissue samples collected from mollusks with brightly colored shells. The scientists discovered that the error rate of the device is less than 5 percent. Set up the null and alternative hypotheses as if you want to support the findings.

Applying the Concepts—Intermediate

  1. 6.16 Calories in school lunches. The U.S. Department of Agriculture’s National School Lunch Program mandates that for a high school to receive reimbursement for school lunches, the number of calories served at lunch must be no more than 850 calories. Suppose a nutritionist believes that the true mean number of calories served at lunch at all U.S. high schools exceeds 850 calories.

    1. Identify the parameter of interest.

    2. Specify the null and alternative hypotheses for testing this claim.

    3. Describe a Type I error in the words of the problem.

    4. Describe a Type II error in the words of the problem.

  2. 6.17 A border protection avatar. The National Center for Border Security and Protection has developed the “Embodied Avatar”—a kiosk with a computer-animated border guard that uses artificial intelligence to scan passports, check fingerprints, read eye pupils, and ask questions of travelers crossing the U.S. border (National Defense Magazine, Feb. 2014). Based on field tests, the avatar’s developer claims that the avatar can detect deceitful speech correctly 75% of the time.

    1. Identify the parameter of interest.

    2. Give the null and alternative hypotheses for testing the claim made by the avatar’s developer.

    3. Describe a Type I error in the words of the problem.

    4. Describe a Type II error in the words of the problem.

  3. 6.18 Virtual reality hypnosis for pain. The International Journal of Clinical and Experimental Hypnosis (Vol. 58, 2010) investigated using virtual reality hypnosis (VRH) to reduce the pain of trauma patients. Patients reported their pain intensity on the Graphic Rating Scale (GRS) both prior to and one hour after a VRH session. The researchers reported that hypnosis reduced the pain intensity of trauma patients by an average of μ=10 points on the GRS. Suppose you want to test whether trauma patients who receive normal analgesic care (i.e., medication, but no hypnosis) will have a smaller reduction in average pain intensity than the VRH trauma patients.

    1. Set up H0 and Ha for the test.

    2. Describe a Type I error for this test.

    3. Describe a Type II error for this test.

  4. 6.19 Mercury levels in wading birds. According to a University of Florida wildlife ecology and conservation researcher, the average level of mercury uptake in wading birds in the Everglades has declined over the past several years (UF News, July 15, 2004). Ten years ago, the average level was 15 parts per million.

    1. Give the null and alternative hypotheses for testing whether the average level today is less than 15 ppm.

    2. Describe a Type I error for this test.

    3. Describe a Type II error for this test.

Applying the Concepts—Advanced

  1. 6.20 Jury trial outcomes. Sometimes, the outcome of a jury trial defies the “commonsense” expectations of the general public (e.g., the 1995 O. J. Simpson verdict and the 2011 Casey Anthony verdict). Such a verdict is more acceptable if we understand that the jury trial of an accused murderer is analogous to the statistical hypothesis-testing process. The null hypothesis in a jury trial is that the accused is innocent. (The status quo hypothesis in the U.S. system of justice is innocence, which is assumed to be true until proven beyond a reasonable doubt.) The alternative hypothesis is guilt, which is accepted only when sufficient evidence exists to establish its truth. If the vote of the jury is unanimous in favor of guilt, the null hypothesis of innocence is rejected and the court concludes that the accused murderer is guilty. Any vote other than a unanimous one for guilt results in a “not guilty” verdict. The court never accepts the null hypothesis; that is, the court never declares the accused “innocent.” A “not guilty” verdict (as in the O. J. Simpson case) implies that the court could not find the defendant guilty beyond a reasonable doubt.

    1. Define Type I and Type II errors in a murder trial.

    2. Which of the two errors is the more serious? Explain.

    3. The court does not, in general, know the values of α and β, but ideally, both should be small. One of these probabilities is assumed to be smaller than the other in a jury trial. Which one, and why?

    4. The court system relies on the belief that the value of α is made very small by requiring a unanimous vote before guilt is concluded. Explain why this is so.

    5. For a jury prejudiced against a guilty verdict as the trial begins, will the value of α increase or decrease? Explain.

    6. For a jury prejudiced against a guilty verdict as the trial begins, will the value of β increase or decrease? Explain.

  2. 6.21 Intrusion detection systems. Refer to the Journal of Research of the National Institute of Standards and Technology (Nov.–Dec. 2003) study of a computer intrusion detection system (IDS), presented in Exercise 3.98 (p. 158). Recall that an IDS is designed to provide an alarm whenever unauthorized access (e.g., an intrusion) to a computer system occurs. The probability of the system giving a false alarm (i.e., providing a warning when, in fact, no intrusion occurs) is defined by the symbol α, while the probability of a missed detection (i.e., no warning given when, in fact, an intrusion occurs) is defined by the symbol β. These symbols are used to represent Type I and Type II error rates, respectively, in a hypothesis-testing scenario.

    1. What is the null hypothesis H0?

    2. What is the alternative hypothesis Ha?

    3. According to actual data on the EMERALD system collected by the Massachusetts Institute of Technology Lincoln Laboratory, only 1 in 1,000 computer sessions with no intrusions resulted in a false alarm. For the same system, the laboratory found that only 500 of 1,000 intrusions were actually detected. Use this information to estimate the values of α and β.
