Error is inevitable and occurs in any survey. If you need perfection, don’t bother doing research. What is important is to identify the possible sources of error and then try to minimize these errors.
Herbert Weisberg defines error as the “difference between an obtained value and the true value.”1 Typically we don’t know what the true value is, but that doesn’t change our definition of error. When we discussed sampling in the previous chapter, the true value was the population value. When we focus on measurement, it is the true or actual value of whatever is being measured. Error is the difference between that true value and whatever the obtained or observed value turns out to be. It’s important to keep in mind that error can occur at any point in the process of doing a survey, from the initial design through the writing of the report.
Weisberg points out that error can be random or systematic.2 For example, when we select a sample from a population, there will be sampling error. No sample is a perfect representation of the population. Assuming we are using probability sampling, this error will be random. However, sometimes some elements in the population are systematically left out of the sample. For example, if we are doing a phone survey and rely exclusively on landlines, that could produce a systematic error because we have left out cell-phone-only households. Systematic error is often referred to as bias. We need to be aware of both random and systematic error.
There are many types of error that can occur. In the past, the focus was on sampling error and nonresponse error, which occurred as a result of refusals or the inability to contact respondents. Instead of focusing on just a couple of types of error, we should focus on all possible types of survey error. This is often referred to as total survey error.3 Paul Biemer defines total survey error as the “accumulation of all errors that may arise in the design, collection, processing and analysis of survey data.”4
There are various ways of categorizing the different types of survey error. Typically we consider the following types of error5:
Weisberg also discusses survey administration issues such as the following:
To this we would add error that occurs in the reporting of survey data.
We’re going to look at each of these types of error, discuss some of the research findings about each type, and talk about how you can try to minimize error.
Sampling Error
Sampling error is one of the issues in sample design and occurs whenever you select a sample from a population. No sample is ever a perfect picture of the population. Let’s say that your population is all households in the city in which you live. You select a sample of 500 households from this population.i You’re interested in the proportion of households that recycle such things as cans, bottles, and other recyclable materials. It turns out that 45 percent of the sample recycles. That doesn’t mean that 45 percent of the population recycles. Why? Sampling always carries with it some amount of sampling error. It’s inevitable.
Here’s another way to understand sampling error. We can use sample data to estimate population values. If you were to select repeated random samples of the same size from the same population, your sample estimates would vary from sample to sample. If you think about it, this makes sense. Each sample will contain a different set of households. So why would you expect all the samples to give you the same estimate of the households that recycle?
One of the advantages of probability sampling is that you can estimate the amount of sampling error there will be from sample to sample. Assuming that you used probability sampling to get your sample and that you properly selected your sample, the resulting sampling error will be random. And to make things even better, there are things you can do to reduce sampling error.
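Under simple random sampling, the amount of sampling error in the recycling example can be quantified with the standard formula for the margin of error of a proportion. The following is a minimal sketch; the function name is ours, and the formula assumes a large population relative to the sample.

```python
import math

def moe_proportion(p, n, z=1.96):
    """95% margin of error for a sample proportion under simple random sampling."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return z * se

# 45 percent of a 500-household sample recycles (the example above)
moe = moe_proportion(0.45, 500)
print(f"margin of error: +/- {moe * 100:.1f} percentage points")
```

For this example the margin of error is roughly plus or minus 4.4 percentage points, so the population value is likely to fall between about 41 and 49 percent.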
Minimizing Sampling Error
Here are two ways you can reduce sampling error.
Coverage Error
Earl Babbie distinguishes between the population and the study population. The population is the “theoretically specified aggregation of study elements,” while the study population is the “aggregation of elements from which a sample is actually selected.”8 In other words, the population you want to make statements about can be different from the study population from which you draw your sample. The sampling frame is the actual list from which the sample is selected.9
Coverage error occurs when the sampling frame does not match the population. In other words, sometimes the list from which the sample is selected does not match the population, and this produces coverage error. For example, some elements in the population may have been left out of the list from which the sample is selected.ii Let’s look at some examples.
You realize this is going to produce coverage error. The best solution is to work with each church in your sample to delete members who are no longer there, add in the new members, and take out the duplicates. It takes some work, but it’s worth it because it reduces coverage error.
Minimizing Coverage Error
So how do we try to reduce coverage error? First, we have to ask the right questions. Don Dillman et al. suggest that there are certain questions that we ought to always consider.15
So the general strategy for dealing with coverage error is to first identify the sources of error. Once we know what the problems are, then we can try to reduce them keeping in mind that eliminating all coverage error is probably not possible. This can be done in several ways.
Nonresponse Error
Ideally, we want everyone in our sample to complete the survey, but we know that this probably isn’t possible for two reasons.
Theories of Survey Participation
It helps to think about the different reasons that people might agree or refuse to be interviewed. These are often referred to as theories of participation.
In other words, different things matter to different people. Some respondents care about the length of the survey, while others care more about the topic or the incentives. Groves refers to this as leverage. Researchers, for their part, emphasize different aspects of the survey when making the request. Some stress the topic, while others stress the length, particularly if the survey is short. Groves refers to this as salience. This approach suggests that we should try to understand what is important to our respondents and emphasize those aspects of the survey. It also suggests that, when contacting respondents, we ought not to stress a single aspect of the survey but rather the different aspects that might matter to different respondents.
Nonresponse and Nonresponse Bias
It’s clear that nonresponse has been increasing and that this is a critical problem for surveys. Roger Tourangeau and Thomas Plewes looked at a number of large national surveys conducted in the United States and concluded that “nonresponse rates continue to increase in all types of cross-sectional surveys, with little to suggest that the trend has plateaued.”21 They also examined the different ways of calculating response rate as suggested by AAPOR.22
Edith de Leeuw et al. focus on the difference in response rates for different modes of survey delivery. They concluded that
in general, face-to-face surveys tend to obtain higher response rates than comparable telephone surveys, and mail surveys tend to have a lower response rate than comparable face-to-face and in lesser degree to telephone surveys. In addition, the response rates for both telephone and face-to-face surveys are declining, although such a trend is not as evident for mail surveys.23
But why is nonresponse a critical problem for surveys? One reason is that nonresponse has become sizable, and this can increase the risk of nonresponse bias. The other reason is that people who don’t respond to surveys are often systematically different from those who do respond, and this has the potential for creating bias in our survey data. If the difference between those who respond and those who don’t respond is related to what the survey is about, then bias will occur. However, nonresponse does not necessarily lead to bias when “nonrespondents are missing at random.”24
Let’s consider some examples of nonresponse bias. Andy Peytchev et al. looked at self-reports of abortion and concluded that “those with a lower likelihood to participate in the survey were also more likely to underreport such experiences.”25 Many researchers have observed that voting in elections tends to be overreported. Roger Tourangeau et al. note that “nonvoters were both less likely to take part in the survey and more likely to misreport if they did take part.”26 In both these examples, those who took part in the survey were different from those who did not take part, and this difference was related to the focus of the survey.
Another example of nonresponse bias is described in Thomas Holmes and James Schmitz’s analysis of the “Characteristics of Business Owners Survey.” Holmes and Schmitz focus on estimating “the probability that an individual discontinues ownership of his or her business.”27 Data are based on a sample of tax returns from 1982. The survey was mailed to respondents in 1986. We would expect that those who still owned their business in 1986 would be more likely to return the survey than those who did not currently own their business. Since we want to estimate the probability that a person has “terminated his or her ownership share over the 1982–1986 period,”28 we would expect that the data would underestimate the probability of termination, and, in fact, Holmes and Schmitz’s analysis shows this to be the case.
Increasing Response
If nonresponse bias is a problem, then what can we do about it? Increasing response is not a guarantee of low bias, but a high nonresponse rate raises the possibility of nonresponse bias. Let’s look at some ways in which we can increase response.
Groves et al. suggest that there are five factors that affect survey participation.29
We can’t do much about some of these factors. For example, we can’t do much about the increase in surveys in our society and the fact that some people may have recently been asked to do a survey. We can’t do much about the growing trend for people to express doubts about the worth of surveys. But we can do something about the survey itself. Dillman has written extensively about reducing the burden on respondents.30 This is a logical consequence of social exchange theory. If we can reduce the costs of doing surveys, then we will increase the likelihood that people will respond. We can make the survey as easy to take as possible. We can create a survey that flows naturally from question to question. We can avoid asking unnecessary questions, which will reduce the length of the survey.
There are psychological principles that we can use to try to increase survey participation. When we ask someone to agree to be interviewed, we hope that the person will comply with our request. Robert Cialdini suggests that there are certain rules of behavior that can be used to increase compliance.31 Here are some of these rules as summarized by Groves et al.
There is considerable evidence that offering a prepaid cash incentive increases the likelihood that a person will respond to the survey.35 You have probably received a request for donations from a nonprofit or a political candidate. Often the request comes with a small gift such as a pencil or a key chain. This works much like the prepaid cash incentive. Incentives given before the individual responds to the survey have been shown to be more effective than postpaid incentives in increasing survey participation.36 Offering the person the chance to be entered into a drawing for a large prize, such as a computer tablet or cash, does not appear to be as effective.
One of the most effective ways of increasing participation is multiple follow-ups. Dillman, talking about mailed surveys, says that “multiple contacts are essential for maximizing response.”37 The same thing can be said for any type of survey—face-to-face, mailed, phone, and web surveys. In face-to-face and telephone surveys, multiple contacts can add considerably to your cost, but they are essential for increasing response rate.
Measurement Error
Measurement error is the difference between the true value of some variable and the value that the respondent gives you. A simple example is measuring age. Often we ask respondents how old they were on their last birthday. But if you are young and you order an alcoholic drink in a bar, the bartender will ask you for proof of age. The age given to the bartender could easily be an overestimate of your age. This would be an example of “measurement error due to the respondent.”38 In other words, respondents might not be giving you an accurate answer because of their self-interest in appearing older. Weisberg contrasts this with “measurement error due to the interviewer.”39 We know that the interviewer’s gender, race, and age can influence how respondents answer our questions. We’re going to talk about both types of measurement error.
We’ll start by discussing error that occurs as a result of question wording and question order. It’s important to understand that measurement error, like all types of error, cannot be eliminated. But it can be minimized. Minimizing error is only possible if you are, first, aware of the ways in which error can occur and, second, take steps to minimize it.
Measurement Error Associated with Question Wording
Measurement error can occur as a result of question wording. One of the classic examples is found in Howard Schuman and Stanley Presser’s discussion of the difference between “forbidding” and “not allowing” certain types of behavior such as “making public speeches against democracy.” They conclude that “Americans are much more willing to not allow speeches than they are to forbid them, although the two actions appear to be logically equivalent.”40 Numerous studies have replicated this finding. However, Schuman notes that regardless of which wording is used, there is a clear trend over time toward not forbidding or allowing such speeches. Thus, even with questions like the forbid and not allow questions, you can still track changes over time.41
Barbara Bickart et al. studied the accuracy of reports of other people’s behavior. They asked couples to “search for information about a vacation they could win.”42 Then the couples discussed and actually planned the vacation. Afterward they “were asked to either count or estimate the number of accommodations, restaurants, and activities that they/their partner examined during the information search task.”43 Their analysis showed that questions asking for counts were more accurate than questions asking for estimates.
Still another example is found in a series of questions in the GSS conducted by the National Opinion Research Center. The GSS asks a series of questions about whether the United States should be spending more money, less money, or about the same amount on such things as welfare. They conducted an experiment by randomly asking one-half of the respondents about “welfare,” while the other random half was asked about “assistance to the poor.” Tom Smith analyzed GSS data and concluded that “‘welfare’ typically produces much more negative evaluations than ‘the poor.’”44 Gregory Huber and Celia Paris point out that this assumes that these two terms are equivalent to the respondent. Their research suggests that this is not the case. They conclude that “respondents are twice as likely . . . to believe that programs like soup kitchens, homeless shelters, and food banks are ATP [assistance to the poor] as opposed to welfare.”45 In other words, the questions are not equivalent because the words “welfare” and “assistance to the poor” bring to mind different things. Huber and Paris’s findings point out that we shouldn’t be too quick to conclude that question wording is behind the different responses but that we need to look below the surface and consider how respondents interpret different question wording.
Another example is questions that ask for opinions about global warming and climate change. Jonathon Schuldt et al. found that “Republicans were less likely to endorse that the phenomenon is real when it was referred to as ‘global warming’ . . . rather than ‘climate change’ . . . whereas Democrats were unaffected by question wording.”46 They point out that the difference between Republicans and Democrats is much greater when the question is framed in terms of global warming. Lorraine Whitmarsh looked at what respondents think these terms mean and discovered that global warming is more likely to be associated with human causes than climate change.47 Again this suggests that respondents attach different meaning to these terms. Thus, it becomes critical how the question is worded when making comparisons between Republicans and Democrats.
Questions are often asked about people’s attitudes toward abortion. Sometimes a single question is used to ask respondents to indicate their attitude toward abortion in general. For example, do you think abortion should be legal or not? However, the GSS includes a series of seven questions, asking whether people think abortion should be legal in various scenarios—in the case of rape, in the case of a serious defect to the baby, in the case of a woman who has low income and can’t afford more children, and other such situations. The data show that people are much more likely to feel abortion should be legal in the case of rape or a serious defect to the baby than they are in the case of low-income women who can’t afford more children. Howard Schuman offers the advice of asking “several different questions about any important issue.”48 The abortion example illustrates this point.
Eleanor Singer and Mick Couper analyzed a series of questions from the GSS that shows that changes in question wording do not necessarily affect respondents.
At intervals since 1990, the General Social Survey (GSS) has asked a series of four questions inquiring into knowledge of genetic testing and attitudes toward prenatal testing and abortion, most recently in 2010. The questions about prenatal testing and abortion were framed in terms of “baby”. But in the current anti-abortion climate, it seemed possible that the word “fetus” would carry more abstract, impersonal connotations than “baby” and might therefore lead to different responses, especially in the case of abortion. To resolve this issue, we designed the question-wording experiment reported in this research note. We found no significant differences by question wording for abortion preferences in the sample as a whole and small but significant differences for prenatal testing, in a direction opposite to that expected. However, question wording did make substantial differences in the responses of some demographic subgroups.49
Still another example is found in asking about voting. You would think that whether you voted or who you voted for is pretty straightforward, but here again, question wording makes a difference. Janet Box-Steffensmeier et al. report on a change that was made in the American National Election Study’s (NES) question on whether and how one voted in House of Representatives contests. Prior to 1978, there was little difference between the actual House vote and the vote reported in the NES. Since 1978 the NES has reported a much higher vote for the incumbent than the actual vote. Box-Steffensmeier et al. suggest that the following changes in question wording might account for this finding.
Box-Steffensmeier concludes that “the ballot format evidently exaggerates the incumbent’s support because people are far more likely to recognize . . . the incumbent’s name than . . . the challenger’s name.”52 This study also showed that you can reduce the proincumbent bias by making the candidates’ party stand out by bolding and italicizing it and using a different font. This reduced but did not eliminate the bias.53
Measurement Error Associated with Question Order
It’s clear that question wording can affect what people tell us. Question order also makes a difference. Think about a survey in your community that deals with quality of life. One of the questions you might ask is “What is the most pressing problem facing your community today?” You might also want to ask more specific questions about crime, the public schools, and jobs. Would the order of the questions make a difference? If you asked about crime first, then respondents would probably be more likely to mention crime as one of the most pressing problems. Order matters.
David Moore provides us with some interesting examples of order effects using data from a Gallup Poll that was conducted in 1997. The question was “Do you generally think that [Bill Clinton/Al Gore] is honest and trustworthy?”54 A random half of the respondents was asked the question with Clinton’s name first, and the other random half was asked with Gore’s name first. The data show
that when respondents (half the sample) were asked about Clinton first, 50 percent said he was honest and trustworthy; when the other half of the sample was asked about Gore first, 68% said the vice president was honest and trustworthy.55
In other words, Gore was considered honest and trustworthy by 18 percentage points more than Clinton. But when Moore took into account the order of the questions, he found that when Clinton’s name appeared second, 57 percent said he was honest and trustworthy, and when Gore’s name appeared second, 60 percent saw him as honest and trustworthy. The 18 percentage point difference is reduced to three percentage points. He concludes that “this is a classic case of people trying to make their ratings of the two men somewhat consistent” and he refers to this as a consistency effect.56
On the same poll, respondents were given the following question: “I’m going to read some personal characteristics and qualities. As I read each one, please tell me whether you think it applies to [Newt Gingrich/Bob Dole] . . . Honest and trustworthy.”57 Again the order of the names was randomly assigned with half the respondents receiving Gingrich’s name first and the other half given Dole’s name first. Dole was considered more honest and trustworthy by 19 percentage points when the names appeared first, but that increased to 31 percentage points when the names appeared second. Moore calls this a contrast effect because the data show that “when people think of Dole and Gingrich, they tend to emphasize the differences between the two men rather than the similarities.”58 This is not to say that the order of the questions always affects what people tell us. But we should be aware of this possibility. The examples provided by Moore show us how this might occur.59
Measurement Error Associated with Respondents’ Characteristics
Satisficing
Answering questions often requires a lot of effort on the part of respondents. Charles Cannell et al. suggest that respondents go through a process in trying to answer questions that looks like the following.60
In order to reduce the amount of effort required to answer survey questions, respondents sometimes look for ways to reduce this burden. This is called satisficing. Krosnick defines satisficing as “giving minimally acceptable answers, rather than optimal answers”61; it can take various forms, including:
For example, let’s think about the quality-of-life survey that we mentioned earlier that asks, “What is the most pressing problem facing your community today?” Some respondents might give you a one-word answer such as crime or education or jobs. This doesn’t really tell us much about what respondents are thinking. Other respondents might say they don’t know or that they have no opinion. Such answers reduce the workload of respondents.
Some survey questions give respondents a list of possible response categories from which they are asked to select their answer. Sometimes they are limited to one choice, while other times, they may select multiple responses. Marta Galesic et al. used eye-tracking information from a web survey to show that respondents often spend “more time looking at the first few options in a list of response options than those at the end of the list.”63 They also found that “the eye-tracking data reveal that respondents are reluctant to invest effort in reading definitions of survey concepts that are only a mouse click away or paying attention to initially hidden response options.”64 These are also examples of satisficing.
Another interesting example of satisficing is often referred to as “heaping.”65 Often when respondents are asked to respond with a number or a count, they respond with a rounded value. For example, when respondents are asked about the years in which events occurred, their responses are more likely to be in multiples of five.
Jon Krosnick suggests that
the likelihood that a given respondent will satisfice . . . is a function of three factors: the first is the inherent difficulty of the task that the respondent confronts; the second is the respondent’s ability to perform the required task; and the third is the respondent’s motivation to perform the task.66
Other researchers have suggested that satisficing occurs more frequently in certain types of surveys. Heerwegh and Loosveldt found that satisficing occurred more frequently in web surveys than in face-to-face surveys,67 and Holbrook et al. discovered that satisficing occurred more often in telephone surveys than in face-to-face surveys.68 Krosnick et al. also found that some respondents are more likely to satisfice than other respondents. For example, low-education respondents were more likely to say that they had no opinion than those with more education.69
Social Desirability
Some types of behavior or attitudes are viewed as more socially desirable than others. For example, voting is often seen as a responsibility of citizens and as a socially desirable action. On the other hand, cheating on exams is typically viewed as socially undesirable. There is considerable evidence that respondents tend to overreport socially desirable behaviors and attitudes and underreport those that are socially undesirable.
Brian Duff et al. compared the actual voting turnout in the 2000 and 2002 elections with the turnout reported in the 2000 and 2002 American NES. In 2000, reported turnout exceeded actual turnout by 17.7 percentage points, and in the 2002 election, by 16.6 percentage points.70
Matthew Streb et al. looked at a different question—whether people would vote for a woman for president if they thought she was qualified. Public opinion data show that the percent of people who say they would vote for a woman increased from slightly over 30 percent in 1945 to slightly over 90 percent in 2005.71 Clearly the norms of equality and fairness suggest that one ought to be willing to vote for a woman who is qualified. Some people might be giving this answer because they see it as the socially desirable response.
Frauke Kreuter et al. looked at reports of socially desirable and undesirable behaviors in a survey of university alumni. The types of behavior included dropping a class, receiving a D or F, receiving academic honors, belonging to the Alumni Association, and donating money to the university. Clearly receiving a D or F would be socially undesirable. Using university records, Kreuter found that approximately 61 percent of the respondents who answered this question had received such a grade. Of these respondents, approximately 27 percent failed to report receiving that grade.72 Kreuter also found that underreporting of the socially undesirable response was lower in web surveys than in telephone surveys.
Roger Tourangeau and Ting Yan suggest that “misreporting about sensitive topics is quite common and . . . it is largely situational.”73 While research supports this claim, interestingly, Patrick Egan and Jeffrey Lax et al. found no evidence of a social desirability effect when it came to support for same-sex marriage.74
Measurement Error Associated with the Interviewer
Characteristics of the interviewer could refer to physical characteristics such as race, gender, and age or to characteristics such as perceived friendliness. These characteristics can affect what respondents tell us. They can interact with respondent characteristics to produce different effects for males and females or for blacks and whites or for other categories of respondents. We’re going to focus on two characteristics of interviewers that have been shown to affect what people tell us—race and gender.
Race of the Interviewer
Two classic studies dealt with questions about race in surveys conducted in Detroit in 1968 and 1971. Howard Schuman and Jean Converse showed that blacks appeared more militant and expressed more hostility toward whites when interviewed by blacks than when interviewed by whites.75 Shirley Hatchett and Schuman found that whites gave more “liberal or pro-Black opinions when the interviewer is Black.”76 Both of these studies interviewed respondents face-to-face where the race of both the interviewer and the respondent was generally apparent.
Other studies focused on voting. Barbara Anderson et al. used five election surveys ranging in time from 1964 to 1984. Their data showed the following.77
Black nonvoters . . . who lived in predominately Black neighborhoods and were interviewed by Black interviewers were more likely to report falsely that they voted than Black respondents interviewed by White interviewers. Black respondents in Black neighborhoods who were interviewed by Black interviewers were also more likely actually to vote . . . than Blacks interviewed by Whites.
Steven Finkel et al. used a 1989 survey in Virginia that looked at voting in a gubernatorial election in which Douglas Wilder, who was black, ran against Marshall Coleman, who was white. Finkel found that “whites are 8–11 percentage points more likely to voice support for the Black candidate to Blacks than to Whites.”78
Darren Davis and Brian Silver focused on political knowledge in a telephone survey of adults in Michigan. They considered both the actual race of the interviewer and the race perceived by the respondent. For whites, neither the actual race nor the perceived race of the interviewer was related to political knowledge. However, “when Black respondents identify the test-giver as Black, they do much better on the test than when they identify the test-giver as White or when the race of the interviewer is ambiguous.”79 This study is important because it explicitly measured the perceived race of the interviewer and showed perceived race to be an important variable. It also showed that race can be an important factor for some respondents but not for other respondents.
Gender of the Interviewer
Research has also shown that the gender of the interviewer can affect what people tell us. Emily Kane and Laura Macaulay analyzed data from a national sample of households and found that “male respondents offer significantly different responses to male and female interviewers on questions dealing with gender inequality in employment.”80 Men voiced more egalitarian views to female interviewers than to male interviewers.
Other studies focused on health-related information. Timothy Johnson and Jennifer Parsons reported that the homeless (both male and female) are more likely to report substance abuse to male interviewers than to female interviewers.81 However, Melvin Pollner found that both male and female respondents were more likely to report substance abuse to female interviewers than to male interviewers, suggesting that gender affects respondents differently in various settings.82
These studies show that interviewer characteristics such as race and gender can influence what respondents tell us, suggesting that we ought to consider the interviewers’ race and gender as variables in our analysis of survey data. They also suggest that interviewers ought to be randomly assigned to respondents rather than trying to match the respondents’ race and gender.83
Recognizing and Minimizing Measurement Error
To get the percent that was angry or upset about a woman as president, all he had to do was to subtract “the average number of items in the baseline condition [the first group] from the average number of items in the test condition [the second group] and . . . [multiply] by 100.”90
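The subtraction described above, the standard list-experiment (item count) estimate, can be sketched as follows. The counts are hypothetical and the function name is ours.

```python
def list_experiment_estimate(baseline_counts, test_counts):
    """Difference in mean item counts x 100 = estimated percent endorsing the sensitive item."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(test_counts) - mean(baseline_counts)) * 100

# hypothetical item counts from the two random halves of the sample
baseline = [2, 1, 3, 2, 2]   # list without the sensitive item
test     = [3, 2, 3, 2, 3]   # same list plus the sensitive item
print(list_experiment_estimate(baseline, test))
```

Because each group is a random half of the sample, any difference in the average counts can be attributed to the added sensitive item, without any individual respondent revealing his or her answer.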
Mode Effects
The method or mode of survey delivery might affect what people tell us. This is referred to as mode effects. The four basic modes are face-to-face, telephone, mailed, and web surveys, although there are many variations of these four modes. This isn’t error but simply differences due to the mode of delivery. We’re going to consider several studies that illustrate the nature of mode effects.
Dealing with Mode Effects
Mode effects are not survey error. Rather, they occur because the mode of survey delivery affects respondents in different ways. Telephone surveys represent a different interview environment than face-to-face interviews, and it’s not surprising that this might result in greater satisficing, as found by Holbrook and McDonald and Thornburg. How then should we deal with mode effects?
Postsurvey Error
Error can also occur after the survey data have been collected. Error can occur in the processing of data. If we enter data manually in a spreadsheet or statistical program, there is the possibility of error. If we code open-ended questions such as “What is the most pressing problem facing your community today?” we might make coding errors. The solution here is to check our data entry and our coding to see if there are errors. We can have another person independently code or enter the data and then compare the results to determine if there are discrepancies. These discrepancies can then be corrected.
Error can occur in the analysis of our data. Most quantitative analyses use some type of statistical package, such as SPSS, SAS, Stata, and R, and many qualitative analyses use some type of computer program, such as NVivo or Atlas.ti. A simple type of mistake might occur in writing the data-definition statements that create the variable labels, value labels, and designate the missing values. A much more difficult type of error is using the wrong type of statistical analysis. Our best advice is to talk with a statistical consultant if there is any doubt about the proper method of analysis.
Error can occur in the reporting of data. For example, if we conducted a telephone survey of households in our county and we only sampled landline numbers, it would be an error to claim that our findings apply to all households in the county. This would be an example of overgeneralization. Rather, we should generalize to all households with landline numbers. We’ll discuss reporting further in Chapter 5 (Volume II).
Summary
Here’s a brief summary of what we have covered in this chapter.
Annotated Bibliography
Total Survey Error
Sampling Error
Other Types of Survey Error
iWe discussed sampling in an earlier chapter, so we’re not going to revisit the details of sampling here.
iiAnother problem occurs when elements that are not part of the population are included in the sampling frame. Sometimes this can be dealt with by screening. For example, in a phone survey some phone numbers that are outside the geographical area you want to cover might be included in your sampling frame. If you are aware of this possibility, you could include a screening question in which you ask if the household is in the desired geographical area.
iiiThis is often referred to as a multistage cluster sample.