Item Scaling
The first step in collecting unbiased, reliable, and valid survey data is having an unbiased, reliable, and valid survey instrument. The type of scale you use sets the foundation of the survey instrument itself, and often determines the kind of statistical analysis you will conduct.
A scale is a series of item anchors that progress in value or magnitude and offer some meaningful, quantifiable distinction among responses to a survey question. In other words, a scale is a continuous spectrum or series of categories. The purpose of scaling is to represent (usually quantitatively) an item’s, a person’s, or an event’s place in the spectrum.
The four common types of scales in business research (nominal, ordinal, interval, and ratio) are depicted in Exhibit 3.1.
The Likert scale, which is an ordinal scale, is one of the most common scales you will come across in survey research. The original Likert scale was developed in 1932 by Rensis Likert, an American psychologist who was interested in measuring people’s opinions or attitudes on a variety of items. He developed a 5-point, bipolar agreement scale as a result. This scale has been used ever since, and is probably the most widely used scale aside from the dichotomous yes/no scale. Often, if you are asked to create a questionnaire that uses a rating scale, the requestor wants responses to be on a Likert-style scale. When using the Likert-style scale, one of the first questions to ask is how many anchor points you want to include. The most common numbers of anchor points are 5 and 7, and they usually range from Disagree (or Strongly Disagree) to Agree (or Strongly Agree). Examples of Likert scale anchors are given in Exhibit 3.2.
Exhibit 3.2
Likert Scales
1 = Strongly unfavorable to the concept
2 = Somewhat unfavorable to the concept
3 = Neutral
4 = Somewhat favorable to the concept
5 = Strongly favorable to the concept
or
1 = Extremely unfavorable to the concept
2 = Strongly unfavorable to the concept
3 = Somewhat unfavorable to the concept
4 = Neutral
5 = Somewhat favorable to the concept
6 = Strongly favorable to the concept
7 = Extremely favorable to the concept
or
1 = Agree
2 = Neutral
3 = Disagree
There are trade-offs between choosing a 5-point scale and a 7-point scale. The 7-point scale provides more choices for the respondent and a more detailed response set for the analyst, which allows for more variability in responses; however, more choices may, in turn, lengthen the time to complete the survey. Response options that are too detailed may require more thought and time to complete: “Was I really somewhat satisfied or just satisfied?” Furthermore, the additional two anchors of a 7-point scale may not contribute substantially to your data, depending on the purpose or intent of the questions you are asking.
On the other hand, having too few choices (i.e., 5 anchors or fewer) can be frustrating to the respondents, especially if none of the anchors truly fits their actual opinion. A 3-point scale is displayed at the bottom of Exhibit 3.2. Here, you can see how limiting the smaller scale is in terms of understanding the respondents’ true answers.
Each survey respondent is given a summated score (i.e., the sum of their ratings for all items) based on their responses to the survey. This summated score serves as the “final score” for each respondent. On some scales, certain items run in the opposite direction from the rest of the scale because of how the question is worded. These are called reverse scored items. You will need to reverse the response value for each of these items before summing the total, so that every item is scored in the same direction. In Exhibit 3.3, Item 1 has a negative response associated with a higher numeric scale value; however, Item 2 has a negative response associated with a lower numeric scale value. The scoring of Item 1 would have to be reversed before it could be summed with Item 2.
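The reverse scoring and summing described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the item names, and the 5-point scale are assumptions made for the example. On a scale running from a minimum to a maximum value, a rating is reversed by subtracting it from the sum of the minimum and maximum.

```python
def reverse_score(value, scale_min=1, scale_max=5):
    """Flip a rating so that low becomes high and high becomes low."""
    if not scale_min <= value <= scale_max:
        raise ValueError(f"rating {value} is outside {scale_min}-{scale_max}")
    return scale_min + scale_max - value


def summated_score(ratings, reversed_items, scale_min=1, scale_max=5):
    """Sum one respondent's ratings after reversing the flagged items.

    ratings: dict mapping item name -> numeric rating
    reversed_items: set of item names worded in the opposite direction
    """
    total = 0
    for item, value in ratings.items():
        if item in reversed_items:
            value = reverse_score(value, scale_min, scale_max)
        total += value
    return total


# Hypothetical respondent: item_1 is reverse scored, item_2 is not.
respondent = {"item_1": 5, "item_2": 1}
print(summated_score(respondent, reversed_items={"item_1"}))  # 1 + 1 = 2
```

Without the reversal, this respondent's two negative answers would sum to 6 and look moderately positive; after reversal, the total of 2 correctly reflects a negative response on both items.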
Bipolar questions offer positive, neutral, and negative choices. Whenever you are dealing with bipolar answers, it is important to have an odd number of choices, no matter how large the scale, so that a neutral midpoint is available. This becomes even more important when using a Likert scale with qualitative data. With an even number of choices, the respondent is forced to decide whether they lean more toward the side of agreement or disagreement for each item. Exhibit 3.4 illustrates bipolar and unidirectional Likert scales.
In Exhibit 3.4, the first scale is bipolar and allows respondents to agree, disagree, or remain neutral about their level of satisfaction with their recent bonus check. The second scale is unidirectional and is designed to capture only positive responses. In this instance, it is probably safe to say everyone surveyed would feel at least somewhat positive about getting a surprise bonus check for $5,000, so the scale begins at a neutral response. Unidirectional scales are easily converted to quantitative scales if they are designed correctly, which allows for a rigorous analysis.
If the bipolar scale had only 4 points, we would not be able to see neutral responses; however, less rigorous surveyors will often simply average the responses to artificially create a neutral response, as shown in Exhibit 3.5. This is inaccurate and should be avoided.
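A small numeric illustration shows why averaging does not recover a neutral response. The data and the 4-point anchor labels below are hypothetical and are not taken from Exhibit 3.5; the sketch only shows how a mean can land in the middle of a scale even though no respondent held a middle opinion.

```python
from collections import Counter
from statistics import mean

# Hypothetical 4-point bipolar scale with no neutral anchor:
# 1 = Strongly disagree, 2 = Disagree, 3 = Agree, 4 = Strongly agree
responses = [1, 1, 1, 4, 4, 4]

average = mean(responses)
counts = Counter(responses)

print(average)       # 2.5, which sits between "Disagree" and "Agree"
print(dict(counts))  # {1: 3, 4: 3}
```

The average of 2.5 looks neutral, yet every respondent chose an extreme anchor. Reporting the distribution of responses avoids this false impression.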
Researchers have the benefit of selecting their own measuring system; however, the first question they should ask themselves is “What should I measure?” Answering this question is not as simple as it may seem. Defining the concept you want to measure and figuring out how best to measure it are both complex tasks, in part because there is often more than one way of measuring a single concept. Further, true measurement of concepts requires a process of precisely assigning scores or numbers to the attributes of people or objects. Precise measurement in business research requires a careful conceptual definition, an operational definition, and a system of consistent rules for assigning numbers or scales.
Attitude Measurement
One of the most common reasons for collecting survey data is to measure employee or consumer attitudes. Due to the importance of this application, the next few pages are dedicated to this topic.
To measure attitude effectively, we first need to operationalize the term “attitude.” An attitude is an enduring and consistent disposition, thought, or feeling about someone or something, including persons, events, and objects, that is expressed in someone’s consequent behavior or manner.
Attitudes are composed of three components:
Attitudes are considered latent constructs: variables that are not directly observable but are measurable by indirect means, such as verbal expression or overt behavior. Obtaining verbal statements from respondents generally requires that the respondent perform a task such as ranking, rating, sorting, or making a choice or a comparison.
The most common techniques for measuring attitudes include the following:
The Constant Sum Sorting Scale is a technique wherein the respondents are asked to allocate a constant sum of units, such as points, dollars, chips, or chits, among the stimulus objects according to some specified criterion. In other words, a Constant Sum Sorting Scale is a scaling technique that involves dividing a fixed number of units among the objects or attributes being rated, reflecting the importance a respondent attaches to each. The Constant Sum Sorting Scale works best with respondents having a higher education level; the results will approximate interval measures. An example of a Constant Sum Sorting Scale is given in Exhibit 3.8.
Exhibit 3.8
Constant Sum Sorting Scale
Divide 100 points among the following brands according to your preference for each brand:
Brand A _______________
Brand B _______________
Brand C _______________
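Because every respondent must allocate exactly the same total, constant sum answers are usually checked before analysis. The sketch below is one hypothetical way to do that in Python; the function names and the sample allocation are assumptions made for illustration.

```python
def check_constant_sum(allocation, total=100):
    """Return True if the allocation is non-negative and adds up to `total`."""
    if any(points < 0 for points in allocation.values()):
        return False
    return sum(allocation.values()) == total


def preference_shares(allocation, total=100):
    """Convert a valid allocation into each brand's share of the total."""
    if not check_constant_sum(allocation, total):
        raise ValueError("allocation must be non-negative and sum to the total")
    return {brand: points / total for brand, points in allocation.items()}


# Hypothetical answers from one respondent.
answer = {"Brand A": 50, "Brand B": 30, "Brand C": 20}
print(check_constant_sum(answer))  # True
print(preference_shares(answer))   # {'Brand A': 0.5, 'Brand B': 0.3, 'Brand C': 0.2}
```

An allocation such as 50, 30, and 30 would fail the check, since it sums to 110 rather than 100.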
Of the techniques for measuring attitudes listed above, Rating Scales are perhaps the most common form of measuring attitudes. Some examples of rating scales are as follows.
To measure attitude, researchers assign scores or weights, which are not printed on the questionnaire, to the answers. Strong agreement, for example, might indicate the most favorable attitude toward whatever question or statement is being presented about an object or topic. For this response, the weight of five would be assigned to “Strongly Agree.” If, in this same question set, a negative question or statement were presented about the same object or topic, the weights would be reversed and a response of “Strongly Disagree” would be assigned the weight of five. The total score is the summation of the weights assigned to all of an individual’s responses.
In the Likert procedure, many statements must first be generated to assess or describe a certain construct. Once the initial item list has been created, an item analysis is then performed to determine which of those initial items are the strongest. The strongest items are retained for the final scale. The purpose of the item analysis is to ensure the items retained are the strongest predictors of positive or negative responses, and therefore, truly discriminate among those with positive and negative attitudes. Items that are poor, because they lack clarity or elicit mixed response patterns, are eliminated from the final statement list. Questions with no variation are also removed. This step in the design of a questionnaire is too often neglected by business researchers, but is essential to truly ensuring your questionnaire is as strong as possible.
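The text does not prescribe a specific item analysis method. One common approach, assumed here purely for illustration, is to drop items with no variation and then keep items whose ratings correlate with the total of the remaining items (a corrected item-total correlation). The function names, the 0.3 cutoff, and the sample data are all hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def item_analysis(responses, min_correlation=0.3):
    """Return the names of the items to keep.

    responses: dict mapping item name -> list of ratings, one per respondent
    min_correlation: assumed cutoff, not taken from the text
    """
    kept = []
    n = len(next(iter(responses.values())))
    for item, ratings in responses.items():
        if pvariance(ratings) == 0:
            continue  # no variation, so the item cannot discriminate
        # Total of all *other* items for each respondent.
        rest = [
            sum(responses[other][i] for other in responses if other != item)
            for i in range(n)
        ]
        if pearson(ratings, rest) >= min_correlation:
            kept.append(item)
    return kept


# Hypothetical pilot data: five respondents, four candidate items.
pilot = {
    "q1": [5, 4, 2, 1, 5],
    "q2": [4, 5, 1, 2, 4],
    "q3": [3, 3, 3, 3, 3],  # no variation
    "q4": [3, 1, 2, 3, 1],  # runs against the other items
}
print(item_analysis(pilot))  # ['q1', 'q2']
```

In practice the pilot sample would be far larger than five respondents, and the cutoff should be chosen to suit the study.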
Exhibit 3.10 provides examples of wording on rating scales for various attributes.
Sometimes, the results are displayed graphically to provide a quick overall profile of the findings, as shown in Exhibit 3.11. In this example, an 11-point Likert rating instrument was used to compare the attitudes of consumers toward two different airlines (indicated by either a solid or broken line). Each of the 14 horizontal lines represents one question, and both airline profiles are graphed across them. The ticks on the lines correspond to the rating values from one to eleven (left to right). Positive attitudes are indicated on the left; conversely, negative attitudes are indicated on the right.
A weight is assigned to each position on the rating scale. Traditionally, scores are 7, 6, 5, 4, 3, 2, 1 or +3, +2, +1, 0, −1, −2, −3. Many business researchers assume that the semantic differential provides interval data, but some critics argue that the data has only ordinal properties since the weights are arbitrary.
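The two traditional weighting schemes differ only by a constant shift, which the short Python sketch below makes explicit. The sample ratings are hypothetical, and the function name is an assumption made for the example.

```python
from statistics import mean

# The seven positions on the rating line, left to right, under each scheme.
UNIPOLAR = [7, 6, 5, 4, 3, 2, 1]
BIPOLAR = [3, 2, 1, 0, -1, -2, -3]


def to_bipolar(weight, midpoint=4):
    """Convert a 7..1 weight to the matching +3..-3 weight."""
    return weight - midpoint


# Every unipolar weight maps onto its bipolar counterpart.
assert [to_bipolar(w) for w in UNIPOLAR] == BIPOLAR

# Hypothetical ratings from four respondents on one item.
ratings = [7, 5, 4, 6]
print(mean(ratings))                           # 5.5
print(mean(to_bipolar(r) for r in ratings))    # 1.5
```

Because the shift is constant, the ordering of respondents and the differences between their scores are the same under either scheme; only the location of zero changes.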
The graphic scale has the advantage of allowing the researchers to choose any interval they wish for the purposes of scoring. The disadvantage of the graphic scale is that there are no standard answers.
A frequently used variation on the graphic scale design is the scale ladder; this and other picture or graphic response options enhance communication with respondents.
A table summary comparison of the rating scales is provided in Exhibit 3.13.
A question that poses some problem and asks the respondent to answer in his or her own words.
What things do you like most about your job?
What names of local banks can you think of offhand?
What comes to mind when you look at this advertisement?
Do you think that there are some ways in which life in the United States is getting worse? Please explain why you feel this way.
A question in which the respondent is given specific limited alternative responses and asked to choose the one closest to his or her own viewpoint.
Did you work overtime and/or did you work at more than one job this past week?
Yes ____ No ____
Compared to ten years ago, would you say that the quality of most products made in Japan is higher, about the same, or not as good?
Higher ____ About the same ____ Not as good ____
How much of your shopping for clothing and household items do you do in discount stores? Would you say:
All ____
Most ____
About half ____
About one-quarter ____
Less than one-quarter ____
In management, is there a useful distinction between what is legal and what is ethical?
Yes ____ No ____
In Aesop’s fable “The Ant and the Grasshopper,” the ant spent his time working and planning for the future, while the grasshopper lived for the moment and enjoyed himself. Which are you more like?
The ant ____ The grasshopper ____
A question that requires the respondent to choose one of two dichotomous alternatives.
Did you make any long-distance calls last week?
Yes ____ No ____
A type of fixed-alternative question that requires a respondent to choose one (and only one) response from among several possible alternatives.
Please give us some information about your flight. In which section of the aircraft did you sit?
First class ____ Business class ____ Coach class ____
A measure of attitude consisting of a graphic continuum that allows respondents to rate an object by choosing any point on the continuum.
Please evaluate each attribute in terms of how important it is to you by placing an “X” at the position on the horizontal line that most reflects your feelings.
Seating comfort: Not Important ____________________ Very Important
In-flight meal: Not Important ____________________ Very Important
Air fare: Not Important ____________________ Very Important
For this question, select the positive numbers for words you think accurately describe the supervisor. Larger positive numbers indicate greater accuracy of the word in describing the supervisor. Select the negative numbers for words you think do not accurately describe the supervisor. Larger negative numbers indicate less accuracy of the word in describing the supervisor. Therefore, select positive numbers for words that you think are very accurate, and select negative numbers for words that you think are very inaccurate.
A type of fixed-alternative question that asks about the general frequency of an occurrence.
How frequently do you watch the television channel MTV?
Every day ____
5–6 times a week ____
2–4 times a week ____
Once a week ____
Less than once a week ____
Never ____
Below are examples of behavioral questions. Typical behavioral questions include “I” statements or phrases.
I would write a letter to my Congressman or other government official in support of this company if it were in a dispute with the government.
How likely is it that you will change jobs in the next six months?
The U.S. Census Bureau has used a scale of subjective probabilities, ranging from 100 for “absolutely certain” to 0 for “absolutely no chance,” to measure expectations. Management researchers have used the following similar subjective probability scale to estimate the chance that job candidates would accept a position if it were offered to them:
____ 100% (Absolutely certain) I will accept
____ 90% (Almost sure) I will accept
____ 80% (Very big chance) I will accept
____ 70% (Big chance) I will accept
____ 60% (Not so big a chance) I will accept
____ 50% (About even) I will accept
____ 40% (Smaller chance) I will accept
____ 30% (Small chance) I will accept
____ 20% (Very small chance) I will accept
____ 10% (Almost certainly not) I will accept
____ 0% (Certainly not) I will accept
Selecting a Measurement Scale
There is no single “best” scale that applies to all research projects. The scale you choose will be a function of the nature of the attitudinal object being measured, the manager’s defined problem, and/or the linkages to other choices that have already been made (e.g., telephone survey vs. mail survey). There are several issues that will be helpful to consider: