CHAPTER ONE

DIVERSITY BONUSES: THE IDEA

The power of a theory is exactly proportional to the diversity of situations it can explain.

—ELINOR OSTROM, Governing the Commons: The Evolution of Institutions for Collective Action

On April 8, 1865, one week before his assassination at Washington’s Ford’s Theatre, Abraham Lincoln visited a field hospital near Petersburg, Virginia. To raise morale among the wounded troops, Lincoln picked up an ax and began chopping wood. As a youth, he had split thousands of fence rails to earn money or goods in kind—he was once paid in dyed brown cloth sufficient to make him a pair of trousers. On the day of his visit to the field hospital, he demonstrated to all assembled that the famed “Rail Splitter” could still “make the chips fly.”1

Suppose that you had to hire a team of people to split rails. You would look for strong, tall people like Lincoln who are best at splitting rails. The logic borders on the tautological: the best team of rail splitters consists of the best individuals.

That logic makes sense because splitting rails is a separable task. The number of rails split by the team equals the sum of the rails split by each person. That logic does not apply for teams of people who work on the complex tasks we confront in our modern, information-rich society. In those settings, a team’s performance depends on the diversity as well as the ability of its members. As a result, a policy of hiring the best does not make sense on high-dimensional tasks. The best team will not consist of the “best” individuals. It consists of diverse thinkers.

The idea that diverse ways of thinking can lead to deeper insights is not new. It can be found in the writings of Aristotle. Lincoln himself applied a logic of diversity when appointing his cabinet. He did not create an echo chamber of like-minded people. He chose a diverse cabinet, the famed team of rivals.2 He opted for diversity partly to build political consensus but primarily because he faced complex problems. As he wrote in his December 1862 message to Congress, “The occasion is piled high with difficulty, and we must rise with the occasion. As our case is new, so we must think anew and act anew. We must disenthrall ourselves.”

We too must disenthrall ourselves. We now operate and interact in a complex world in which we work with our minds, not our backs. We must therefore also think anew. We must abandon the narrow and demonstrably false belief that we should admit, hire, and promote those who perform best according to a common standard. As I show later in this book, those who score highest will tend to be similar. Hiring “the best” will reduce the diversity of our scientific teams, our planning commissions, and our boards of directors, and with it their collective potential.

On the complex tasks we now carry out in laboratories, clean rooms, boardrooms, courtrooms, and classrooms, we need people who think in different ways. And not in arbitrarily diverse ways. Effective diverse teams are built with forethought. Not all teams of rivals will succeed. Not all multitudes possess wisdom. To realize the benefits of diversity, we need logic and theory to identify the types of diversity that improve outcomes and to understand the conditions under which they do so. And then we need practice.

Getting the logic correct takes precedence. Otherwise, we cannot compose the best possible teams, and we limit what we can achieve even with practice. That is the main reason for this book: to help us get the logic right. To get us to embrace the contrary assumption and to make our world better.

In this chapter, I sketch the core logic for how diversity produces bonuses. That logic relies on linking cognitive diversity, which I define as differences in information, knowledge, representations, mental models, and heuristics, to better outcomes on specific tasks such as problem solving, predicting, and innovating. Cognitive diversity differs from identity diversity, meaning differences in race, gender, age, physical capabilities, and sexual orientation. That said, identity diversity, along with education and work and life experience, contributes to cognitive differences. For the moment, we will keep the two separate.

DIVERSITY BONUSES ON COMPLEX TASKS

To sketch the core logic, I borrow a stripped-down model that I developed with Jon Bendor. This model reduces cognitive repertoires to collections of tools.3 Think of these tools as analytic analogues of a carpenter’s tools. A carpenter has a chainsaw; a mathematician knows the chain rule. A carpenter attaches boards with a nail gun; a plant biologist inserts DNA with a gene gun.

I use that model to show the logic of how diversity bonuses arise. I then connect assumptions about the diversity of tools that people possess to the complexity of the challenge or opportunity at hand. That second step includes two purposefully incomprehensible graphs.

In the tool-based model, I assign a unique letter to each tool. Figure 1.1 shows three people and their cognitive tools. Define the ability of a person as the number of tools she knows. Ann possesses five tools, so she has an ability of five. Barry, in the center, has an ability of four, and Cam has an ability of three. Ann is the best.

Figure 1.1  Three People and Their Cognitive Tools

Similarly, define the ability of a team as the number of unique tools collectively known. If two team members know the same tool, that counts as a single tool. Next, define the diversity of a team as the percentage of the team’s tools known by only a single person.4 Two people with no tool overlap have 100 percent diversity. Two people with the same tools have 0 percent diversity.

This simple model produces two types of diversity bonuses. First, a diversity bonus occurs if someone adds a unique tool. When this happens, we can add someone of less ability to a group and make the group smarter. That’s a bonus. In addition, if teams can apply combinations of tools, then adding a person with a new tool produces new combinations, a second bonus.

Working through the example clarifies how bonuses arise. Figure 1.2 shows two possible teams and the union of their cognitive tools. The first team consists of the two highest-ability people, Ann and Barry. Ann has an ability of five. Adding Barry to the team adds one more tool, giving the team an ability of six. That’s a bonus of one. The second team consists of Barry and Cam. Barry has an ability of four. The team of Barry and Cam has an ability of seven. Thus, when paired with Barry, Cam produces a diversity bonus of size three. If Cam were paired with Ann, he would only produce a bonus of size one. Thus, the bonus someone produces depends on the team.
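The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tool letters are the ones the chapter attributes to Ann, Barry, and Cam, while the function names (`ability`, `team_ability`, `diversity`, `bonus`) are hypothetical labels chosen for this sketch.

```python
# Tool sets for the three people in the first example.
TOOLS = {
    "Ann": {"A", "B", "C", "D", "E"},
    "Barry": {"A", "B", "E", "F"},
    "Cam": {"C", "D", "G"},
}

def ability(person):
    """Individual ability: the number of tools a person knows."""
    return len(TOOLS[person])

def team_tools(team):
    """Union of the tools known by everyone on the team."""
    return set().union(*(TOOLS[p] for p in team))

def team_ability(team):
    """Team ability: the number of unique tools collectively known."""
    return len(team_tools(team))

def diversity(team):
    """Percentage of the team's tools known by exactly one member."""
    tools = team_tools(team)
    solo = [t for t in tools if sum(t in TOOLS[p] for p in team) == 1]
    return 100 * len(solo) / len(tools)

def bonus(team, newcomer):
    """Number of tools the newcomer adds beyond what the team already has."""
    return team_ability(list(team) + [newcomer]) - team_ability(team)

print(team_ability(["Ann", "Barry"]), bonus(["Ann"], "Barry"))  # 6 1
print(team_ability(["Barry", "Cam"]), bonus(["Barry"], "Cam"))  # 7 3
print(bonus(["Ann"], "Cam"))                                    # 1
print(diversity(["Barry", "Cam"]))                              # 100.0
```

The same person produces a different bonus on different teams because `bonus` depends only on what the newcomer adds to the existing union of tools.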

Figure 1.2  Two Teams and Their Cognitive Tools

As suggested, counting the number of tools understates the potential diversity bonuses if tools can be applied in combination. We will start with the team of Ann and Barry. Ann knows five tools, so she could apply up to ten unique pairs of tools. She and Barry know six tools, creating fifteen unique pairs. By adding Barry, she gets one new tool and five new pairs of tools.

Barry, on the other hand, knows four tools and therefore six unique pairs of tools. When Cam is added to Barry’s team, Cam adds three tools, as well as three pairs of tools: {CD, CG, DG}. When combined with Barry’s four tools, his three tools create twelve new pairs of tools.5 Thus, Cam adds fifteen potential pairs of tools. That is what is meant by a diversity bonus.

Note also that Barry and Cam are the best team of size two. That team jointly possesses the most tools and therefore has the most ability. In this example, as in many others that will follow, the best team does not consist of the two highest-ability people.
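The pair counts and the claim about the best two-person team can be checked by brute force. The sketch below assumes the same tool letters as the example; the helper name `pairs` is hypothetical.

```python
from itertools import combinations

TOOLS = {
    "Ann": {"A", "B", "C", "D", "E"},
    "Barry": {"A", "B", "E", "F"},
    "Cam": {"C", "D", "G"},
}

def pairs(tools):
    """All unordered pairs that can be formed from a set of tools."""
    return set(combinations(sorted(tools), 2))

ann, barry, cam = TOOLS["Ann"], TOOLS["Barry"], TOOLS["Cam"]

# Ann alone, then Ann with Barry.
print(len(pairs(ann)), len(pairs(ann | barry)))      # 10 15
print(len(pairs(ann | barry) - pairs(ann)))          # 5 new pairs

# Barry alone, then the pairs Cam adds to Barry's team.
new_with_cam = pairs(barry | cam) - pairs(barry)
print(len(pairs(barry)), len(new_with_cam))          # 6 15
print(len(pairs(cam)), len(new_with_cam - pairs(cam)))  # 3 within Cam, 12 cross pairs

# Enumerate every team of size two and pick the one with the most tools.
best = max(combinations(TOOLS, 2), key=lambda team: len(TOOLS[team[0]] | TOOLS[team[1]]))
print(best)  # ('Barry', 'Cam')
```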

Complexity and Bonuses

This example would seem to create diversity bonuses without any reliance on complexity; that is, on the task being part of a context that is difficult to predict, explain, or design. That is not true. The assumptions that I made on the tools that the people possess imply multiple relevant knowledge bases and types of approaches to solving the problem. To see why requires a second example and then some assumptions on the structure of tools.

The second example consists of three people with the tool sets shown in figure 1.3. Notice that no diversity bonuses can arise in this example. The best person knows every tool of the second-best person, who in turn knows every tool of the third person. The best team will be any team that includes that best person.

A comparison of the tool sets in figures 1.1 and 1.3 reveals the key insight. The cognitive tools that people possess in the first example are idiosyncratic. The cognitive tool sets in the second example would come about only if people accumulate tools in the same order, that is, if a person had to learn tool A before tool B and tool B before tool C.

Figure 1.3  Three People Who Produce No Diversity Bonus

As an analogy, think of people riding on a train from Chicago to Los Angeles. At each stop along the way, the conductor tells the history of the station. If one person stays on the train longer than another, that first person learns about more stations than the second. She necessarily knows about every station that the second person knows about.

The cognitive tools shown in figure 1.1 do not satisfy that condition. Here, the person with the fewest tools knows tools the most talented person does not. For this configuration to occur, it must be that tools need not be acquired in a single order. Instead of a train trip, a trip to the zoo would be a more appropriate analogy. One person might spend a full day at the zoo and visit five exhibits (Alligators, Bears, Camels, Ducks, and Elephants). A second person might leave midafternoon after taking in only three exhibits (Camels, Ducks, and Gorillas). The second person learns less, but she gains knowledge of gorillas that the first person does not have. The first person does not know everything the second person knows.
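The difference between the train ride and the zoo trip comes down to whether the tool sets are nested. A minimal sketch follows; the station names are hypothetical placeholders, the exhibits are the ones named above, and `is_nested` is an illustrative helper rather than part of the model itself.

```python
def is_nested(tool_sets):
    """True if, ordered by size, each set contains the one before it."""
    ordered = sorted(tool_sets, key=len)
    return all(small <= large for small, large in zip(ordered, ordered[1:]))

# Train ride: riders differ only in how long they stay on (hypothetical stops).
stops = ["stop_1", "stop_2", "stop_3", "stop_4", "stop_5"]
train = [set(stops[:5]), set(stops[:4]), set(stops[:3])]

# Zoo trip: the two visitors described in the text.
zoo = [
    {"Alligators", "Bears", "Camels", "Ducks", "Elephants"},
    {"Camels", "Ducks", "Gorillas"},
]

print(is_nested(train))  # True: the longest rider knows everything
print(is_nested(zoo))    # False: the shorter visit still adds something
print(zoo[1] - zoo[0])   # {'Gorillas'}
```

When `is_nested` returns True, adding anyone to the person with the largest set cannot add a tool, which is the no-bonus case.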

Figure 1.4 represents these two possibilities in network form. Assume that a person must first learn a tool on the left edge and then can follow any path. The upper path corresponds to the train ride. Diversity doesn’t matter. The best team consists of the best person. Ability rules.

The lower path represents the trip to the zoo. As shown in figure 1.5, the tool sets in the first example can be constructed within this network. Ann can follow a path that leads to A, B, C, D, and E. Barry can learn A, B, E, and F, and Cam can learn tools C, D, and G. The fact that a person can know fewer but different tools means that someone can have less measured ability than the people already in a group but still contribute.

Figure 1.4  Linear and Network Arrangements of Cognitive Tools

Figure 1.5  How Tool Structure Influences Cognitive Diversity

The remaining step in the logic connects the value of diversity to complexity. The intuition will be straightforward: Our accumulation of knowledge, representations, techniques, and models produces elaborate networks of what I am calling tools. This allows people to construct distinct tool sets. That need not be the case for less developed bodies of knowledge, which often create linear orders.

Figure 1.6  Relationships among Topics in Elementary School Mathematics

As an example, consider topics in mathematics. Figure 1.6 shows the relationships between the topics covered in elementary school mathematics. The topics build on one another in a linear fashion. You need to be able to count in order to add, to add in order to multiply and divide, to multiply and divide in order to understand fractions, and to understand fractions in order to define the trigonometric functions sine and cosine. These topics can be represented in a linear order.

In contrast, the advanced mathematical topics in figure 1.7 connect in multiple ways. This is the first incomprehensible graph. To approach a network of this complexity, ignore the technical terms and focus on the many boxes and arrows. Notice that there exist multiple paths a student could pursue. Parts of the network can be understood by anyone. For instance, in the middle of the figure, the integers (1, 2, 3, and so on) point to the rational numbers (fractions), which in turn point to the real numbers (π = 3.1415…). To know the real numbers, a person must first understand integers and fractions. That portion of the network looks like the linear elementary school network.

Figure 1.7  Relationships among Topics in Graduate Mathematics (courtesy Tegmark, “Ultimate Ensemble Theory?”)

Making sense of other parts of the network requires deeper technical knowledge. The graph implies that a person could master Lie groups (in the upper left) without knowing Hilbert spaces, distributions, or quantum field theory (in the upper right). The implication is that the tool sets of professional mathematicians would look like those in our first example. And each mathematician would add diversity to the group.

Making breakthroughs in mathematics often involves combining different tools. A report by the National Academy of Sciences describes “an increasing need for research to tap into two or more fields of the mathematical sciences.”6 Tapping into two fields implies a diversity bonus. Something that could not be proved using either field alone can be solved with tools from two fields.

That same report notes the growing connections between mathematics and other fields including defense, entertainment, physics, economics, computer science, linguistics, manufacturing, finance, and biology. These connections reflect a broader trend toward multidisciplinary inquiries. That trend can be explained by the complexity of modern challenges and opportunities.

Consider the rise of obesity. Some call it an epidemic. Fifty years ago, we might have placed the challenge of reducing obesity within the domain of nutritional sciences. We now understand that it has myriad causes that cross disciplines.

Figure 1.8 shows one attempt, by the Foresight Group in the UK, to explain the obesity epidemic, with arrows denoting causal forces.7 It is meant to be overwhelming. (Yes, this is the second incomprehensible graph.) The disciplinary knowledge embedded in the graph crosses economics, nutrition, physiology, sociology, biology, media studies, advertising, transportation and infrastructure, and genetics.

Figure 1.8  Obesity Knowledge Structure (based on Vandenbroeck, Goossens, and Clemens, Foresight Tackling Obesities: Future Choices—Building the Obesity System Map. Government Office for Science, UK Government’s Foresight Programme, 2007, http://www.foresight.gov.uk/Obesity/12.pdf. Accessed June 16, 2009)

No one person can understand all of the boxes and arrows in the full diagram, and no one person will find a cure for obesity. At best, the combined wisdom of a multitude of diverse people can march us toward less obesity through a constellation of interventions. The types of diversity necessary to find ways to reduce obesity differ from the types needed to advance mathematical knowledge. In each case, though, it hardly needs explaining that diversity in the background knowledge people acquire through education will be of central importance.

Aside: John Milton and TMI & TMK

The explosion of available information and knowledge creates a second reason we need diversity on complex tasks. On a simple task like building a tool shed, a single person might have sufficient knowledge, but that is not true for complex tasks like reducing obesity, calculating a supply chain, managing and investing a portfolio, or advancing the field of mathematics. In those domains, no single person can master all relevant knowledge. There exists too much information (TMI) and too much knowledge (TMK).

TMI and TMK imply the necessity of diversity. When Euclid was writing his axioms, a person could learn all of mathematics. That is not true today. Think back to the diagram of mathematical knowledge. There’s too much to know.

The polymath John Milton might well be the pivotal person in the transition from all-knowing experts to a world of intelligent people with diverse repositories of information and knowledge. Born in 1608 in Cheapside in the city of London, England, Milton introduced more than six hundred words into the English language. He traveled the world, discussing ideas with luminaries from Galileo to Isaac Newton.

In 1640, at the peak of Milton’s career, the British Library contained fewer than forty thousand books. A voracious reader in a dozen languages, Milton learned a substantial percentage of what was knowable. Reading two books a day, he could have read fifteen thousand books by age thirty and, by age fifty, could have made his way through the bulk of the British Library.

With nearly one and a half million books now published each year, a modern Milton could not make it through a week’s production of knowledge. To finish the quarter million books published by traditional book publishers each year would involve reading fifteen books a day for fifty years, leaving little time to digest the more than one and a half million academic papers published annually.
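The reading arithmetic is easy to check. The sketch below uses only the figures quoted above, plus two labeled assumptions: a 365-day reading year and a forty-year reading life for Milton (roughly age ten to fifty).

```python
DAYS_PER_YEAR = 365  # assumption: no days off

def years_to_read(books, books_per_day):
    """Years needed to read a given number of books at a steady daily pace."""
    return books / (books_per_day * DAYS_PER_YEAR)

# Milton at two books a day.
print(round(years_to_read(15_000, 2), 1))   # 20.5 years for fifteen thousand books
READING_LIFE_YEARS = 40                     # assumption: reading from about age ten to fifty
read_by_fifty = 2 * DAYS_PER_YEAR * READING_LIFE_YEARS
print(read_by_fifty, read_by_fifty / 40_000)  # 29200 0.73

# A modern reader at fifteen books a day.
print(round(years_to_read(250_000, 15), 1))   # 45.7 years, close to fifty

# One week of output at about one and a half million books a year.
print(round(1_500_000 / 52))                  # 28846
```

Under these assumptions, a full reading life at Milton's pace covers about one week of today's book production.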

A modern Milton can only know a thin slice of what’s knowable. That observation holds true within academic disciplines, professions, and industry. The information and knowledge produced within organic chemistry, oncology, or economic sociology overwhelms the capacity of any one person. Hence, we need teams. And we need teams that include people with diverse information and knowledge.

IDENTITY AND COGNITIVE DIVERSITY

The Milton problem demonstrates the necessity of working with people who explore different parts of the library and the web. Many of our complex challenges involve understanding the actions, preferences, and capabilities of diverse people. Thus, identity diversity also contributes relevant cognitive diversity.

The aforementioned efforts to reduce obesity require understanding economic, social, and psychological influences on behavior, as well as the impact of media. Our understandings of those dimensions will benefit from identity-diverse teams. The analysis of the effects of infrastructure will benefit from people from different geographic regions and from urban and rural locations. Overall, identity diversity may matter about as much as disciplinary diversity. To not include any men, or any women, on a team formulating an obesity-reduction program would be as shortsighted as not including a geneticist or a psychologist.

The connections between identity diversity and relevant cognitive diversity in mathematics are less obvious. Could gender, race, ethnicity, or physical capabilities influence the representations and analytic tools a mathematician applies? Sure. In mathematical research, identity is less germane than academic training, though it is possible that a person’s identity could influence how she represents a mathematics problem as well as the problems she chooses to tackle. That’s truer for the frontiers of math, where mathematicians often rely on analogies and knowledge from other experiences.

The lack of an obvious logic linking identity diversity to germane cognitive diversity in fields like math or physics does not mean that those fields do not need to be inclusive. On the contrary, because the mathematics community confronts hard problems, it needs cognitive diversity.

Permit me a slight digression to make a larger point linking inclusion to cognitive diversity. Define the capacity of a mathematician as the number of tools she can acquire. We can think of her career as traversing a path in figure 1.7. A great mathematician might learn about twenty topics, a good one only fifteen. Excluding some identity groups from being mathematicians or making the field less attractive to some groups results in a cohort of mathematicians with lower overall capacity. If a woman with a capacity of twenty opts out of mathematics, and a man with a capacity of sixteen replaces her, then mathematics suffers. The profession loses talent because she has more capacity, and it loses diversity because that larger capacity would have let her trace a longer path through the network and acquire tools that others lack.

Fifty years ago, people chalked up the low representation of women and some racial groups in mathematics, and science generally, to a lack of interest—“Women do not want to become physicists.” As recently as twelve years ago, some attributed the low numbers in these professions (offensively, I might add) to a lack of cognitive ability. Current thinking points to the effects of limited opportunities and exposure, the lack of role models, and the effects of noninclusive behaviors and discrimination.

Personal accounts of women who entered school with the interest and ability to excel at mathematics and science but pursued other paths reveal the accumulated dampening of interest produced by repeated acts of discrimination. Some actions were overt and direct. Others were subtler. Combined, they made science an unwelcoming place.

As an undergrad, I took a two-year math sequence listed as Honors Track II that students referred to as “math for gods.” Lacking any training in calculus, I struggled during the first two courses. Recently, I looked up three students who had excelled in those classes. All three have enjoyed successful careers. One works as the chief actuary and risk officer at a large insurance company. A second serves as a chaired professor of law at the University of Chicago. The third, the only woman of the three, began her career in engineering, rose to become a senior software engineer, and now works as a life coach, facilitator, and counselor.

Personal accounts of women who tried to pursue scientific careers reveal any number of obstacles, both direct and indirect.8 The fact that the two men remain in technical fields and the one woman opted out is not surprising, but it is disheartening. We lose talent and diversity when environments are not inclusive.

Data gathered by the National Science Foundation reveal low representation of women and minorities in many technical fields, and we cannot but infer lost diversity bonuses. In 2013–2014, 1,200 US citizens earned PhDs in mathematics. Of these scholars, 12 were African American men and just 6 were African American women. From 1973 to 2012, over 22,000 white men earned PhDs in physics, as compared to only 66 African American women and 106 Latinas. Those numbers translate into over 550 white men and fewer than 2 black women earning PhDs each year. Over that same time period, about 15 Asian American women earned physics PhDs each year.9

In addition, recall how mathematics connects to other disciplines and how those connections can produce bonuses. A person may apply his mathematical tools to a problem that leverages identity-based knowledge or interests.

Thus, even if we see no obvious direct links between identity and relevant cognitive diversity within a technical field, diversity and inclusion produce bonuses by increasing the pool of talent and the range of problems studied. Think back first to the complicated graph of mathematical knowledge. People with greater capacity can trace out longer paths in that graph. Their talent adds diversity. In addition, on cross-disciplinary complex tasks like the obesity epidemic or rising opioid use, identity-based knowledge or perspectives become germane, and identity brings relevant cognitive diversity.

LESSONS FROM THE TOOLBOX MODEL

The toolbox model reveals how complexity, whether within a field like mathematics or in the context of a problem like obesity, creates the potential for diversity bonuses. If the domains were not complex and tools were arranged linearly, the smartest person would know everything that everyone else knows. When tools can be acquired in any number of orders and there exist a large number of relevant tools—that is, when the domain is complex—the potential for diversity bonuses exists.

Complexity and Diversity Bonuses

If cognitive tools must be accumulated in a particular order, like the stations on a train trip, then the best team consists of the highest-ability person and no diversity bonuses exist. If cognitive tools can be accumulated along multiple paths, that is, if the field (mathematics) or the challenge (reducing obesity) is complex, then diversity bonuses can exist because different people master different relevant tools.

The toolbox model represents people as possessing a collection of tricks or techniques to solve problems. If a person possesses tools that her teammates lack, then she can produce bonuses. The same logic described with respect to these tools can be applied to the various parts of a person’s repertoire: her information, knowledge, models, representations, or heuristics. When there exist only a few tools that must be acquired in a specific order, then we should not expect bonuses. A single person could master all the tools necessary. We need not build teams or seek diversity bonuses. When repertoires can be accumulated along multiple paths and when there exists an abundance of relevant ways of thinking for some task, then diversity bonuses can exist.

Like any model, this tool model oversimplifies. It assumes that everyone trusts and understands one another, that people can recognize improvements, and that no communication costs (or other costs, for that matter) arise when enlarging the team. Without any costs to scaling, the model implies that we should make teams as large as possible. Larger teams would possess more tools and be more likely to excel at a task. In real situations, communication and coordination costs rise with team size, so even though more people would mean more cognitive tools, larger teams need not perform better.

THE (INAPT) PORTFOLIO ANALOGY

I have found that the most common explanation that people give for the benefits of identity diversity rests on a portfolio analogy from finance. That analogy is inapt and unfortunate. Diversity bonuses are not at all the same as portfolio effects. Not only does portfolio thinking offer little guidance for how to hire employees or assemble teams, it also systematically understates diversity’s contribution.

The portfolio analogy can be stated as follows: Fund managers invest in a variety of diverse stocks to earn robust returns. By analogy, organizations should create identity-diverse and cognitively diverse teams. For the analogy to be useful, the benefits fund managers receive from diverse investments must be analogous to the benefits organizations receive from diverse people. That is not true.

Fund managers select diverse investments to reduce variation in returns—to lessen risk. Organizations want diverse employees for different reasons. Why they want diversity and the type of diversity they want depends on the task. For example, organizations want diverse problem-solving teams because those teams come up with more ideas that they can recombine to produce bonuses. They want diverse forecasting teams because those teams make more accurate predictions. In both cases, diversity produces bonuses. It does not reduce risk.

A more detailed comparison of investment portfolios and teams of people reveals that the mechanisms through which diversity operates also differ. When building an investment portfolio, a fund manager wants high return and low risk. As a rule, higher-return investments come with higher risk. That follows from economic logic: if high-return, low-risk investments were available, they would attract many investors. This would raise the price of those investments and lower their returns.

A fund manager therefore must accept risk to earn high returns. A manager can earn (relatively) high returns with low risk by investing in negatively correlated stocks, that is, a diverse portfolio. Figure 1.9 shows a portfolio containing four stocks: a technology stock that returned 4 percent, an oil stock that returned 9 percent, an airline stock that lost 4 percent, and an automobile stock that earned 3 percent.

When the fund manager made these investments, she did not know what their returns would be. The returns depend on what financial analysts call the state of the world. No one can know the state of the world a year ahead of time. The idea is to select a portfolio of stocks that pays well regardless of what happens in the economy.

In our example, perhaps the airline stock lost money because of high energy prices. Airline stocks suffer under those conditions. Luckily, in that same state of the world, oil stocks perform well. Had a different state of the world arisen, the airline stock might have done well while the oil stock fell in price. This balancing, called negative correlation, explains why a fund that includes airline and oil stocks has less risk.

Figure 1.9  A Diverse Portfolio and Risk Reduction

The mathematics is not complicated. The fund manager’s return on the entire portfolio equals the average of the returns on its investments, weighted by the amount invested in each. Diversification brings value by reducing the variation in that average return.
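A toy calculation shows how negative correlation reduces variation. Everything in this sketch is an assumption made for illustration: two equally likely states of the world, equal weights, and mirror-image returns for the oil and airline stocks. Only the 9 percent gain and 4 percent loss in the first state echo the example.

```python
from statistics import mean, pstdev

STATES = ["high_energy_prices", "low_energy_prices"]

# Hypothetical returns (percent) by state of the world.
RETURNS = {
    "oil": {"high_energy_prices": 9.0, "low_energy_prices": -4.0},
    "airline": {"high_energy_prices": -4.0, "low_energy_prices": 9.0},
}

def portfolio_stats(holdings):
    """Expected return and standard deviation of an equally weighted portfolio."""
    by_state = [mean(RETURNS[stock][state] for stock in holdings) for state in STATES]
    return mean(by_state), pstdev(by_state)

print(portfolio_stats(["oil"]))             # (2.5, 6.5)
print(portfolio_stats(["airline"]))         # (2.5, 6.5)
print(portfolio_stats(["oil", "airline"]))  # (2.5, 0.0)
```

In this contrived case the combined portfolio keeps the same expected return while its variation drops to zero; real stocks are never perfectly negatively correlated, so the reduction is partial.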

A team of diverse people solving a problem does not operate at all like a collection of stocks. The analogy that people have different payoffs depending on the state of the world is strained at best. Furthermore, the problem-solving team’s performance does not equal the average of its members. Instead, the team could ignore everything except the best solution. The analogue would be that after the state of the world was realized, the fund manager could drop every investment except for the oil stock and earn a 9 percent return. Of course, she cannot. She’s stuck with the average of her pool of investments.

Note the difference: the portfolio performs like the average. The problem-solving team performs like the best. Actually, the team can perform even better if team members share ideas. If they can improve on the best idea or combine it with another idea, they can do better than the best (see figure 1.10).
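The contrast can be made concrete with the four returns from the portfolio example. The sketch assumes equal amounts invested in each stock, which the text does not specify.

```python
# Returns (percent) for the four stocks in the portfolio example.
returns = {"technology": 4, "oil": 9, "airline": -4, "automobile": 3}

# The fund manager is stuck with the average (equal weights assumed).
portfolio_return = sum(returns.values()) / len(returns)

# A problem-solving team can discard everything except its best idea.
best_single = max(returns.values())

print(portfolio_return)  # 3.0
print(best_single)       # 9
```

The averaging operation is what makes a portfolio a risk-reduction device; the maximum is what makes a team a bonus-producing one.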

The relative importance of diversity in the two settings should be clear. If the return on a stock portfolio could exceed its best single investment, fund managers would construct much more diverse portfolios than they do at present. They would not care if an investment had a high probability of a large loss, provided it had some chance of generating a huge return. Thus, when applied to problem-solving teams, the portfolio analogy misstates the reasons for diversity and understates its value.

Figure 1.10  Diverse Problem-Solving Teams Can Outperform Their Best Member

On predictive tasks, the analogy also fails. Later, I show that the average of a diverse collection of predictions must be more accurate than its members are on average. The portfolio analogy therefore understates the contribution of diversity. Groups again do better than average. The portfolio analogy also undervalues the contribution of diversity for generating ideas, verifying truth, evaluating projects, and innovating. Nor will the analogy apply to situations in which diverse repertoires originate something new. In each case, diversity does more than reduce risk. It produces bonuses.
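One way to see why averaging helps, which may or may not match the later argument, is a standard identity for squared error: the squared error of the average prediction equals the average squared error of the individual predictions minus the spread of the predictions around their average. The numbers below are hypothetical.

```python
from statistics import mean

def squared_error_breakdown(predictions, truth):
    """Return (error of the average, average error, spread of predictions)."""
    crowd = mean(predictions)
    crowd_error = (crowd - truth) ** 2
    average_error = mean((p - truth) ** 2 for p in predictions)
    spread = mean((p - crowd) ** 2 for p in predictions)
    return crowd_error, average_error, spread

# Four hypothetical predictions of a quantity whose true value is 4.0.
crowd_error, average_error, spread = squared_error_breakdown([2.0, 3.0, 4.5, 5.0], truth=4.0)

print(crowd_error)             # 0.140625
print(average_error)           # 1.5625
print(average_error - spread)  # 0.140625
```

Because the spread term is never negative, the average prediction can never be less accurate than the average predictor, and it is strictly more accurate whenever the predictions differ.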

THE NETFLIX PRIZE

On October 2, 2006, Reed Hastings, the CEO of Netflix, announced an open competition to predict customers’ movie ratings. On that date, Netflix released data consisting of one hundred million movie ratings of one to five stars for seventeen thousand movies from their nearly half a million users, the largest data set ever made available to the public. Contest rules were as follows: any contestant who could predict consumer ratings 10 percent more accurately than Netflix’s proprietary Cinematch algorithm would be awarded a $1,000,000 prize.10 Netflix had poured substantial resources into developing Cinematch. Improving on it by 10 percent would not prove easy.

The story of the Netflix Prize differs from traditional diversity narratives in which a single talented individual, given an opportunity, creates a breakthrough because of some idiosyncratic piece of information. Instead, teams of diverse, brilliant people competed to attain a goal. The contest attracted thousands of participants with a variety of technical backgrounds and work experiences. The teams applied an algorithmic zoo of conceptual, computational, and analytical approaches. Early in the contest, the top ten teams included a team of American undergraduate math majors, a team of Austrian computer programmers, a British psychologist and his calculus-wielding daughter, two Canadian electrical engineers, and a group of data scientists from AT&T research labs.

In the end, the participants discovered that their collective differences contributed as much as or more than their individual talents. By sharing perspectives, knowledge, information, and techniques, the contestants produced a sequence of quantifiable diversity bonuses.

Winning the Netflix Prize required the inference of patterns from an enormous data set. That data set covered a diverse population of people. Some liked horror films. Others preferred romantic comedies. Some liked documentaries. The modelers would attempt to account for this heterogeneity by creating categories of movies and of people.

To understand the nature of the task, imagine a giant spreadsheet with a row for each person and a column for each movie. If each user rated every movie, that spreadsheet would contain over 8.5 billion ratings. The data consisted of a mere 100 million ratings. Though an enormous amount of data, it fills in fewer than 1.2 percent of the cells. If you opened the spreadsheet in Excel, you would see mostly blanks. Computer scientists refer to this as sparse data.
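The arithmetic behind the sparsity figure is short. This sketch uses the rounded counts quoted above (half a million users, seventeen thousand movies), so the results are approximate; the exact counts in the released data differ slightly. The dictionary at the end is a hypothetical illustration of how sparse data is usually stored: only the filled cells are kept.

```python
users = 500_000        # "nearly half million", rounded up (assumption)
movies = 17_000        # "seventeen thousand", rounded (assumption)
ratings = 100_000_000  # "one hundred million"

cells = users * movies
density = ratings / cells
print(f"{cells:,} cells, {density:.2%} filled")

# A sparse representation stores only the known ratings, keyed by
# (user, movie); the blanks simply have no entry.
known = {("user_1", "movie_7"): 4, ("user_1", "movie_9"): 2}
print(known.get(("user_1", "movie_8")))  # a blank: no rating recorded
```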

The contestants had to predict the blanks, or, to be more precise, predict the values for the blanks that consumers would fill in next. Inferring patterns from existing data, what data scientists call collaborative filtering, requires the creation of similarity measures between people and between movies. Similar people should rank the same movie similarly. And each person should rank similar movies similarly.

A team knows it has constructed effective similarity measures if the patterns identified in the existing data hold for the blanks. Characterizing similarity between people or movies involves difficult choices: Is Mel Brooks’s spoof Spaceballs closer to the Airplane! comedies or to Star Wars, the movie that Spaceballs parodied?
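One way to make a similarity measure between people concrete is sketched below. It is an illustration only, not any contestant's method: the three users, their ratings, and the distance-based similarity formula are all hypothetical choices. A blank is predicted by weighting other people's ratings of that movie by how similar they are to the person in question.

```python
# Hypothetical sparse ratings: only rated movies appear.
ratings = {
    "ana":  {"Spaceballs": 5, "Airplane!": 4, "Star Wars": 2},
    "ben":  {"Spaceballs": 4, "Airplane!": 5},
    "cara": {"Spaceballs": 1, "Star Wars": 5},
}

def similarity(a, b):
    """1 / (1 + mean absolute rating gap) over co-rated movies; 0.0 if none."""
    shared = ratings[a].keys() & ratings[b].keys()
    if not shared:
        return 0.0
    gap = sum(abs(ratings[a][m] - ratings[b][m]) for m in shared) / len(shared)
    return 1 / (1 + gap)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    pairs = [(similarity(user, other), r[movie])
             for other, r in ratings.items()
             if other != user and movie in r]
    total = sum(weight for weight, _ in pairs)
    if total == 0:
        return None
    return sum(weight * stars for weight, stars in pairs) / total

print(predict("ben", "Star Wars"))
```

Ben's blank for Star Wars is pulled toward Ana's low rating because Ana's tastes track his more closely than Cara's do.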

Early in the competition, contestants’ similarity measures of movies emphasized attributes such as genre (comedy, drama, action), box office receipts, and external rankings. Some models included the presence of specific actors (was Morgan Freeman or Will Smith in the movie?) or types of events, such as gruesome deaths, car chases, or sexual intimacy. Later models added data on the number of days between the movie’s release to video and the person’s day of rental.

One might think that including more features would lead to more accurate predictions. That need not hold. Models with too many variables can overfit the data. To guard against overfitting, computer scientists divide their data into two sets: a training set and a testing set. They fit their model to the first set, then check to see if it also works on the second set.11 In the Netflix Prize competition, the size of the data set and the costs of computation limited the number of variables that could be included in any one model. The winner would therefore not be the person or team that could think up the most features. It would be the team capable of identifying the most informative and tractable set of features.
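The training and testing procedure can be sketched as follows. Everything here is an assumption made for illustration: the ten toy ratings, the 80/20 split, and the simple per-movie-average model stand in for the far larger data and models used in the contest. The point is only the workflow: fit on one set, then measure error on the set held back.

```python
import random
from math import sqrt
from statistics import mean

def split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold back a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def fit_movie_means(train):
    """Fit a per-movie average, falling back to the overall average."""
    overall = mean(stars for _, stars in train)
    by_movie = {}
    for movie, stars in train:
        by_movie.setdefault(movie, []).append(stars)
    means = {movie: mean(values) for movie, values in by_movie.items()}
    return lambda movie: means.get(movie, overall)

def rmse(model, rows):
    """Root mean squared error of the model's predictions on the rows."""
    return sqrt(mean((model(movie) - stars) ** 2 for movie, stars in rows))

# Hypothetical (movie, stars) rows.
rows = [("A", 5), ("A", 4), ("A", 5), ("B", 2), ("B", 3),
        ("B", 1), ("C", 4), ("C", 3), ("C", 5), ("C", 4)]
train, test = split(rows)
model = fit_movie_means(train)
print("train RMSE:", rmse(model, train))
print("test RMSE: ", rmse(model, test))
```

A model that looks much better on the training rows than on the testing rows has likely fit noise rather than pattern.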

Given a feature set, each team also needed an algorithm to make predictions. Dinosaur Planet, a team of three mathematics undergraduates that briefly led the competition in 2007, tried multiple approaches, including clustering (partitioning movies into sets based on similar characteristics), neural networks (algorithms that take features as inputs and learn patterns), and nearest-neighbor methods (algorithms that assign numerical scores to each feature for each movie and compute a distance based on vectors of features).
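The nearest-neighbor idea can be illustrated with the Spaceballs question raised earlier. In this sketch the three features and every score are invented for illustration; a different, equally defensible set of features could reverse the answer, which is exactly why the choice of features mattered.

```python
from math import dist

# Hypothetical feature scores: (comedy, science fiction, parody).
features = {
    "Spaceballs": (0.9, 0.8, 1.0),
    "Airplane!":  (1.0, 0.0, 1.0),
    "Star Wars":  (0.1, 1.0, 0.0),
}

def nearest(movie):
    """Return the other movie whose feature vector is closest (Euclidean)."""
    others = (m for m in features if m != movie)
    return min(others, key=lambda m: dist(features[movie], features[m]))

print(nearest("Spaceballs"))
```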

At the end of the first year, a team from AT&T research labs, known as BellKor, led the competition. Their best single model relied on fifty variables per movie and improved on Cinematch by 6.58 percent. That was just one of their models. By combining their fifty models in an ensemble, they could improve on Cinematch by 8.43 percent.
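Why an ensemble can beat its best member is easiest to see in a toy case. The sketch below is not BellKor's blending method: the four true ratings and the two models' predictions are hypothetical, and the blend is a plain average. The two models are equally accurate but err on different movies, so averaging halves each mistake.

```python
from math import sqrt
from statistics import mean

def rmse(predictions, truth):
    """Root mean squared error between predictions and true ratings."""
    return sqrt(mean((p - t) ** 2 for p, t in zip(predictions, truth)))

truth   = [4, 3, 5, 2]
model_a = [5, 3, 4, 2]   # misses on the first and third movies
model_b = [4, 4, 5, 1]   # misses on the second and fourth movies
blend   = [(a + b) / 2 for a, b in zip(model_a, model_b)]

print(rmse(model_a, truth), rmse(model_b, truth), rmse(blend, truth))
```

Had the two models made the same mistakes, the blend would have gained nothing. The improvement comes entirely from their differences.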

A year and a half into the competition, BellKor knew they could outperform the other teams, but also that they could not reach the 10 percent threshold. Rather than give up, BellKor opted to call in reinforcements. In 2008, they merged with the Austrian computer scientists, Big Chaos, a team that had developed sophisticated algorithms for combining models. BellKor had the best predictive models. Big Chaos knew better ways to combine them. By combining these repertoires, they produced a diversity bonus. However, that bonus was not sufficient to push them above the 10 percent threshold.

In 2009, the team again went looking for a new partner. This time, they added a Canadian team, Pragmatic Theory. Pragmatic Theory lacked BellKor’s ability to identify features or Big Chaos’s skills at aggregating models. Pragmatic Theory’s added value came in the form of new insights into human behavior.

They had developed novel methods for categorizing distinct users on the same account. They could separate one person into two identities: Eric alone and Eric with a date. These two Erics might rank the same movie differently. Pragmatic Theory also identified patterns in rankings based on the day of the week—some people rated movies higher on Sundays. They found that for some movies, rankings depended on whether people rated the movie immediately or after having time for reflection. As the credits roll, the hilarity of Snakes on a Plane or Anchorman results in high rankings. With time for reflection, most people no longer consider a flaming flute or a burrito in the face to be hallmarks of quality films and assign fewer stars.12

The combined team, now called BellKor’s Pragmatic Chaos, had thought up a jaw-dropping eight hundred predictive features.13 More diversity meant more ideas. Recall that the goal was not to come up with the most features. Not all the features would improve accuracy. The team had to select from among them to create powerful combinations. Eventually, the team developed a single model that improved on Cinematch by 8.4 percent. They now had a single model as good as BellKor’s entire ensemble of models. When BellKor’s Pragmatic Chaos combined that and other models, they produced even more accurate predictions.

The combined team’s composite models proved up to the task. On June 26, 2009, nearly three years after the contest began, BellKor’s Pragmatic Chaos surpassed the 10 percent threshold. Game over. BellKor’s Pragmatic Chaos won the $1,000,000 in prize money.

Not yet, though. They had to wait. To safeguard against the possibility that 10 percent would prove too easy, the organizers wrote the rules so that the contest would end thirty days after a team passed the threshold. Had the threshold been 5 percent, a level that was bested a mere six days into the contest, this decision would have been prescient. As events unfolded, this delay seemed unnecessary.

It was not. The fun had only begun. As if drawn from the script of Jurassic Park, the dinosaurs came roaring back. And they brought reinforcements. More than thirty teams, including top performers Grand Prize Team, Opera Solutions, and Vandelay Industries, joined forces with the Dinosaur Planet team to form the Ensemble. Within a few weeks, the Ensemble blended forty-eight models using a sophisticated weighting scheme and took the slightest of leads.

The ultimate winner would be decided by determining which model performed best on the testing data—the data held back by Netflix. The result was a tie. Each had improved on Cinematch by an identical 10.06 percent. The winner was determined by order of submission. By turning in their code twenty-two minutes before the Ensemble, BellKor’s Pragmatic Chaos won.

Winning the contest required knowledge of the features of movies that matter most, awareness of available information on movies, methods for representing properties of movies in languages accessible to computers, good mental models of how people rank movies, the ability to develop algorithms to predict ratings, and expertise at combining diverse models into an ensemble. What had begun as a contest to determine the best data scientist became a demonstration of diversity bonuses.

Some parts of those repertoires—algorithm development and model aggregation procedures—required deep technical knowledge and skills that might be learned in graduate school. Other parts—the ability to identify significant features and to construct models of how people rate movies—leveraged personal experience and social knowledge. The winning formula required deep knowledge of spectral analysis and appreciation of the superficial humor of Will Ferrell.

Passing the 10 percent threshold required a team with both diverse technical training and deep intuitions and knowledge about how a diverse consumer base rated movies. It required cognitive diversity. BellKor’s diversity had its roots in education, experience, and identity. The BellKor team consisted of Robert Bell, an African American statistician who spent two decades at Rand Corporation on the West Coast engaged in public policy research; Chris Volinsky, a European American data-mining expert from upstate New York who specialized in fraud detection and social networks; and Yehuda Koren, an Israeli expert in data mining and data visualization.

Despite their brilliance and diversity, BellKor could not exceed the threshold. They needed help. They turned to people demonstrably less capable than themselves. This approach runs counter to what we do on physical tasks. If incapable of unscrewing the lid from a jar of mayo, you do not seek out someone weaker than you are. You look for someone stronger. Bell and his team had no one smarter to call. They were the smartest. They therefore sought diversity bonuses, and they found them.

Ironically, diversity bonuses also almost cost them the prize. The runners-up who formed the Ensemble generated enough bonuses to forge a tie. None of these teams had models as accurate as the best models of BellKor’s Pragmatic Chaos. The Ensemble had less ability. Their strength lay in their diversity. The forty-eight models they combined relied on diverse features, embedded different knowledge, and applied diverse insights about how people rank movies. The Ensemble also made breakthroughs in combining those models using sophisticated weighting algorithms. Given more time, the Ensemble might well have won.

In the end, being smart was not enough. That was the key lesson. Exceeding the 10 percent threshold required different ways of thinking, seeing, solving, and coding. Eliot Van Buskirk (see box) noted the quantifiable diversity bonuses. The bonuses are not metaphorical musings. They are a measurable fact.

“The secret sauce for both BellKor’s Pragmatic Chaos and The Ensemble was collaboration between diverse ideas, and not in some touchy-feely, unquantifiable, ‘when people work together things are better’ sort of way. The top two teams beat the challenge by combining teams and their algorithms into more complex algorithms incorporating everybody’s work. In combination, the teams could get better and better and better.”

—Van Buskirk, “How the Netflix Prize Was Won”

The Netflix Prize story reveals diversity bonuses threefold. The diverse repertoires within the AT&T team produced bonuses, the diverse representations and tools brought to the team by Big Chaos and Pragmatic Theory produced bonuses, and so did the diverse models of the teams in the Ensemble.

Similar bonuses arise in any crowdsourced prediction contest. Kaggle, a platform launched in 2010, allows anyone to organize similar contests. Companies or nonprofits post data and a prize. Scientists, data miners, and computer scientists then compete to predict everything from brain seizures to home prices in Ames, Iowa. On Kaggle, participants construct code using scripts that anyone can access. Discussion boards reveal interchanges of ideas and diversity bonuses. A new version of the Netflix Prize story could be written each week, revealing more bonuses upon bonuses.

THE CONDITIONS FOR DIVERSITY BONUSES

The Netflix Prize story and Kaggle provide measurable evidence of diversity bonuses. To understand how and why diversity bonuses occur, we need a theory and models. That theory will help us move away from belief-based reasoning toward scientific understanding. Holding the conflicting ideological positions that “diverse teams perform better” and that “we should hire by ability” does not move us forward. We need to understand the conditions for bonuses to occur.

As an analogy, consider the claim that markets work, that is, that they allocate goods efficiently, and the alternative claim that markets fail. Each claim is one part ideological thinking and one part bad economics. The claims lack conditionality. An economics textbook will include neither claim. Instead, it will list the conditions that must hold for each claim to be true.

For a market to work, the environment must not include negative externalities: unpriced consequences that lower the well-being of people not involved in a transaction. A person smoking a cigar in an elevator imposes a negative externality on other passengers. Markets without externalities, such as the markets for luggage, socks, and tablecloths, work rather well. No one cares much about the color of your socks.

In contrast, the real estate market includes any number of potential negative externalities. A person wishing to buy an empty lot in a residential neighborhood with the hope of building a five-story, yellow-stucco home to which he plans to affix enormous exterior audio speakers so that he can enjoy his collection of vintage Whitesnake albums would infuriate his neighbors if his vision became reality. The market, in this case, would not produce an optimal outcome.

To prevent negative externalities of this sort, communities enact zoning laws. They limit building height and preclude the playing of loud music. What constitutes a negative externality varies by location. Santa Fe, New Mexico, requires that a home’s exterior resemble that of a traditional adobe structure. Vinyl-sided McMansions would destroy Santa Fe’s charm, so the law forbids them. Holland, Michigan, caps the number of dogs at two. Ann Arbor, Michigan, allows chickens but no roosters. These regulations lead to quieter neighborhoods than would an unfettered market.

In sum, sometimes markets work, and sometimes they fail. Logic, models, evidence, and experiments help us to learn the conditions in which they work and to design better markets. People who believe markets always work ignore logic and evidence. They predict poorly.14 The same criticism applies to people who believe diversity bonuses always exist. That is also not true. Diversity bonuses exist only for some problems. We must think through the logic and determine when they do.

To organize our thinking, we can place each task in one of two boxes, as shown in figure 1.11: tasks for which no bonuses exist and in which ability dominates, and tasks for which bonuses exist and in which both ability and diversity matter. The Netflix Prize obviously belongs in the box on the right.

To determine how to allocate tasks to these boxes, I rely on a categorization from economics that distinguishes between physical and cognitive work and between routine and nonroutine work.15 We perform physical work with our hands and bodies. Cognitive work is done with our minds. Routine tasks can be handled by a program or machine. Nonroutine tasks differ from day to day or moment to moment.

Routine tasks need not be easy. Before the development of computers, people worked as calculators, summing up columns of numbers to keep track of inventories, revenues, and costs. Those jobs required mathematical expertise.

Those two distinctions create four categories of jobs, shown in figure 1.12: manual routine, manual nonroutine, cognitive routine, and cognitive nonroutine. Manual routine jobs include assembly-line worker, truck driver, and warehouse worker. Manual nonroutine jobs include those of many health care workers, hotel and restaurant employees, and fitness center employees. Routine cognitive jobs include positions in data entry, sales, and insurance.

Cognitive nonroutine jobs include those of medical researchers, doctors, lawyers, scientists, financial analysts, management, and policy makers. This category includes workers who solve technical problems and people who manage organizations. It can be further subdivided into analytic and interpersonal cognitive workers.

Significant diversity bonuses exist primarily for cognitive nonroutine tasks. As shown in figure 1.13, that category contains the largest proportion of workers. Those workers also earn the highest incomes, implying a significant potential impact of diversity bonuses.

The predominance of nonroutine cognitive workers is a recent phenomenon. A century ago, at the peak of Thomas Edison’s career, fewer than 5 percent of people worked in professional and technical jobs. Farmers and farm laborers constituted over a third of the workforce. Those people who did not farm worked mostly in routine manual jobs, often in mines, factories, or forests. Most people did not work in jobs that could produce meaningful diversity bonuses. That remained true through the middle of the twentieth century during the postwar growth period as workers shifted from farm to factory. The passage of the GI Bill, which enabled millions of (mostly white) soldiers to attend college, bootstrapped the trend toward more cognitive nonroutine work.16

Figure 1.11  Classifying Tasks

By the mid-1980s, cognitive nonroutine workers composed the largest share of the workforce. Since that time, most job growth has been in that category as well. In 1983, the US workforce consisted of one hundred million people. The number of workers with nonroutine cognitive jobs, twenty-eight million, equaled the number working in routine manual labor. There were twenty-seven million workers engaged in routine cognitive work, and fifteen million in nonroutine manual labor.

Figure 1.12  Categories of Workers (Autor, Levy, and Murnane, “Skill Content of Recent Technological Change”)

Figure 1.13  Growth of Nonroutine Work (Dvorkin, “Jobs Involving Routine Tasks”)

By 2016, the number of routine workers had barely changed. The workforce included thirty million routine manual laborers and thirty-three million routine cognitive workers. Given the growth in the overall workforce, these represent relative decreases. The number of nonroutine manual workers had increased to twenty-five million, roughly in line with the population growth. In other words, almost all real job growth occurred in cognitive nonroutine work. The number of workers in that category increased to sixty million.
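The shift is easier to see as shares of the workforce. The sketch below uses the rounded counts quoted above, in millions. One figure is an interpretation: the 1983 count for routine manual work is taken to be the twenty-eight million described as working in manual labor. The four categories sum to slightly less than the quoted 1983 total because of rounding.

```python
# Workers in millions, as quoted in the text (rounded).
jobs_1983 = {"routine manual": 28,      # interpretation of "manual labor"
             "routine cognitive": 27,
             "nonroutine manual": 15,
             "nonroutine cognitive": 28}
jobs_2016 = {"routine manual": 30,
             "routine cognitive": 33,
             "nonroutine manual": 25,
             "nonroutine cognitive": 60}

def shares(jobs):
    """Each category's fraction of the workers counted in the four categories."""
    total = sum(jobs.values())
    return {category: count / total for category, count in jobs.items()}

growth = {category: jobs_2016[category] - jobs_1983[category]
          for category in jobs_1983}

for category in jobs_1983:
    print(f"{category:22s} {shares(jobs_1983)[category]:6.1%} "
          f"-> {shares(jobs_2016)[category]:6.1%}  (+{growth[category]}M)")
```

Both routine categories shrink as a share of the total even though their headcounts rise slightly, while nonroutine cognitive work accounts for most of the added jobs.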

As already noted, a vast increase in the corpus of human knowledge has occurred coincident with the growth in cognitive nonroutine work. Whether the topic is renewable energy, diabetes, fuel cells, or nonlinear optimization, no one person can master all that is known and relevant. In building that knowledge base, the tens of millions of cognitive workers also develop new analytic tools, create new perspectives and categories, and make new models of how the world works. No one person can keep up. Hence the rise in teams. By definition, those teams must be diverse.

The multistep logic bears elaboration: Our economy consists of a large number of cognitive workers who perform nonroutine tasks; within any task domain, they produce knowledge, tools, frameworks, and models. Any one person can master only a small slice. Therefore, even the most accomplished person relies on outside help. That outside help need not be smarter, but it must be diverse. It must possess different knowledge or skills to add value.

No Bonus Tasks

Understanding when diversity does not create bonuses can help us to see when it does. Only for nonroutine cognitive work should we expect large diversity bonuses. The other types of tasks will not produce significant bonuses. The lack of bonuses on manual routine tasks is the most straightforward. Recall the example of the pizza box folders. Assume that each employee can fold 800 boxes in an eight-hour shift. Together, the two should be able to fold 1,600 boxes. They produce no diversity bonus.

Organizational scholars refer to such tasks as additive because the contribution of the group equals the sum of the contributions of the group’s members. Selecting runners for a four-by-four-hundred-meter relay represents the canonical additive task. The best relay team consists of the four fastest runners. Measured ability, a runner’s time in the four hundred meters, correlates perfectly with her contribution to a relay team.

The same logic applies to any other additive task, so diversity bonuses cannot be large. For teams who fell trees, shovel driveways, or sort clothing, we should not expect significant diversity bonuses because these tasks do not depend on deep thinking. These tasks offer few openings for diverse ideas. To expect a diversity bonus, to think that two people capable of felling eighty trees each per eight-hour shift could somehow fell two hundred trees together, is to engage in magical thinking (see figure 1.14).

The additive logic also applies to routine cognitive work: filling out claims forms, processing expense reports, working as a bank teller, or processing driver’s licenses at the secretary of state’s office. If one person can process ten forms, then two people can process twenty. We should not expect these tasks to produce significant diversity bonuses.

Nor should we expect bonuses for the manual, nonroutine tasks carried out by sales agents, janitors, security guards, tour guides, waiters, and nursing-home workers. While the situations people confront in these jobs vary from day to day, those tasks do not offer an opportunity for significant bonuses from diversity because of a lack of transferability. A waiter who identifies a better route to and from the kitchen acquires idiosyncratic knowledge, as does a health care worker who learns how to better motivate a patient to exercise. Neither gains knowledge that applies broadly.

Before closing the door on the possibility of diversity bonuses within these three categories of tasks, I need to address a few complications. First, semiroutine tasks that require expertise, such as painting houses, performing electrical work, or cooking, can produce modest diversity bonuses. Consider the process of making a meringue topping for a pie. A meringue consists of egg whites and sugar. The two combine to form protein chains that unfold and then re-form around air bubbles. The sugar also performs a second function by stabilizing the protein-encased bubbles.

Figure 1.14  Magical Thinking for Additive Tasks

A cook learning to make a meringue acquires the following knowledge: The sugar cannot be too coarse or too fine—the finer the sugar, the faster it dissolves. A successful meringue also depends on properly aged eggs. As eggs age, they become more alkaline, loosening their protein bonds. Weaker bonds make for fluffier and less stable peaks. The strength of the bonds also varies with the eggs’ temperature. Colder eggs create stronger bonds. The temperature and the age of the eggs therefore interact. Older eggs should be cooler; fresher eggs should be warmer.

The cook also learns that if the egg whites touch fat, the meringue will not form. Even the slightest amount of fat-laden yolk contaminates the meringue, as will any fat residue lingering on the sides of the mixing bowl. Chefs therefore clean their bowls with lemon juice or vinegar before adding the egg whites. The technique for combining also matters. Beating the egg whites too vigorously breaks the protein chains and loses fluffiness. The proper mixing technique begins slowly and increases in speed.

All these details drive home the point that making a meringue requires mastery of a craft. Success depends on subtle interactions between ingredients, the absence of fat, and proper blending techniques. Similar types of knowledge and skill would be required in learning to paint a ceiling, build picnic tables, or make whiskey.

With experience, a person can improve at each of these tasks. We can capture those improvements on a learning curve showing quality improvements with repetition. Studies of manufacturing find that people improve at predictable rates.17 Broader inquiries show that people improve in their abilities to apply any new technology.18 A portion of that improvement comes from repetition and refinement. Improvements also come from watching others and mimicking their behaviors. Only improvements that result from the latter can be considered diversity bonuses. So while bonuses can exist on these tasks, they are just not very large.
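A learning curve of this kind is often drawn as a power law. The sketch below assumes that form, which the text does not specify: each doubling of repetitions multiplies the time needed per unit by a fixed rate. The 80 percent rate and the sixty-minute first attempt are illustrative values, and time per unit serves here as an inverse stand-in for skill.

```python
from math import log2

def unit_time(n, first_unit_time, learning_rate=0.8):
    """Time for the n-th repetition under an assumed power-law curve.

    Each doubling of cumulative repetitions multiplies unit time by
    learning_rate (0.8 is an illustrative value, not a measured one).
    """
    return first_unit_time * n ** log2(learning_rate)

# Hypothetical cook whose first meringue takes sixty minutes.
for n in (1, 2, 4, 8):
    print(n, round(unit_time(n, 60.0), 1))
```

The curve flattens quickly, which fits the point that gains from repetition on craft tasks are real but bounded.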

Second, many complicated tasks can be decomposed into specialized, additive tasks: an example would be building a table, which consists of sawing, gluing, sanding, and staining. These gains from specialization do not constitute a diversity bonus as defined here. The best table-building team will consist of specialists on each task. We could think of that as a gain from diversity, and it is. However, if we define each task separately, then those gains are not diversity bonuses. On each specific task, we should hire based on ability. Following a similar logic, we should not rush to assign diversity bonuses to basketball teams who rely on centers, guards, and forwards or to the workers at a hair and nail salon.

Third, some routine tasks require sophisticated coordination. On these tasks, homogeneity can be beneficial if it correlates with shared understandings. Diversity, rather than producing bonuses, can impose costs. Doubles tennis provides a near-ideal example of a task that involves elaborate coordination. Players must communicate nonverbally and be able to anticipate the actions of their partners. Sibling doubles teams dominate more than would occur by chance. The Williams sisters have dominated women’s doubles for more than a decade. In men’s tennis, identical twins Bob and Mike Bryan have held the number one ranking for more than eight years and won a record number of Grand Slam tournaments and championships. A similar logic might apply for event planners, real estate agents, or coaches. The benefits of familiarity and trust advantage homogeneous teams.

These complications aside, my previous summary still holds: routine tasks, whether cognitive or manual, tend to be additive. We should not expect diversity bonuses. On manual nonroutine tasks, improvements tend to be idiosyncratic and do not scale or transfer. When hiring people to perform any of these three types of tasks, organizations should focus on ability. The best team likely consists of the best individuals.

Tasks with Diversity Bonuses

Diversity bonuses, when they exist, do so on complex, high-dimensional tasks: solving a problem, predicting an outcome, designing a policy, evaluating a proposed merger, or undertaking research. Bonuses arise when these cognitive nonroutine tasks prove too complex for any one individual. No one person can possess sufficient knowledge, tools, or understandings to handle this type of task alone.

These tasks must also be difficult to separate into simpler components. When a task has these two attributes—high dimensionality and indecomposability—diversity bonuses become likely.

If that logic is correct, then we should see actors in industries that employ cognitive workers on nonroutine tasks enact policies and procedures that attract, maintain, and develop diverse thinkers. And we do. Wall Street firms hire physicists, computer scientists, and psychologists in addition to financial analysts. Leading consulting companies hire PhDs in fields as diverse as chemistry, art history, and philosophy. The academy and think tanks all but trip over themselves in advocating that their researchers leave their academic silos and engage in interdisciplinary projects.

Designing an airplane provides an illustrative example that demonstrates why such organizations seek diverse thinkers. Airplanes are large, complex objects. A Boeing 787 Dreamliner contains 2.3 million distinct parts and costs over $150 million. A 747–8 has 6 million parts and costs over $350 million. Over five thousand suppliers employing nearly a half million people produce parts for a 747–8, and they manufacture each part to exact specifications.

These many parts interact. Navigational and entertainment systems draw from a common electrical power source. Wing, tail, and engine designs affect aerodynamics in combination. Seat, lighting, and overhead storage designs all influence ergonomics. Every part adds to the weight and affects the fuel efficiency of the plane, so the people designing the interior of the plane must coordinate with the engineers designing the propulsion system.

Thus, unlike building a table, making an airplane cannot be decomposed into separate tasks performed in isolation. This indecomposability is desirable. It creates the possibility for diversity bonuses. On indecomposable tasks, people must share ideas and information to achieve a good outcome.19 To reduce the weight of the airplane, Boeing made the daring decision to build the 787 Dreamliner from composite materials, that is, plastic, rather than aluminum. That decision interacted with other decisions. Design teams tasked with creating a better flying experience long knew that people prefer more humidity inside the plane. In aluminum and steel planes, humidity levels were kept low to avoid corrosion. A composite plane cannot corrode, so the interior can be more humid. Of course, the plane must then carry more water to produce that humidity.

Algebra makes clear the potential magnitude of diversity bonuses on large-scale problems like building an airplane. Boeing’s total revenue in 2015 was approximately $100 billion. Revenues from the approximately fifty Dreamliners they sell each year constitute about 8 percent of that total. Boeing’s profit margin averages around 10 percent. Those Dreamliners cost around $7 billion to build. If the team designing the Dreamliner had a diversity bonus that reduced that cost by 1 percent, Boeing’s profits would increase by $70 million, or about $500 per employee.

A similar algebra of scale can be applied to Amazon, which ships upwards of four hundred boxes every second of every day. Amazon tasks teams of engineers, economists, packaging experts, ergonomists, and trend analysts with determining the sizes and shapes of those boxes. Amazon’s shipping costs exceed $10 billion per year. A diversity bonus resulting in a mere 3 percent improvement in shipping costs would correspond to over $300 million in savings.

Boeing’s and Amazon’s potential bonuses pale in comparison to the savings possible through improvements in energy usage and transmission. The United States produces about four trillion kilowatt hours of electricity each year, producing revenue of $400 billion. Between 5 and 6 percent of that energy, or between $20 billion and $25 billion, is lost in transmission and distribution. Reducing the loss to 4 percent would save billions.
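The three calculations of scale can be reproduced directly. This sketch uses only the rounded figures quoted above and integer arithmetic to avoid rounding noise; the helper name is hypothetical. The headcount shown is the one implied by the per-employee figure in the text, derived here rather than taken from any company report.

```python
BILLION = 1_000_000_000

def percent_of(amount, percent):
    """Integer percentage of a dollar amount (avoids float rounding)."""
    return amount * percent // 100

# Boeing: 8 percent of $100 billion in revenue at a 10 percent margin
# implies costs a little above the "around $7 billion" quoted.
dreamliner_revenue = percent_of(100 * BILLION, 8)
cost_from_margin = dreamliner_revenue - percent_of(dreamliner_revenue, 10)
boeing_saving = percent_of(7 * BILLION, 1)
implied_employees = boeing_saving // 500  # implied by "$500 per employee"

# Amazon: 3 percent of $10 billion in shipping costs.
amazon_saving = percent_of(10 * BILLION, 3)

# Electricity: losses of 5 to 6 percent of $400 billion, cut to 4 percent.
grid_revenue = 400 * BILLION
grid_saving_low = percent_of(grid_revenue, 5) - percent_of(grid_revenue, 4)
grid_saving_high = percent_of(grid_revenue, 6) - percent_of(grid_revenue, 4)

print(f"Boeing saving:  ${boeing_saving:,}")
print(f"Amazon saving:  ${amazon_saving:,}")
print(f"Grid saving:    ${grid_saving_low:,} to ${grid_saving_high:,}")
```

In each case a percentage gain small enough to escape notice on any one project becomes a large sum once multiplied by the scale of the operation.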

DAILY BONUSES

Just as marginal improvements in building airplanes, organizing warehouses, or managing the electric power grid can translate into large economic effects, so too can the accumulation of smaller diversity bonuses produced by the tens of millions of cognitive workers engaged in nonroutine tasks.

It would overstate the case to claim that every team of cognitive nonroutine workers produces a diversity bonus at every moment of every day. To quantify the impact of these bonuses, we must think in terms of tasks, not jobs. Any job consists of tasks with bonuses and tasks without them.

An emergency room doctor (a cognitive nonroutine worker) will not seek bonuses on every single task. During a workday, a doctor may gather information and decide on tests. She may make diagnoses and formulate treatment protocols. She may perform surgery.

Some of these tasks involve prediction; others involve problem solving. Some of her work consists of cognitive routine tasks (diagnosing strep throat). Other parts involve routine manual tasks (taping a sprained ankle). The doctor should seek out diversity bonuses only on the most challenging of the tasks. She should not bother seeking out diverse opinions on a routine ankle injury, but she should seek advice when treating an admitted patient presenting nontraditional or contradictory symptoms.

Alternative diagnoses can create diversity bonuses if another doctor possesses a different knowledge base or different insights and can offer a more effective protocol. In medicine, the tasks with diversity bonuses are often the most consequential. A poorly taped ankle is a misfortune; a misdiagnosis of bubonic plague is a tragedy.

Medical practice benefits from diversity bonuses within and across specialties. The Vermont Oxford Network serves as an illustrative example. Founded in 1989 as a consortium of thirty-four neonatal intensive care units (NICUs), the network now includes over one thousand NICUs. The network assists in the formation of interdisciplinary teams to improve outcomes. Those teams share best practices and quality improvements through intensive training and online collaborations. The network also organizes randomized controlled trials across hospitals to compare treatments. A trial might cover a topic such as respiratory distress or temperature regulation.

The members of the Vermont Oxford Network share a common goal: lowering the neonatal mortality rate. As was the case with the Netflix Prize, their success comes not in the form of a single brilliant answer but in the accumulation of many small improvements discovered by diverse people.

The network enables large-scale testing of promising ideas. Not all proposed ideas work. Occlusion wraps (encasing a baby’s body from the neck down in a plastic bubble) were thought to be a potential way to regulate temperature in newborns. Coordinated empirical tests revealed they did not produce significant effects.20 Other ideas have worked. The sustained efforts of the Vermont Oxford Network have saved lives. Over the past twenty-five years, the neonatal mortality rate has fallen from 5 per thousand live births to less than 3 per thousand in England and from over 6 per thousand to around 4 per thousand in the United States.

In medicine, diversity bonuses come in the form of better solutions to problems and more accurate predictions. In other domains, the bonuses may be in the form of improved designs, technologies, or models. The logic will not be one size fits all.

The diversity bonuses described so far apply to an existing challenge or opportunity. A person who notices or emphasizes a novel dimension or attribute might also create something entirely new. In these cases, rather than improving performance on an existing task, a diverse repertoire originates a new product, a new technology, or even an entire market. Later, I will describe cases of entrepreneurs whose experiences and identities let them see opportunities that others missed. I refer to these as originating diversity bonuses. Where people see opportunity will be a function of what they see and where they look.

DIVERSITY IS THE WORD

The Netflix Prize competition showed how people with diverse cognitive tools, understandings, and experiences could better predict movie preferences. The Vermont Oxford Network tapped into diverse knowledge and tested diverse ideas to save the lives of newborns. Within the field of collective intelligence, one can find thousands of other examples of teams who produced bonuses.21 These include the scientists at England’s Bletchley Park who cracked the Germans’ Enigma machine, the contributors in Edison’s laboratory, and the teams of scientists, composed of men and women of all races, who got us to the moon.

We must keep in mind that the evidence we have of diversity bonuses understates the potential contribution of diversity because the evidence comes from the world as it is, not the world as it could be. A more inclusive society would produce larger bonuses.

Some of the most compelling evidence comes from the growth of teams in scientific research, a domain teeming with hard problems. A half century ago, single-author papers outnumbered papers by teams of four or more by a ratio greater than four to one. Today, that ratio has reversed.22 This trend can be explained by the fact that team-written papers land in better journals and have more influence. Deeper statistical studies attribute these papers’ successes to a mixture of depth and diversity.23

Organizations show a similar trend toward team-based work.24 In the 1990s individuals managed over two-thirds of equity funds. By 2016, more than three-fourths were run by teams of two to five managers. As with the scientists, the investment teams performed better. And, as with the Netflix Prize, their advantage could be quantified. In this case, the accounting was done in dollars.25

In the Netflix Prize competition, diverse algorithms, representations, and models created bonuses. The weighting algorithms and models were not chosen randomly from the World Wide Web. The teams’ choices of what to try were based on evidence and guided by theory. Later, I present some of that theory expressed in mathematics. One of those mathematical equations implies that the average of two equally accurate diverse predictive models must be more accurate than either one. The result is not that most of the time the average will be more accurate, but that it will always be more accurate.

Knowing that mathematical fact, the members of the Ensemble averaged their model’s final predictions with those of BellKor’s Pragmatic Chaos. Keep in mind that each model had improved on Cinematch by the same 10.06 percent. In the competition’s final days, either team would have been thrilled to identify a 0.005 percent improvement. When the Ensemble averaged the two models, accuracy increased by thirty times that amount.
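The claim about averaging two equally accurate models can be checked on a toy example. The sketch below assumes accuracy is measured by mean squared error (the Netflix Prize scored entries by root mean squared error) and that the two models do not make identical predictions. It relies on a standard identity: the squared error of the average equals the average of the two squared errors minus the squared half-difference between the predictions. The data and names are made up for illustration, and the notation may differ from the equations presented later in the book.

```python
# Toy check: averaging two equally accurate but different models.
# All numbers are hypothetical and chosen to be exact in floating point.

def mean_squared_error(predictions, truth):
    """Average of squared differences between predictions and true values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)

truth = [3.0, 4.0, 2.0, 5.0]

# Two models with the same error sizes, made on different cases.
model_a = [4.0, 3.0, 2.5, 4.5]
model_b = [2.5, 4.5, 1.0, 6.0]

# The blended model is the simple average of the two predictions.
blend = [(a + b) / 2 for a, b in zip(model_a, model_b)]

mse_a = mean_squared_error(model_a, truth)
mse_b = mean_squared_error(model_b, truth)
mse_avg = mean_squared_error(blend, truth)

# Average individual error, and the diversity term: the mean squared
# half-difference between the two models' predictions.
avg_error = (mse_a + mse_b) / 2
diversity = sum(((a - b) / 2) ** 2 for a, b in zip(model_a, model_b)) / len(truth)

print("model A error:", mse_a)
print("model B error:", mse_b)
print("blend error:  ", mse_avg)
print("average error minus diversity:", avg_error - diversity)
```

Because the diversity term is positive whenever the two models disagree anywhere, the blend's error falls strictly below the shared error of the two models. That is why the result holds always rather than most of the time under this error measure.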

In Invisible Man, Ralph Ellison wrote, “Whence all this passion towards conformity anyway? Diversity is the word.”26 His words ring even truer in today’s complex world, where we apply our brains more than our shoulders. Diversity is the word.
