31 January 2021

Range: Why Generalists Triumph in a Specialized World – David Epstein

INTRODUCTION: Roger vs. Tiger

He found that the sport really didn’t matter much, so long as it included a ball. “I was always very much more interested if a ball was involved.”

The boy “became unbearable,” his mother said, if he had to stay still for too long.

Though his mother taught tennis, she decided against working with him. “He would have just upset me anyway,” she said. “He tried out every strange stroke and certainly never returned a ball normally. That is simply no fun for a mother.”

As a teenager, he was good enough to warrant an interview with the local newspaper. His mother was appalled to read that, when asked what he would buy with a hypothetical first paycheck from playing tennis, her son answered, “a Mercedes.” She was relieved when the reporter let her listen to a recording of the interview and they realized there’d been a mistake: the boy had said “Mehr CDs,” in Swiss German. He simply wanted “more CDs.” The boy was competitive, no doubt. But when his tennis instructors decided to move him up to a group with older players, he asked to move back so he could stay with his friends.

Tiger was not merely playing golf. He was engaging in “deliberate practice,” the only kind that counts in the now-ubiquitous ten-thousand-hours rule to expertise.

Elite athletes at the peak of their abilities do spend more time on focused, deliberate practice than their near-elite peers.

Eventual elites typically devote less time early on to deliberate practice in the activity in which they will eventually become experts. Instead, they undergo what researchers call a “sampling period.” They play a variety of sports, usually in an unstructured or lightly structured environment; they gain a range of physical proficiencies from which they can draw; they learn about their own abilities and proclivities; and only later do they focus in and ramp up technical practice in one area.

One study showed that early career specializers jumped out to an earnings lead after college, but that later specializers made up for the head start by finding work that better fit their skills and personalities. I found a raft of studies that showed how technological inventors increased their creative impact by accumulating experience in different domains, compared to peers who drilled more deeply into one; they actually benefited by proactively sacrificing a modicum of depth for breadth as their careers progressed.

Researchers at Northwestern, MIT, and the U.S. Census Bureau studied new tech companies and showed that among the fastest-growing start-ups, the average age of a founder was forty-five when the company was launched.

The research pertains to every stage of life, from the development of children in math, music, and sports, to students fresh out of college trying to find their way, to midcareer professionals in need of a change and would-be retirees looking for a new vocation after moving on from their previous one. The challenge we all face is how to maintain the benefits of breadth, diverse experience, interdisciplinary thinking, and delayed concentration in a world that increasingly incentivizes, even demands, hyperspecialization.

CHAPTER 1: The Cult of the Head Start

Whether or not experience inevitably led to expertise, they agreed, depended entirely on the domain in question. Narrow experience made for better chess and poker players and firefighters, but not for better predictors of financial or political trends, or of how employees or patients would perform.

Patterns repeat over and over, and feedback is extremely accurate and usually very rapid. In golf or chess, a ball or piece is moved according to rules and within defined boundaries, a consequence is quickly apparent, and similar challenges occur repeatedly.

In “wicked” [or “non-kind”] domains, the rules of the game are often unclear or incomplete, there may or may not be repetitive patterns and they may not be obvious, and feedback is often delayed, inaccurate, or both.

The reason that elite athletes seem to have superhuman reflexes is that they recognize patterns of ball or body movements that tell them what’s coming before it happens. When tested outside of their sport context, their superhuman reactions disappear.

Studying an enormous number of repetitive patterns is so important in chess that early specialization in technical practice is critical.

Our greatest strength is the exact opposite of narrow specialization. It is the ability to integrate broadly.

The progress of AI in the closed and orderly world of chess, with instant feedback and bottomless data, has been exponential. In the rule-bound but messier world of driving, AI has made tremendous progress, but challenges remain. In a truly open-world problem devoid of rigid rules and reams of perfect historical data, AI has been disastrous.

When we know the rules and answers, and they don’t change over time – chess, golf, playing classical music – an argument can be made for savant-like hyperspecialized practice from day one. But those are poor models of most things humans want to learn.

When narrow specialization is combined with an unkind domain, the human tendency to rely on experience of familiar patterns can backfire horribly — like the expert firefighters who suddenly make poor choices when faced with a fire in an unfamiliar structure.

Compared to golf, a sport like tennis is much more dynamic, with players adjusting to opponents every second, to surfaces, and sometimes to their own teammates.

But tennis is still very much on the kind end of the spectrum compared to, say, a hospital emergency room, where doctors and nurses do not automatically find out what happens to a patient after their encounter. They have to find ways to learn beyond practice, and to assimilate lessons that might even contradict their direct experience.

There are domains beyond chess in which massive amounts of narrow practice make for grandmaster-like intuition. Like golfers, surgeons improve with repetition of the same procedure. Accountants and bridge and poker players develop accurate intuition through repetitive experience.

But when the rules are altered just slightly, it makes experts appear to have traded flexibility for narrow skill.

His suggestions for avoiding it are about the polar opposite of the strict version of the ten-thousand-hours school of thought: vary challenges within a domain drastically, and, as a fellow researcher put it, insist on “having one foot outside your world.”

Compared to other scientists, Nobel laureates are at least twenty-two times more likely to partake as an amateur actor, dancer, magician, or other type of performer.

CHAPTER 2: How the Wicked World Was Made

Exposure to the modern world has made us better adapted for complexity, and that has manifested as flexibility, with profound implications for the breadth of our intellectual world. In every cognitive direction, the minds of premodern citizens were severely constrained by the concrete world before them.

A class at the University of Washington titled “Calling Bullshit” (in staid coursebook language: INFO 198/BIOL 106B), focused on broad principles fundamental to understanding the interdisciplinary world and critically evaluating the daily firehose of information.

“How many piano tuners are there in New York City?” Students had to estimate, just by reasoning, and try to get the right order of magnitude. The professor later explained that these were “Fermi problems,” because Enrico Fermi — who created the first nuclear reactor beneath the University of Chicago football field — constantly made back-of-the-envelope estimates to help him approach problems.

Unsurprisingly, Fermi problems were a topic in the “Calling Bullshit” course. It used a deceptive cable news report as a case study to demonstrate “how Fermi estimation can cut through bullshit like a hot knife through butter.” It gives anyone consuming numbers, from news articles to advertisements, the ability quickly to sniff out deceptive stats.
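
A Fermi estimate of the piano-tuner question can be written out as a short chain of rough multiplications. The Python sketch below is only an illustration: every input is a made-up round number (none of them come from the book), and the function name is hypothetical. The goal is the order of magnitude, not the exact count.

```python
import math


def fermi_piano_tuners(
    population=8_000_000,              # assumed: rough population of New York City
    people_per_household=2.5,          # assumed
    share_households_with_piano=0.05,  # assumed: about 1 household in 20
    tunings_per_piano_per_year=1,      # assumed
    tunings_per_tuner_per_day=4,       # assumed
    working_days_per_year=250,         # assumed
):
    """Back-of-the-envelope estimate of the number of piano tuners."""
    households = population / people_per_household
    pianos = households * share_households_with_piano
    tunings_needed = pianos * tunings_per_piano_per_year
    tunings_per_tuner = tunings_per_tuner_per_day * working_days_per_year
    return tunings_needed / tunings_per_tuner


estimate = fermi_piano_tuners()
# Round to the nearest power of ten: only the order of magnitude matters.
order_of_magnitude = 10 ** round(math.log10(estimate))
print(f"about {estimate:.0f} tuners, i.e. on the order of {order_of_magnitude}")
```

With these assumed inputs the answer lands in the hundreds. Changing any single guess by a factor of two moves the result by the same factor, which is why a claim that is off by several orders of magnitude is easy to spot.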

They were perfectly capable of learning from experience, but failed at learning without experience. And that is what a rapidly changing, wicked world demands — conceptual reasoning skills that can connect new ideas and work across contexts.

The ability to apply knowledge broadly comes from broad training.

CHAPTER 3: When Less of the Same Is More

The sampling period is not incidental to the development of great performers — something to be excised in the interest of a head start — it is integral.

The students who would go on to be most successful only started practicing much more once they identified an instrument they wanted to focus on, whether because they were better at it or just liked it more. The instrument, it appeared, was driving the practitioner, rather than the reverse.

The psychologists highlighted the variety of paths to excellence, but the most common was a sampling period, often lightly structured with some lessons and a breadth of instruments and activities, followed only later by a narrowing of focus, increased structure, and an explosion of practice volume.

The more contexts in which something is learned, the more the learner creates abstract models, and the less they rely on any particular example. Learners become better at applying their knowledge to a situation they’ve never seen before, which is the essence of creativity.

CHAPTER 4: Learning, Fast and Slow

The study conclusion was simple: “training with hints did not produce any lasting learning.” Training without hints is slow and error-ridden. It is, essentially, what we normally think of as testing, except for the purpose of learning rather than evaluation – when “test” becomes a dreaded four-letter word.

If that eighth-grade classroom followed a typical academic plan over the course of the year, it is precisely the opposite of what science recommends for durable learning – one topic was probably confined to one week and another to the next. Like a lot of professional development efforts, each particular concept or skill gets a short period of intense focus, and then on to the next thing, never to return. That structure makes intuitive sense, but it forgoes another important desirable difficulty: “spacing,” or distributed practice.

Repetition, it turned out, was less important than struggle. It isn’t bad to get an answer right while studying. Progress just should not happen too quickly.

If you are doing too well when you test yourself, the simple antidote is to wait longer before practicing the same material again, so that the test will be more difficult when you do. Frustration is not a sign you are not learning, but ease is.

In 2007, the U.S. Department of Education published a report by six scientists and an accomplished teacher who were asked to identify learning strategies that truly have scientific backing. Spacing, testing, and using making-connections questions were on the extremely short list. All three impair performance in the short term.

Teachers who guided students to overachievement in their own course were rated highly, and undermined student performance in the long run.

(...) what psychologists call “blocked” practice. That is, practicing the same thing repeatedly, each problem employing the same procedure. It leads to excellent immediate performance, but for knowledge to be flexible, it should be learned under varied conditions, an approach called varied or mixed practice, or, to researchers, “interleaving.”

In a study using college math problems, students who learned in blocks – all examples of a particular type of problem at once – performed a lot worse come test time than students who studied the exact same problems but all mixed up. The blocked-practice students learned procedures for each type of problem through repetition. The mixed-practice students learned how to differentiate types of problems. The same effect has appeared among learners studying everything from butterfly species identification to psychological-disorder diagnosis.
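
The difference between a blocked and a mixed study order can be shown with a toy schedule. This Python sketch is only an illustration: the problem types, problem labels, and function names are hypothetical, and a seeded random shuffle stands in for whatever mixing procedure the study actually used.

```python
import random


def blocked(problems_by_type):
    """All problems of one type, then all of the next type, and so on."""
    return [p for problems in problems_by_type.values() for p in problems]


def interleaved(problems_by_type, seed=0):
    """The exact same problems, but in a shuffled (mixed) order."""
    order = blocked(problems_by_type)
    random.Random(seed).shuffle(order)  # seeded so the schedule is repeatable
    return order


# Hypothetical problem set: three types, three problems each.
problems = {
    "type_A": ["A1", "A2", "A3"],
    "type_B": ["B1", "B2", "B3"],
    "type_C": ["C1", "C2", "C3"],
}

print("blocked:    ", blocked(problems))
print("interleaved:", interleaved(problems))
```

Both schedules contain identical material. In the blocked order the learner always knows which procedure comes next; in the mixed order they first have to work out what type of problem they are looking at.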

Desirable difficulties like testing and spacing make knowledge stick. It becomes durable. Desirable difficulties like making connections and interleaving make knowledge flexible, useful for problems that never appeared in training. All slow down learning and make performance suffer, in the short term.

(...) reviewed sixty-seven early childhood education programs meant to boost academic achievement. Programs like Head Start did give a head start, but academically that was about it. The researchers found a pervasive “fadeout” effect, where a temporary academic advantage quickly diminished and often completely vanished. On a graph, it looks eerily like the kind that show future elite athletes catching up to their peers who got a head start in deliberate practice. A reason for this, the researchers concluded, is that early childhood education programs teach “closed” skills that can be acquired quickly with repetition of procedures, but that everyone will pick up at some point anyway. The fadeout was not a disappearance of skill so much as the rest of the world catching up. The motor-skill equivalent would be teaching a kid to walk a little early. Everyone is going to learn it anyway, and while it might be temporarily impressive, there is no evidence that rushing it matters. The research team recommended that if programs want to impart lasting academic benefits they should focus instead on “open” skills that scaffold later knowledge. Teaching kids to read a little early is not a lasting advantage. Teaching them how to hunt for and connect contextual clues to understand what they read can be. As with all desirable difficulties, the trouble is that a head start comes fast, but deep learning is slow.

CHAPTER 5: Thinking Outside Experience

Deep analogical thinking is the practice of recognizing conceptual similarities in multiple domains or scenarios that may seem to have little in common on the surface. It is a powerful tool for solving wicked problems, and Kepler was an analogy addict.

In the course of studying problem solving in the 1930s, Karl Duncker posed one of the most famous hypothetical problems in all of cognitive psychology. It goes like this:

Suppose you are a doctor faced with a patient who has a malignant stomach tumor. It is impossible to operate on this patient, but unless the tumor is destroyed the patient will die. There is a kind of ray that can be used to destroy the tumor. If the rays reach the tumor all at once at a sufficiently high intensity, the tumor will be destroyed. Unfortunately, at this intensity the healthy tissue that the rays pass through on the way to the tumor will also be destroyed. At lower intensities the rays are harmless to healthy tissue, but they will not affect the tumor either. What type of procedure might be used to destroy the tumor with the rays, and at the same time avoid destroying the healthy tissue?

It’s on you to excise the tumor and save the patient, but the rays are either too powerful or too weak. How can you solve this? While you’re thinking, a little story to pass the time:

There once was a general who needed to capture a fortress in the middle of a country from a brutal dictator. If the general could get all of his troops to the fortress at the same time, they would have no problem taking it. Plenty of roads that the troops could travel radiated out from the fort like wheel spokes, but they were strewn with mines, so only small groups of soldiers could safely traverse any one road. The general came up with a plan. He divided the army into small groups, and each group traveled a different road leading to the fortress. They synchronized their watches, and made sure to converge on the fortress at the same time via their separate roads. The plan worked. The general captured the fortress and overthrew the dictator.

Have you saved the patient yet? Just one last story while you’re still thinking:

Years ago, a small-town fire chief arrived at a woodshed fire, concerned that it would spread to a nearby house if it was not extinguished quickly. There was no hydrant nearby, but the shed was next to a lake, so there was plenty of water. Dozens of neighbors were already taking turns with buckets throwing water on the shed, but they weren’t making any progress. The neighbors were surprised when the fire chief yelled at them to stop, and to all go fill their buckets in the lake. When they returned, the chief arranged them in a circle around the shed, and on the count of three had them all throw their water at once. The fire was immediately dampened, and soon thereafter extinguished. The town gave the fire chief a pay raise as a reward for quick thinking.

Are you done saving your patient? Don’t feel bad, almost no one solves it. At least not at first, and then nearly everyone solves it.

The answer is that you (the doctor) could direct multiple low-intensity rays at the tumor from different directions, leaving healthy tissue intact, but converging at the tumor site with enough collective intensity to destroy it.

The important part is what it shows about problem solving. A gift of a single analogy from a different domain tripled the proportion of solvers who got the radiation problem. Two analogies from disparate domains gave an even bigger boost.

An experiment on Stanford international relations students during the Cold War provided a cautionary tale about relying on kind-world reasoning – that is, drawing only on the first analogy that feels familiar. The students were told that a small, fictional democratic country was under threat from a totalitarian neighbor, and they had to decide how the United States should respond. Some students were given descriptions that likened the situation to World War II (refugees in boxcars; a president “from New York, the same state as FDR”; a meeting in “Winston Churchill Hall”). For others, it was likened to Vietnam (a president “from Texas, the same state as LBJ,” and refugees in boats). The international relations students who were reminded of World War II were far more likely to choose to go to war; the students reminded of Vietnam opted for nonmilitary diplomacy. That phenomenon has been documented all over the place. College football coaches rated the same player’s potential very differently depending on what former player he was likened to in an introductory description, even with all other information kept exactly the same.

In a wicked world, relying upon experience from a single domain is not only limiting, it can be disastrous.

The outside view is deeply counterintuitive because it requires a decision maker to ignore unique surface features of the current project, on which they are the expert, and instead look outside for structurally similar analogies.

Netflix came to a similar conclusion for improving its recommendation algorithm. Decoding movies’ traits to figure out what you like was very complex and less accurate than simply analogizing you to many other customers with similar viewing histories. Instead of predicting what you might like, they examine who you are like, and the complexity is captured therein.
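
The "who are you like" idea can be illustrated with a toy nearest-neighbor recommender. This is a minimal Python sketch and not Netflix's actual algorithm: the viewing histories, viewer names, and function names are all hypothetical, and Jaccard overlap is just one simple way to measure how similar two viewing histories are.

```python
def jaccard(a, b):
    """Overlap between two sets of watched titles (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, histories, k=2):
    """Suggest unseen titles watched by the k viewers most similar to `target`."""
    seen = histories[target]
    neighbors = sorted(
        (user for user in histories if user != target),
        key=lambda user: jaccard(seen, histories[user]),
        reverse=True,
    )[:k]
    scores = {}
    for user in neighbors:
        weight = jaccard(seen, histories[user])
        for title in histories[user] - seen:
            # Titles favored by more similar viewers score higher.
            scores[title] = scores.get(title, 0.0) + weight
    return sorted(scores, key=lambda title: (-scores[title], title))


# Hypothetical viewing histories (titles reduced to single letters).
histories = {
    "you": {"A", "B", "C"},
    "viewer_1": {"A", "B", "C", "D"},
    "viewer_2": {"A", "B", "E"},
    "viewer_3": {"X", "Y", "Z"},
}

print(recommend("you", histories))  # ['D', 'E']
```

Nothing in the sketch looks at what a movie is about. All of the information comes from which other viewers resemble you, which is the sense in which the complexity is captured in the analogy.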

The students prompted with one analogy came up with more strategies than those given no analogies, and students given multiple analogies came up with more strategies than those reminded only of one.

And the more distant the analogy, the better it was for idea generation. Students who were pointed to Nike and McDonald’s generated more strategic options than their peers who were reminded of computer companies Apple and Dell. Just being reminded to analogize widely made the business students more creative.

(...) their intuition was to use too few analogies, and to rely on those that were the most superficially similar.

The labs in which scientists had more diverse professional backgrounds were the ones where more and more varied analogies were offered, and where breakthroughs were more reliably produced when the unexpected arose.

For relatively straightforward challenges, labs started with analogies to other, very similar experiments. The more unusual the challenge, the more distant the analogies, moving away from surface similarities and toward deep structural similarities.

CHAPTER 6: The Trouble with Too Much Grit

Steven Levitt, the economist who coauthored Freakonomics, cleverly leveraged his readership for a test of switching. On the “Freakonomics Experiments” home page, he invited readers who were considering life changes to flip a digital coin. Heads meant they should go ahead and make the change, tails that they should not.
(...) The answer for the potential job changers who flipped heads was: only if they wanted to be happier. Six months later, those who flipped heads and switched jobs were substantially happier than the stayers.

“We fail,” he wrote, when we stick with “tasks we don’t have the guts to quit.” Godin clearly did not advocate quitting simply because a pursuit is difficult. Persevering through difficulty is a competitive advantage for any traveler of a long road, but he suggested that knowing when to quit is such a big strategic advantage that every single person, before undertaking an endeavor, should enumerate conditions under which they should quit. The important trick, he said, is staying attuned to whether switching is simply a failure of perseverance, or astute recognition that better matches are available.

Growing self-knowledge kept changing my goals and interests until I landed in a career the very lifeblood of which is investigating broad interests.

I don’t think I have become less passionate or resilient over time, nor do I think that all those former West Point cadets who left the Army lost the drive that got them there in the first place. It makes sense to me that grit would be powerfully predictive for cadets trying to get through their rigorous orientation, or for a sample of schoolchildren or spelling bee contestants. Very young people often have their goals set for them, or at least have a limited menu to choose from, and pursuing them with passion and resilience is the main challenge.

In the wider world of work, finding a goal with high match quality in the first place is the greater challenge, and persistence for the sake of persistence can get in the way.

CHAPTER 7: Flirting with Your Possible Selves

Even people who look like consummate long-term visionaries from afar usually looked like short-term planners up close.

The crystal ball allure of the marshmallow test is undeniable, and also misconstrued. Mischel’s collaborator Yuichi Shoda has repeatedly made a point of saying that plenty of preschoolers who ate the marshmallow turned out just fine. Shoda maintained that the most exciting aspect of the studies was demonstrating how easily children could learn to change a specific behavior with simple mental strategies, like thinking about the marshmallow as a cloud rather than food.

At a given point in life, an individual’s nature influences how they respond to a particular situation, but their nature can appear surprisingly different in some other situation. With Mischel, he began to study “if-then signatures.” If David is at a giant party, then he seems introverted, but if David is with his team at work, then he seems extroverted. (True.) So is David introverted or extroverted? Well, both, and consistently so. Ogas and Rose call this the “context principle.” In 2007, Mischel wrote, “The gist of such findings is that the child who is aggressive at home may be less aggressive than most when in school; the man exceptionally hostile when rejected in love may be unusually tolerant about criticism of his work; the one who melts with anxiety in the doctor’s office may be a calm mountain climber; the risk-taking entrepreneur may take few social risks.”

Instead of asking whether someone is gritty, we should ask when they are.

We maximize match quality throughout life by sampling activities, social groups, contexts, jobs, careers, and then reflecting and adjusting our personal narratives. And repeat.

“All of the strengths-finder stuff, it gives people license to pigeonhole themselves or others in ways that just don’t take into account how much we grow and evolve and blossom and discover new things,” Ibarra told me. “But people want answers, so these frameworks sell. It’s a lot harder to say, ‘Well, come up with some experiments and see what happens.’”

Rather than a grand plan, find experiments that can be undertaken quickly. “Test-and-learn,” Ibarra told me, “not plan-and-implement.”

CHAPTER 8: The Outsider Advantage

Our intuition might be that only hyperspecialized experts can drive modern innovation, but increasing specialization actually creates new opportunities for outsiders. As Alph Bingham noticed, for difficult challenges organizations tend toward local search. They rely on specialists in a single knowledge domain, and methods that have worked before.

If those fail, they’re stuck. For the most intractable problems, “our research shows that a domain-based solution is often inferior,” according to Lakhani. “Big innovation most often happens when an outsider who may be far away from the surface of the problem reframes the problem in a way that unlocks the solution.”

“The people who win a Kaggle health competition have no medical training, no biology training, and they’re also often not real machine learning experts.”

(...) connecting information from scientific articles in subspecialty domains that never cited one another and that had no scientists who worked together.

CHAPTER 9: Lateral Thinking with Withered Technology

Yokoi embarked on an approach he called “lateral thinking with withered technology.” Lateral thinking is a term coined in the 1960s for the reimagining of information in new contexts, including the drawing together of seemingly disparate concepts or domains that can give old ideas new uses. By “withered technology,” Yokoi meant tech that was old enough to be extremely well understood and easily available, so it didn’t require a specialist’s knowledge. The heart of his philosophy was putting cheap, simple technology to use in ways no one else considered. If he could not think more deeply about new technologies, he decided, he would think more broadly about old ones. He intentionally retreated from the cutting edge, and set to monozukuri.

(...) The Game Boy had to compete with handheld consoles from Sega and Atari that were technologically superior in every way. And it destroyed them. What its withered technology lacked, the Game Boy made up in user experience. It was cheap. It could fit in a large pocket. It was all but indestructible. (...) Unlike its power-guzzling color competitors, it played for days (or weeks) on AA batteries. Old hardware was extremely familiar to developers inside and outside Nintendo, and with their creativity and speed unencumbered by learning new technology, they pumped out games as if they were early ancestors of iPhone app designers.

Yokoi was the first to admit it. “I don’t have any particular specialist skills,” he once said. “I have a sort of vague knowledge of everything.” He advised young employees not just to play with technology for its own sake, but to play with ideas. Do not be an engineer, he said, be a producer. “The producer knows that there’s such a thing as a semiconductor, but doesn’t need to know its inner workings... That can be left to the experts.”

(...) unearthed one more type of inventor. They called them “polymaths,” broad with at least one area of depth.

She is a “T-shaped person,” she said, one who has breadth, compared to an “I-shaped person,” who only goes deep. (...) “T-people like myself can happily go to the I-people with questions to create the trunk for the T,” she told me. “My inclination is to attack a problem by building a narrative. I figure out the fundamental questions to ask, and if you ask those questions of the people who actually do know their stuff, you are still exactly where you would be if you had all this other knowledge inherently. It’s mosaic building. I just keep putting those tiles together.”

If not experience, repetition, or resources, what helped creators make better comics on average and innovate? The answer (in addition to not being overworked) was how many of twenty-two different genres a creator had worked in, from comedy and crime, to fantasy, adult, nonfiction, and sci-fi. Where length of experience did not differentiate creators, breadth of experience did.

They titled their study Superman or the Fantastic Four? “When seeking innovation in knowledge-based industries,” they wrote, “it is best to find one ‘super’ individual. If no individual with the necessary combination of diverse knowledge is available, one should form a ‘fantastic’ team.” Diverse experience was impactful when created by platoon in teams, and even more impactful when contained within an individual.

In kind environments, where the goal is to re-create prior performance with as little deviation as possible, teams of specialists work superbly. Surgical teams work faster and make fewer mistakes as they repeat specific procedures, and specialized surgeons get better outcomes even independent of repetitions.

The same goes for airline crews. Teams that have experience working together become exceedingly efficient at delegating all of the well-understood tasks required to ensure a smooth flight. When the National Transportation Safety Board analyzed its database of major flight accidents, it found that 73 percent occurred on a flight crew’s first day working together.

HR policies at mature companies have such well-defined, specialized slots for employees that potential serial innovators will look like “round pegs to the square holes” and get screened out. Their breadth of interests does not neatly fit a rubric. They are “π-shaped people” who dive in and out of multiple specialties.

CHAPTER 10: Fooled by Expertise

When a new idea entered the conversation, he was quick with “Let’s say for the sake of argument,” which led to him playing out viewpoints from different disciplines or political or emotional perspectives. He tried on ideas like Instagram filters until it was hard to tell which he actually believed.

In contrast to politicians, the most adept predictors flip-flop like crazy.

In separate work, from 2000 to 2010 German psychologist Gerd Gigerenzer compiled annual dollar-euro exchange rate predictions made by twenty-two of the most prestigious international banks – Barclays, Citigroup, JPMorgan Chase, Bank of America Merrill Lynch, and others. Each year, every bank predicted the end-of-year exchange rate. Gigerenzer’s simple conclusion about those projections, from some of the world’s most prominent specialists: “Forecasts of dollar-euro exchange rates are worthless.” In six of the ten years, the true exchange rate fell outside the entire range of all twenty-two bank forecasts.

The best forecasters view their own ideas as hypotheses in need of testing. Their aim is not to convince their teammates of their own expertise, but to encourage their teammates to help them falsify their own notions.

CHAPTER 11: Learning to Drop Your Familiar Tools

(...) one of the most famous business school case studies ever created, known as Carter Racing. The crux is whether the fictional Carter Racing team’s car should compete in the biggest race of the season, which begins in one hour.

The argument in favor of racing: thanks to a custom turbocharger, Carter Racing has placed in the money (top five) in twelve of twenty-four races. That success secured an oil company sponsorship, and a trial sponsorship from prestigious (and also fictional) Goodstone Tire. Carter Racing won the last race, its fourth win of the season. Today’s race will be on national TV, and if Carter Racing finishes in the top five, it will likely draw a $2 million sponsorship from Goodstone. If Carter Racing chooses not to race and withdraws, it would lose part of its entry fee and have to pay back some sponsor money. The team would end a stellar season $80,000 in the hole, and may never get another shot this big. Racing seems like a no-brainer.

The argument against racing: in seven of twenty-four races, the engine failed, each time damaging the car. In the last two races, the mechanics used a new engine-prep procedure and had no trouble, but they aren’t sure what caused the problem before. If the engine fails on national TV, the team will lose the oil sponsorship, kiss Goodstone goodbye, and go back to square one, or perhaps out of business.

So: race, or don’t race?
(...) The case study says that at the last minute, the team owner, BJ Carter, called his mechanics. Pat, the engine mechanic, dropped out of high school and has no sophisticated engineering training, but he has a decade of race experience. Temperature could be the issue, he suggested. When the turbocharger warms up on a cool day, engine components might expand at different rates and set up failure of the head gasket, a metal seal in the engine. Pat admitted that each engine failure looked different, but all seven had breaks in the head gasket. (Two of the engine failures had multiple breaks in the gasket.) He didn’t know what was going on, but couldn’t think of anything else on short notice. He was still hyped to race, and jubilant about the new Goodstone uniforms. At 40 degrees, it is the coldest race day of the season. Robin, the chief mechanic, endorsed Pat’s idea to look at the temperature data. He plotted it on a graph, but saw no correlation:

Chart 1

(...) most student groups around the world who have ever been assigned the Carter Racing case chose to race.
(...) Teams that decided to race discuss their probability estimates and decision trees. Students are split on whether mid-race engine failure will endanger the driver. A majority of students think the temperature data is a red herring. (...) He adds that the data are particularly ambiguous because for some reason the chief mechanic didn’t plot the race temperatures when the engine didn’t fail. “Okay, so, Dmitry, here comes a quantitative question,” the professor says. “How many times did I say yesterday if you want additional information let me know?” Muffled gasps spread across the room. “Four times,” the professor answers himself. (...) Not one student asked for the missing data.

Chart 2

Every single race below 65 degrees had an engine failure. The professor then labels every race either a fail or not fail, and with that binary division runs a simple statistical analysis, familiar to the students, known as a logistic regression. He informs the students that there is a 99.4 percent probability of engine failure at 40 degrees.
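A minimal sketch of the kind of analysis the professor describes: label each race fail or not fail, then fit a logistic regression of failure on temperature. The temperatures below are invented for illustration and are NOT the Carter Racing or Challenger figures; they only share the shape described above (seven failures in twenty-four races, and every race below 65 degrees a failure). The function name `fit_logistic` and the plain gradient-ascent fitting are assumptions of this sketch, not the method used in the case study, so the fitted probability will not reproduce the 99.4 percent quoted above.

```python
import math

# HYPOTHETICAL data, invented for illustration only: (temperature in °F, 1 = engine failed).
RACES = [
    (52, 1), (56, 1), (60, 1), (64, 1), (69, 1), (72, 1), (76, 1),
    (66, 0), (67, 0), (68, 0), (68, 0), (70, 0), (71, 0), (71, 0),
    (73, 0), (74, 0), (75, 0), (77, 0), (78, 0), (78, 0), (79, 0),
    (80, 0), (81, 0), (82, 0),
]


def fit_logistic(temps, failed, steps=5000, lr=0.5):
    """Fit P(fail) = sigmoid(b0 + b1 * z) by gradient ascent on the log-likelihood.

    z is the standardized temperature. Returns the slope b1 and a predict(temp) function.
    """
    n = len(temps)
    mean = sum(temps) / n
    std = math.sqrt(sum((t - mean) ** 2 for t in temps) / n)
    zs = [(t - mean) / std for t in temps]

    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for z, y in zip(zs, failed):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * z    # gradient with respect to the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n

    def predict(temp):
        z = (temp - mean) / std
        return 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))

    return b1, predict


temps = [t for t, _ in RACES]
failed = [f for _, f in RACES]
slope, predict = fit_logistic(temps, failed)

print(f"slope on standardized temperature: {slope:.2f}")
print(f"estimated P(failure) at 40 degrees: {predict(40):.3f}")
print(f"estimated P(failure) at 80 degrees: {predict(80):.3f}")
```

The regression is only possible because the races without failures are included. With the failures alone there is nothing to compare against, which is the gap in the chief mechanic's first chart.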

The temperature and engine failure data are taken exactly from NASA’s tragic decision to launch the space shuttle Challenger, with the details placed in the context of racing rather than space exploration. (...) Challenger had failed O-rings – the rubber strips that sealed joints along the outer wall of the missile-like rocket boosters that propelled the shuttle. Cool temperatures caused O-ring rubber to harden, making them less effective seals.

It’s often the case in group meetings where the person who made the PowerPoint slides puts data in front of you, and we often just use the data people put in front of us.

And now for one last surprise. They all got it wrong. The Challenger decision was not a failure of quantitative analysis. NASA’s real mistake was to rely on quantitative analysis too much. (...) temperature usually did not matter; the O-rings were protected by a special insulating putty meant to block burning gas from reaching them in the first place. On the seventeen flights with no O-ring problems – akin to the seventeen Carter Racing races with no engine problems – the putty worked perfectly. Those flights provided no information whatsoever about how O-rings might fail, no matter the temperature, because the burning gas could not even get to the O-rings to cause a problem. Sometimes, however, small holes formed in the putty when the joints were assembled. (...) And on those seven shuttle flights – unlike gasket breaks in Carter Racing, which was the same problem every time – the O-ring issues came in two different varieties. The first: erosion. On five flights, burning gas that came shooting down the booster at ignition hit the O-rings and eroded the rubber surface. This was not a life-or-death condition. There was more than enough rubber for the O-ring to do its job. And erosion had nothing at all to do with temperature. The second variety: blow-by. If the rubber ring did not expand instantly to fully seal the joint at ignition, burning gas “blew by” and could potentially shoot right through the booster wall. Blow-by was a life-or-death condition.

Thiokol engineers who opposed the launch on the emergency prelaunch conference call did not really have twenty-four relevant data points on O-ring failure to work with, as the Carter Racing study indicates. They did not even have seven, like the Harvard students. They had two. Now what does the chart tell you?

Chart 3

There was other important information the Thiokol engineers presented that could have helped NASA avert disaster. But it was not quantitative, so NASA managers did not accept it. The Carter Racing study teaches that the answer was available, if only engineers looked at the right numbers. In reality, the right numbers did not contain an answer at all. The Challenger decision was truly ambiguous. It was a wicked problem, rife with uncertainty, and outside of previous experience, where demanding more data actually became the problem itself.

The infamous emergency conference call convened thirty-four engineers – every manager was also an engineer – in three locations. (...) In Boisjoly’s opinion, the reason the 53-degree launch looked so much worse was that cool conditions had hardened the O-rings and made them slow to expand and seal at ignition. He was right, but he did not have the data to prove it. (...) A last-minute 53-degree limit was setting an entirely new technical criterion for launches. It had never been discussed, was not backed by quantitative data, and meant that suddenly winter was off-limits for space exploration.

(...) The crew would have made it out intact had they simply dropped their tools and run from the start. In four separate fires in the 1990s, twenty-three elite wildland firefighters refused orders to drop their tools and perished beside them. Even when Rhoades eventually dropped his chainsaw, he felt like he was doing something unnatural. (...) For him, firefighters were an example, and a metaphor for what he learned while studying normally reliable organizations that clung to trusty methods, even when they led to bewildering decisions.

(...) experienced groups became rigid under pressure and “regress to what they know best.” They behaved like a collective hedgehog, bending an unfamiliar situation to a familiar comfort zone, as if trying to will it to become something they actually had experienced before. (...) As Maclean succinctly put it, “When a firefighter is told to drop his firefighting tools, he is told to forget he is a firefighter.” (...) At NASA, accepting a qualitative argument was like being told to forget you are an engineer. (...) Research on aviation accidents, for example, found that “a common pattern was the crew’s decision to continue with their original plan” even when conditions changed dramatically.

A level of ambiguity, it seemed, was not harmful. In decision making, it can broaden an organization’s toolbox in a way that is uniquely valuable.

The experiments showed that an effective problem-solving culture was one that balanced standard practice – whatever it happened to be – with forces that pushed in the opposite direction. If managers were used to process conformity, encouraging individualism helped them to employ “ambidextrous thought,” and learn what worked in each situation. If they were used to improvising, encouraging a sense of loyalty and cohesion did the job. The trick was expanding the organization’s range by identifying the dominant culture and then diversifying it by pushing in the opposite direction.

NASA still had its hallowed process, and Geveden saw everywhere a collective culture that nudged conflict into darkened corners. “You almost couldn’t go into a meeting without someone saying, ‘Let’s take that offline,’” he recalled, just as Morton Thiokol had done for the infamous offline caucus. (...) He emphasized that there is a difference between the chain of command and the chain of communication, and that the difference represents a healthy cross-pressure. “I warned them, I’m going to communicate with all levels of the organization down to the shop floor, and you can’t feel suspicious or paranoid about that,” he said. “I told them I will not intercept your decisions that belong in your chain of command, but I will give and receive information anywhere in the organization, at any time. I just can’t get enough understanding of the organization from listening to the voices at the top.”

Similarly harrowing findings are now appearing all over medicine, wherever specialties have arisen for the use of a particular tool. One of the most common orthopedic surgeries in the world involves shaving a torn meniscus – a piece of cartilage in the knee – back to its original crescent shape. A patient reports knee pain; an MRI shows a torn meniscus; naturally, a surgeon wants to fix it. When five orthopedic clinics in Finland compared the surgery with “sham surgery” – that is, surgeons took patients with knee pain and a torn meniscus to operating rooms, made incisions, faked surgeries, and sewed them back up and sent them to physical therapy – they found that sham surgery worked just as well. Most people with a torn meniscus, it turns out, don’t have any symptoms at all and will never even know. And for those who do have a torn meniscus and knee pain, the tear may have nothing to do with the pain.

CHAPTER 12: Deliberate Amateurs

He was lucky. He arrived in a workspace that treated mental meandering as a competitive advantage, not a pest to be exterminated in the name of efficiency. That kind of protection from the cult of the head start is increasingly rare. At some point or other, we all specialize to one degree or another, so the rush to get there can seem logical. Fortunately, there are pioneers who are working to balance the cult of the head start.

“I would say keep the same time, and deemphasize all the other didactic material,” Casadevall said. “Do we really need to go through courses with very specialized knowledge that often provides a huge amount of stuff that is very detailed, very specialized, very arcane, and will be totally forgotten in a couple of weeks? Especially now, when all the information is on your phone. You have people walking around with all the knowledge of humanity on their phone, but they have no idea how to integrate it. We don’t train people in thinking or reasoning.” Doctors and scientists frequently are not even trained in the basic underlying logic of their own tools. In 2013, a group of doctors and scientists gave physicians and medical students affiliated with Harvard and Boston University a type of problem that appears constantly in medicine: If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person’s symptoms or signs? The correct answer is that there is about a 2 percent chance (1.96 to be exact) that the patient actually has the disease. Only a quarter of the physicians and physicians-in-training got it right. The most common answer was 95 percent. It should be a very simple problem for professionals who rely on diagnostic tests for a living: in a sample of 10,000 people, 10 have the disease and get a true positive result; 5 percent, or 500, will get a false positive; out of 510 people who test positive, only 10, or 1.96 percent, are actually sick. The problem is not intuitive, but nor is it difficult.
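The arithmetic in that problem can be checked with Bayes' rule. This is a minimal sketch; the function name `positive_predictive_value` is made up for illustration, and the default `sensitivity=1.0` is an assumption that mirrors the worked example above, where everyone who has the disease gets a true positive.

```python
def positive_predictive_value(prevalence, false_positive_rate, sensitivity=1.0):
    """Return P(disease | positive test) via Bayes' rule.

    sensitivity=1.0 is an assumption: every sick person tests positive.
    """
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


ppv = positive_predictive_value(prevalence=1 / 1000, false_positive_rate=0.05)
print(f"{ppv:.2%}")  # prints 1.96%
```

The common wrong answer of 95 percent is 1 minus the false positive rate, which ignores how rare the disease is: healthy people outnumber sick people 999 to 1, so even a 5 percent false positive rate produces far more false alarms than true cases.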

New collaborations allow creators “to take ideas that are conventions in one area and bring them into a new area, where they’re suddenly seen as invention,” said sociologist Brian Uzzi, Amaral’s collaborator. Human creativity, he said, is basically an “import/export business of ideas.”

Consistent with the import/export model, scientists who have worked abroad – whether or not they returned – are more likely to make a greater scientific impact than those who have not. The economists who documented that trend suggested one reason could be migrants’ “arbitrage” opportunities, the chance to take an idea from one market and bring it to another where it is more rare and valued.

To recap: work that builds bridges between disparate pieces of knowledge is less likely to be funded, less likely to appear in famous journals, more likely to be ignored upon publication, and then more likely in the long run to be a smash hit in the library of human knowledge.

“I always advise my people to read outside your field, every day something. And most people say, ‘Well, I don’t have time to read outside my field.’ I say, ‘No, you do have time, it’s far more important.’ Your world becomes a bigger world, and maybe there’s a moment in which you make connections.”

“When I went to medical school, I was taught that there were no human diseases caused by retroviruses, that retroviruses were a curiosity that occurred in some animal tumors. In 1981, a new disease emerges that nobody knows anything about. In 1984, it’s found to be a retrovirus, HIV. In 1987, you have the first therapy. In 1996, you have such effective therapy that people don’t have to die of it anymore. How did that happen? Was it because companies all of a sudden rushed to make drugs? No. If you really look back and analyze it, before that time society had spent some of its very hard-earned money to study a curiosity called retroviruses. Just a curiosity in animals. So by the time HIV was found to be a retrovirus, you already knew that if you interfered with the protease [a type of enzyme] that you could deactivate it. So when HIV arrived, society had right off the shelf a huge amount of knowledge from investments made in a curiosity that at the time had no use. It may very well be that if you were to take all the research funding in the country and you put it in Alzheimer’s disease, you would never get to the solution. But the answer to Alzheimer’s disease may come from a misfolding protein in a cucumber. But how are you going to write a grant on a cucumber? And who are you going to send it to? If somebody gets interested in a folding protein in a cucumber and it’s a good scientific question, leave them alone. Let them torture the cucumber.”

“What’s gone totally is that time to talk and synthesize,” Casadevall told me. “People grab lunch and bring it into their offices. They feel lunch is inefficient, but often that’s the best time to bounce ideas and make connections.”

CONCLUSION: Expanding Your Range

Experimentation is not a tidy prescription, but it is common, and it has advantages, and it requires more than the typical motivational-poster lip service to a tolerance for failure. Breakthroughs are high variance. Creativity researcher Dean Keith Simonton has shown that the more work eminent creators produced, the more duds they churned out, and the higher their chances of a supernova success. Thomas Edison held more than a thousand patents, most completely unimportant, and was rejected for many more. His failures were legion, but his successes – the mass-market light bulb, the phonograph, a precursor to the film projector – were earthshaking.

It doesn’t mean breakthrough creation is luck, although that helps, but rather that it is hard and inconsistent. Going where no one has is a wicked problem. There is no well-defined formula or perfect system of feedback to follow. It’s like the stock market that way; if you want the sky highs, you have to tolerate a lot of lows.

Compare yourself to yourself yesterday, not to younger people who aren’t you. Everyone progresses at a different rate, so don’t let anyone else make you feel behind. You probably don’t even know where exactly you’re going, so feeling behind doesn’t help. Instead, as Herminia Ibarra suggested for the proactive pursuit of match quality, start planning experiments.