The Sin of Representativeness

In Thinking, Fast and Slow, Kahneman explains the representativeness heuristic with the example of Tom W, a fictional graduate student. Assume that you know nothing about Tom W and are asked to guess which major he is most likely in. I will simplify the example and include only three of the majors from the original.

Your choices are:

  • Computer Science 
  • Library Science 
  • Humanities and Education 

Knowing nothing else, we would answer this question by thinking of the base rate. We would ask, which major has the highest number of students enrolled in it?

We know that there are more students in Humanities and Education than in Computer Science or Library Science. So we would assume that Tom W is a Humanities and Education major. But what if we changed the question a little? 

Now, we are given a personality sketch written by a psychologist who conducted a series of psychological tests on Tom W. 

Tom W is of high intelligence, although lacking in true creativity. He has a need for order and clarity, and for neat and tidy systems in which every detail finds its appropriate place. His writing is rather dull and mechanical, occasionally enlivened by somewhat corny puns and flashes of imagination of the sci-fi type. He has a strong drive for competence. He seems to have little feel and little sympathy for other people and does not enjoy interacting with others. Self-centered, he nonetheless has a deep moral sense. 

Daniel Kahneman, Thinking, Fast and Slow

When graduate students in psychology were given this question, they re-evaluated their response. Now they ranked computer science as the best-fitting major, because the description of Tom W fits the stereotype well. The description fits not just computer science but other narrow fields such as library science or engineering. It is a poorer fit for larger fields like the humanities and education.

But that is the error of representativeness. Even graduate students in psychology, who were trained not to ignore base rates, fell for the sin of representativeness. They placed full emphasis on the stereotypical description and neglected the probability that a randomly selected student is in any of these majors.

Michael Lewis's bestselling book, Moneyball, tells the true story of Billy Beane, the general manager of the Oakland A’s, and demonstrates the problem of relying on representativeness when making predictions.

Traditionally, professional baseball scouts would forecast the success of prospective players partly by their build and look, in addition to their age. But Billy Beane made the unpopular decision to overrule his scouts and recruit players based on statistics of past performance. The players the A’s picked were inexpensive because other teams had rejected them for not looking the part.

Contrary to most people’s expectations, the team that looked like the underdog (but wasn’t) outperformed the forecasts made for them. They achieved great results at a low cost. 

That is not to say that representativeness or stereotyping never works.

For example, people who act friendly usually are friendly. A professional athlete who is very tall and thin is more likely to play basketball than football. A PhD is more likely to read The New York Times than someone who only has a high school education. Young men are more likely than old women to drive aggressively.

But in other cases, the representativeness heuristic can be misleading. The more disciplined way to think about these problems is to apply Bayesian reasoning. There are two things you should do. 

  1. Anchor your judgement of the probability of an outcome on a plausible base rate
  2. Question the completeness of your evidence 


If you estimate that Tom W’s psychological description is 4 times more likely for a computer science student than for a student in other fields, and 3 percent of graduate students are in computer science, then you should estimate that there is an 11 percent chance that he is a computer science student. If the base rate were 80 percent, the same evidence would give a 94.1 percent chance that he is a computer science student.
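The two estimates above can be reproduced with the odds form of Bayes' rule: convert the base rate to odds, multiply by how much more likely the description is for a computer science student than for anyone else, and convert back to a probability. The following is only a minimal sketch. The function name posterior_probability is made up for illustration, and it assumes that "4 times more likely" is meant as a likelihood ratio, which is the reading that yields the 11 percent and 94.1 percent figures.

```python
def posterior_probability(base_rate: float, likelihood_ratio: float) -> float:
    """Update a base rate with evidence using the odds form of Bayes' rule.

    base_rate: prior probability of the outcome (strictly between 0 and 1).
    likelihood_ratio: how many times more likely the evidence is when the
        outcome is true than when it is false (assumed reading of
        "4 times more likely" in the text).
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")

    prior_odds = base_rate / (1.0 - base_rate)        # e.g. 3% -> 3 to 97
    posterior_odds = prior_odds * likelihood_ratio    # scale odds by the evidence
    return posterior_odds / (1.0 + posterior_odds)    # convert odds back to probability


# The two base rates discussed in the text, with the same evidence strength.
for base_rate in (0.03, 0.80):
    p = posterior_probability(base_rate, 4.0)
    print(f"base rate {base_rate:.0%} -> posterior {p:.1%}")
```

With a 3 percent base rate the prior odds are 3 to 97, so the evidence moves them to 12 to 97, which is 12 out of 109, or about 11 percent. The same evidence applied to an 80 percent base rate moves odds of 4 to 1 up to 16 to 1, or about 94.1 percent. The evidence is identical in both cases; only the anchor changes.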

If you want a detailed mathematical explanation of how Bayes' theorem works, Khan Academy is a great resource. The more important message is: don’t ignore base rates.
 
A couple of examples can illustrate how this works in real life. 

“The start-up looks as if it could not fail, but the base rate of success in the industry is extremely low. How do we know this case is different?” 

“The lawn is well-trimmed, the receptionist looks competent, and the furniture is attractive, but this doesn’t mean it’s a well-managed company. I hope the board doesn’t go by representativeness.” 

If you were on a subway, and you saw a woman reading The New York Times, what is more likely? That she has a PhD, or that she does not have a college degree?

"A gilded No is more satisfactory than a dry yes" - Gracian